FieldValueFilter and non-DocValues fields
Hello.

After upgrade to 5.0.0, FieldValueFilter no longer works for fields that are not in DocValues. I have large indexes (around half a billion documents each) and I do not want to duplicate data too much. If I add some fields to DocValues, each index will grow from 400GB to 1.3TB with no apparent benefit: those fields are not used for faceting or sorting, only as “flags” in search (though I have to return them to the user as they are, integers).

Can you please help me with two questions:
1. Is there any alternative to FieldValueFilter (I use NumericRangeFilter.newIntRange(fieldName, Integer.MIN_VALUE, Integer.MAX_VALUE, true, true) for now) to find documents with the field present?
2. Can one use DocValues effectively instead of Stored Fields to show found documents? Or should I use UninvertingReader for fields that are not in DocValues?

Thanks!

--
Artem Redkin
artemred...@yandex-team.ru

-
To unsubscribe, e-mail: java-user-unsubscr...@lucene.apache.org
For additional commands, e-mail: java-user-h...@lucene.apache.org
Lucene 5.0.0 - StringField and Sorting
Hi,

looking at the JavaDoc of StringField, it says:

    /** A field that is indexed but not tokenized: the entire
     *  String value is indexed as a single token. For example
     *  this might be used for a 'country' field or an 'id'
     *  field, or any field that you intend to use for sorting
     *  or access through the field cache. */

So I intend to use some StringFields for sorting. However, trying to sort on them fails with:

    java.lang.IllegalStateException: unexpected docvalues type NONE for field 'NAME_KEYWORD' (expected=SORTED).

The field was indexed as a StringField with Store.YES.

So is the JavaDoc wrong here, or is it correct and StringField should set

    TYPE.setDocValuesType(DocValuesType.SORTED);

so that it would work?

kind regards

Torsten
Re: FieldValueFilter and non-DocValues fields
If you do not need sorting or faceting, doc values are indeed not needed. You can get back the old behaviour by using UninvertingReader (see LUCENE-5666 for more background). But, like before, this will load a lot of stuff into memory...

Note that FieldValueFilter is very slow (with or without doc values). An alternative would be to index the field names of your documents and then use simple term queries against this field to filter those that have a value.

On Fri, Feb 27, 2015 at 1:23 PM, Artem Redkin wrote:
> After upgrade to 5.0.0 FieldValueFilter no longer works for fields that are
> not in DocValues.
> [...]

--
Adrien
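
The suggestion above (index the names of the fields each document has, then filter with a term lookup) can be illustrated with a small library-free model. This is only a sketch of the idea, not Lucene code: the class FieldPresenceIndex and its methods are hypothetical names made up for illustration. In Lucene terms, the assumption is that each document would get one extra non-tokenized, unstored field (for example a StringField) per populated field name, and that the filter would be a TermQuery on that extra field.

```java
import java.util.BitSet;
import java.util.HashMap;
import java.util.Map;

/**
 * Toy model of a "field names" index: for every field name we keep the set
 * of document ids that carry a value for that field.
 * NOT Lucene code; all names here are hypothetical.
 */
class FieldPresenceIndex {
    // Postings list: field name -> ids of the documents that have this field.
    private final Map<String, BitSet> postings = new HashMap<>();
    private int nextDocId = 0;

    /** Adds a document, records the name of each field it has, returns its id. */
    int addDocument(Map<String, Integer> fields) {
        int docId = nextDocId++;
        for (String name : fields.keySet()) {
            postings.computeIfAbsent(name, k -> new BitSet()).set(docId);
        }
        return docId;
    }

    /** Plays the role of a term query on the field-names field. */
    BitSet docsWithField(String name) {
        BitSet hits = postings.get(name);
        // Return a copy so callers cannot modify the postings.
        return hits == null ? new BitSet() : (BitSet) hits.clone();
    }
}
```

The point of the design is that "which documents have a value for field X" becomes a single postings lookup instead of a scan over per-document values, at the cost of one small extra term per populated field rather than a full copy of the values.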