FieldValueFilter and non-DocValues fields

2015-02-27 Thread Artem Redkin
Hello.

After upgrading to 5.0.0, FieldValueFilter no longer works for fields that do 
not have DocValues. I have large indexes (around half a billion documents each) 
and I do not want to duplicate data too much. If I add some fields to DocValues, 
each index will grow from 400GB to 1.3TB with no apparent benefit: those fields 
are not used for faceting or sorting, only as “flags” in search (though I have 
to return them to the user as they are, i.e. as integers).

Can you please help me with two questions:
1. Is there any alternative to FieldValueFilter for finding documents that 
have a given field? For now I use NumericRangeFilter.newIntRange(fieldName, 
Integer.MIN_VALUE, Integer.MAX_VALUE, true, true).
2. Can DocValues be used effectively instead of stored fields to display the 
documents found? Or should I use UninvertingReader for fields that are not in 
DocValues?

Thanks!

-- 
Artem Redkin
artemred...@yandex-team.ru


-
To unsubscribe, e-mail: java-user-unsubscr...@lucene.apache.org
For additional commands, e-mail: java-user-h...@lucene.apache.org



Lucene 5.0.0 - StringField and Sorting

2015-02-27 Thread Torsten Krah
Hi,

looking at the JavaDoc of StringField it says:

/** A field that is indexed but not tokenized: the entire
 *  String value is indexed as a single token.  For example
 *  this might be used for a 'country' field or an 'id'
 *  field, or any field that you intend to use for sorting
 *  or access through the field cache. */

So I intend to use some StringFields for sorting.
However, trying to sort on them fails with:

java.lang.IllegalStateException: unexpected docvalues type NONE for
field 'NAME_KEYWORD' (expected=SORTED).

The field was indexed as a StringField with Store.YES.

So is the JavaDoc wrong here or is it correct and StringField should
set:

TYPE.setDocValuesType(DocValuesType.SORTED);

so that it would work?

kind regards

Torsten







Re: FieldValueFilter and non-DocValues fields

2015-02-27 Thread Adrien Grand
If you do not need sorting or faceting, doc values are indeed not needed.

You can get back the old behaviour by using UninvertingReader (see
LUCENE-5666 for more background). But, as before, this will load a lot
of data into memory.

Note that FieldValueFilter is very slow (with or without doc values).
An alternative would be to index the field names of your documents and
then use simple term queries against this field to filter those that
have a value.




-- 
Adrien
