Thanks, Adrien. That brings me closer.

So when the documentation says doc values do not support filtering, it's
talking about fielddata filtering of what's loaded into memory (and not
filtering as part of a query, say a term filter). For further clarification
- can a field that is not analyzed and only kept as doc values be used for
querying/filtering (say a term filter on a numeric field or match query on
a string field)? Or does all querying/filtering require the field to be in
the uninverted index?
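
To make the question concrete, the setup being asked about might look like
the sketch below. The field names (`user_id`, `status`) and the value 42 are
hypothetical, and the mapping uses the 1.x-style `fielddata` format setting
described in the doc values blog post. It only builds the request bodies and
says nothing about whether the filter is supported.

```python
import json

# Hypothetical field names, used only for illustration.
# ES 1.x-style mapping: a numeric field and a not_analyzed string field,
# both with field data stored on disk as doc values.
mapping = {
    "properties": {
        "user_id": {
            "type": "long",
            "fielddata": {"format": "doc_values"},
        },
        "status": {
            "type": "string",
            "index": "not_analyzed",
            "fielddata": {"format": "doc_values"},
        },
    }
}

# A term filter on the numeric field, wrapped in a filtered query.
term_filter_query = {
    "query": {
        "filtered": {
            "filter": {"term": {"user_id": 42}}
        }
    }
}

# Print the JSON bodies that would be sent to the cluster.
print(json.dumps(mapping, indent=2))
print(json.dumps(term_filter_query, indent=2))
```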

What I'm trying to understand is how we can optimize querying/filtering in a
large index (5 billion documents / 1 TB). It's very hard to run a simple
term filter because a bitset filter will need to be calculated that
includes every single document. Wouldn't that utilize a lot of memory? Is
there a way to speed that up?
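
As a rough back-of-the-envelope check on the memory concern, the sketch
below estimates the size of one such bitset. It assumes an uncompressed
bitset with exactly one bit per document and ignores per-segment and
per-shard overhead; `bitset_bytes` is a hypothetical helper, not an
Elasticsearch API.

```python
def bitset_bytes(num_docs: int) -> int:
    """Bytes needed for one bit per document, rounded up to whole bytes."""
    return (num_docs + 7) // 8


total_docs = 5_000_000_000  # document count quoted in the message
size = bitset_bytes(total_docs)

# Report the total across the whole index for a single cached filter.
print(f"{size} bytes = {size / 2**20:.0f} MiB per cached filter")
```

Under those assumptions one cached filter costs about 625 MB (roughly
596 MiB) in total, and that total would be spread across however many
shards hold the index.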

On Tue, Jul 15, 2014 at 6:30 AM, Adrien Grand <
[email protected]> wrote:

> Hi David,
>
> Doc values are a way to compute field data at indexing time, and to store
> it on disk. It can do everything that "uninverted" field data can do:
> aggregations, sorting, etc. However, it never kicks in automatically: it
> needs to be configured explicitly, and can only be set at index creation
> time; you cannot enable it afterwards.
>
> Regarding fielddata filtering, it is a way to trade accuracy for memory by
> only loading "important" terms into memory and doesn't work with doc values
> since it's not useful given that they are stored on disk anyway (and thus
> don't require much memory).
>
> Does it clarify?
>
> On Mon, Jul 14, 2014 at 7:26 PM, David K Smith <[email protected]>
> wrote:
>
>> When you map fields to use doc values for field data, does that limit the
>> functionality afforded to those fields to merely sorting and
>> aggregations/faceting?
>>
>> The documentation mentions that filtering is not supported by numeric or
>> string types when stored as doc values. Yikes, I thought that doc values are
>> intended for working with field data when it's too large to load into
>> memory. Is that not the case?
>>
>> I read both of the following pages but I'm not sure I quite understand
>> where the usefulness of field data fields kicks in.
>>
>> http://www.elasticsearch.org/blog/disk-based-field-data-a-k-a-doc-values/
>>
>> http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/mapping-core-types.html
>>
>> Can someone please clarify?
>>
>> --
>> You received this message because you are subscribed to the Google Groups
>> "elasticsearch" group.
>> To unsubscribe from this group and stop receiving emails from it, send an
>> email to [email protected].
>> To view this discussion on the web visit
>> https://groups.google.com/d/msgid/elasticsearch/FC9E6ECA-B869-4B40-B2C8-F55CE6AB6790%40gmail.com
>> <https://groups.google.com/d/msgid/elasticsearch/FC9E6ECA-B869-4B40-B2C8-F55CE6AB6790%40gmail.com?utm_medium=email&utm_source=footer>
>> .
>> For more options, visit https://groups.google.com/d/optout.
>>
>
>
>
> --
> Adrien Grand
