Hi Nik,

The trade-off is indeed not an easy one. First, the default terms
dictionary already saves some disk seeks: it keeps the prefixes of the
terms in the terms dictionary in an in-memory FST, so it can avoid going
to disk when the term you are looking up cannot match this FST. A bloom
filter might save a few additional disk seeks but, as you said, it is
pretty memory-intensive, and sometimes that memory would be better spent
on the filesystem cache.
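
To put a rough number on the memory side, here is a minimal sketch that
estimates the heap a bloom filter would need. It assumes the textbook
sizing formula for an optimally tuned filter, bits per term =
-ln(p) / (ln 2)^2, which may not match exactly how the bloom postings
format sizes its filters. The function names and the segment sizes in
the example are hypothetical.

```python
import math


def bloom_bits_per_term(false_positive_rate: float) -> float:
    """Bits per term for an optimally sized Bloom filter (textbook formula)."""
    return -math.log(false_positive_rate) / (math.log(2) ** 2)


def bloom_heap_bytes(terms_per_segment, false_positive_rate=0.01):
    """Estimated total heap in bytes, with one filter per segment.

    terms_per_segment: iterable with the number of unique terms in each segment.
    """
    per_term = bloom_bits_per_term(false_positive_rate)
    total_bits = sum(n * per_term for n in terms_per_segment)
    return math.ceil(total_bits / 8)


if __name__ == "__main__":
    # Hypothetical index: 20 segments with 5 million unique terms each.
    segments = [5_000_000] * 20
    print(f"bits per term at 1% FPP: {bloom_bits_per_term(0.01):.2f}")
    print(f"estimated heap: {bloom_heap_bytes(segments) / 1e6:.0f} MB")
```

At a 1% false positive probability this gives about 9.6 bits per term,
in line with the "10 bits" figure quoted below, and about 120 MB of
heap for the hypothetical 100 million terms. That is the amount of
memory to weigh against simply leaving it to the filesystem cache.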



On Thu, Jul 17, 2014 at 4:25 PM, Nikolas Everett <[email protected]> wrote:

> Has anyone had success adding a bloom filter to the codec for any of their
> fields?
>
>
> http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/index-modules-codec.html#bloom-postings
>
> I imagine it'd help reduce IO from (non multi-term) queries that
> frequently don't match.  Like if you have a field that is very specific and
> useful for searching but very rarely matches anything.
>
> It looks like the cost is in the range of 10 bits of heap per term per
> segment for a false positive probability around 1%.  Meaning it'd be pretty
> high if the index had lots of terms - especially if they were in many
> segments.  But it'd be about 10 bits per value if the values were mostly
> unique.
>
> Nik
>
> --
> You received this message because you are subscribed to the Google Groups
> "elasticsearch" group.
> To unsubscribe from this group and stop receiving emails from it, send an
> email to [email protected].
> To view this discussion on the web visit
> https://groups.google.com/d/msgid/elasticsearch/CAPmjWd3X11bwogWi9oFTYFzzO6%2BdnvsOqcEFWG_dB5c%2Boy%3D4Fw%40mail.gmail.com
> <https://groups.google.com/d/msgid/elasticsearch/CAPmjWd3X11bwogWi9oFTYFzzO6%2BdnvsOqcEFWG_dB5c%2Boy%3D4Fw%40mail.gmail.com?utm_medium=email&utm_source=footer>
> .
> For more options, visit https://groups.google.com/d/optout.
>



-- 
Adrien Grand

