Thanks for replying. I've been looking for ways to reduce my IO. Pushing everything into an all field is probably going to be the biggest win, but I was also wondering about bloom filters. From what you describe, they don't sound worth it. Is it fair to say that anything other than the default codec is pretty unlikely to be useful?
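As a rough sanity check on the "10 bits of heap per term per segment" figure quoted below, here is a minimal sketch that uses the textbook sizing formula for a standard Bloom filter, bits per term = -ln(p) / (ln 2)^2. This is an assumption for illustration: it is not necessarily how the bloom postings format actually sizes its filters, and the function names and the term and segment counts are hypothetical.

```python
import math


def bloom_bits_per_term(false_positive_rate: float) -> float:
    """Bits per term for an optimally sized standard Bloom filter.

    Textbook formula: -ln(p) / (ln 2)^2. Assumption: the real codec
    may size its filters differently.
    """
    return -math.log(false_positive_rate) / (math.log(2) ** 2)


def bloom_heap_bytes(terms_per_segment: int, segments: int,
                     false_positive_rate: float = 0.01) -> float:
    """Estimated heap in bytes if every segment carries its own filter."""
    total_bits = (bloom_bits_per_term(false_positive_rate)
                  * terms_per_segment * segments)
    return total_bits / 8


if __name__ == "__main__":
    # Hypothetical index: 1 million terms per segment, 10 segments, 1% false positives.
    print(f"bits per term: {bloom_bits_per_term(0.01):.2f}")
    print(f"heap estimate: {bloom_heap_bytes(1_000_000, 10) / 1e6:.1f} MB")
```

The estimate scales linearly with both the number of terms and the number of segments, which is why the cost grows quickly on an index with many terms spread over many segments.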
On Thu, Jul 17, 2014 at 4:31 PM, Adrien Grand <[email protected]> wrote:

> Hi Nik,
>
> The trade-off is not easy indeed. First, the default terms dictionary can
> already save some disk seeks. By storing the prefixes of the terms that are
> in the terms dictionary in a FST in memory, it can avoid going to disk when
> the term that you are looking up cannot match this FST. A bloom filter
> might save a few additional disk seeks but as you said, it's pretty
> intensive memory-wise and sometimes that is memory that would just be
> better spent on the filesystem cache.
>
> On Thu, Jul 17, 2014 at 4:25 PM, Nikolas Everett <[email protected]>
> wrote:
>
>> Has anyone had success adding a bloom filter to the codec for any of
>> their fields?
>>
>> http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/index-modules-codec.html#bloom-postings
>>
>> I imagine it'd help reduce IO from (non multi-term) queries that
>> frequently don't match. Like if you have a field that is very specific and
>> useful for searching but very rarely matches anything.
>>
>> It looks like the cost is in the range of 10 bits of heap per term per
>> segment for a false positive probability around 1%. Meaning it'd be pretty
>> high if the index had lots of terms - especially if they were in many
>> segments. But it'd be about 10 bits per value if the values were mostly
>> unique.
>>
>> Nik
>>
>> --
>> You received this message because you are subscribed to the Google Groups
>> "elasticsearch" group.
>> To unsubscribe from this group and stop receiving emails from it, send an
>> email to [email protected].
>> To view this discussion on the web visit
>> https://groups.google.com/d/msgid/elasticsearch/CAPmjWd3X11bwogWi9oFTYFzzO6%2BdnvsOqcEFWG_dB5c%2Boy%3D4Fw%40mail.gmail.com
>> <https://groups.google.com/d/msgid/elasticsearch/CAPmjWd3X11bwogWi9oFTYFzzO6%2BdnvsOqcEFWG_dB5c%2Boy%3D4Fw%40mail.gmail.com?utm_medium=email&utm_source=footer>
>> .
>> For more options, visit https://groups.google.com/d/optout.
>>
>
> --
> Adrien Grand
