Yes, index sizes vary between 10 and 50 GB.

segments_per_tier is high because I allow many segments. I don't care about
optimal search performance (unless I run an optimize down to a single segment
after the bulk indexing).

max_merge_at_once is low because I do not want much merge thread activity
during bulk indexing.
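
As a rough illustration of the two merge settings described above, here is a
minimal Python sketch that builds an index settings body. The setting names
assume the tiered merge policy of Elasticsearch 1.x
(index.merge.policy.segments_per_tier and index.merge.policy.max_merge_at_once).
The helper name and the values 30 and 4 are hypothetical and only stand in for
"high" and "low"; they are not the values used in this thread.

```python
import json


def bulk_index_settings(segments_per_tier=30, max_merge_at_once=4):
    """Build a settings body that favours bulk indexing over search speed.

    The default values are placeholders (assumptions), chosen only so that
    segments_per_tier is above and max_merge_at_once is below the default
    of 10 mentioned in the thread.
    """
    return {
        "settings": {
            # Many segments allowed per tier: fewer merges get triggered.
            "index.merge.policy.segments_per_tier": segments_per_tier,
            # Few segments merged in one go: less merge thread activity.
            "index.merge.policy.max_merge_at_once": max_merge_at_once,
        }
    }


if __name__ == "__main__":
    # The printed JSON could be sent as the body of an index creation request.
    print(json.dumps(bulk_index_settings(), indent=2))
```

With settings like these, search over the many resulting segments is slower
until the index is optimized after the bulk run.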

I'm experimenting with Okapi BM25. It is better suited to documents with many
short fields and to short query terms. From what I can see, it works better
with library catalog metadata than the default Lucene ranking formula.
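
To show why field length matters for BM25, here is a small Python sketch of
the textbook per-term BM25 score. It is not Lucene's actual code, which stores
field lengths in a lossy encoded form. The function name and the sample
numbers are hypothetical; k1 = 1.2 and b = 0.75 are the commonly used default
values for the two tuning parameters.

```python
import math


def bm25_term_score(tf, doc_len, avg_doc_len, n_docs, doc_freq, k1=1.2, b=0.75):
    """Textbook BM25 score of one query term against one field.

    tf          -- term frequency in the field
    doc_len     -- length of the field in tokens
    avg_doc_len -- average field length over the index
    n_docs      -- number of documents in the index
    doc_freq    -- number of documents containing the term
    k1          -- controls how quickly term frequency saturates
    b           -- controls how strongly field length is normalized (0..1)
    """
    # Inverse document frequency, in the always-positive form.
    idf = math.log(1.0 + (n_docs - doc_freq + 0.5) / (doc_freq + 0.5))
    # Length normalization: longer-than-average fields are penalized.
    norm = k1 * (1.0 - b + b * doc_len / avg_doc_len)
    return idf * tf * (k1 + 1.0) / (tf + norm)


if __name__ == "__main__":
    # Same term frequency, short field versus long field (made-up numbers).
    short_field = bm25_term_score(tf=1, doc_len=5, avg_doc_len=10,
                                  n_docs=1000, doc_freq=10)
    long_field = bm25_term_score(tf=1, doc_len=50, avg_doc_len=10,
                                 n_docs=1000, doc_freq=10)
    print(short_field > long_field)
```

Raising b strengthens the length penalty, and lowering k1 makes repeated
occurrences of a term count for less. Those are the two knobs available for
tuning.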

Jörg



On Fri, Feb 14, 2014 at 4:36 PM, Ivan Brusic <[email protected]> wrote:

> Jörg,
>
> So if your shards are at most 5GB and you have 3x2 shards, then your data,
> per index, is "only" 30GB?
>
> Don't know why, maybe because it is what I used in Lucene, but I always
> set segments_per_tier and max_merge_at_once to the same value. I had
> them higher than the default of 10, but I slowly reduced them back to the
> default. How do you tune BM25? It is hard to debug since you cannot change
> similarities on the fly.
>
>
>
