On 3/10/2016 4:06 PM, Steven White wrote:
> Last question on this topic (maybe), wouldn't a commit at the very end take
> too long on 1 billion items?  Wouldn't a commit every, let's say, 10,000
> items be more efficient?

The behavior that I have witnessed suggests that commit speed on a
well-tuned index depends more on the autowarm config than anything
else.  The total size of the index might make a difference, but I
suspect that the slow commit times I've seen on large shards are just
from the autowarming -- each warming query takes longer if the index is
large.
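
Autowarming is configured per cache in solrconfig.xml.  As a rough
sketch (the cache class and sizes below are illustrative defaults, not
your actual config), the autowarmCount attribute is the knob that
controls how many warming queries run on each new searcher:

```xml
<!-- In the <query> section of solrconfig.xml.  Values are examples
     only; a lower autowarmCount means faster commits, at the cost of
     cold caches on the new searcher. -->
<filterCache class="solr.FastLRUCache"
             size="512"
             initialSize="512"
             autowarmCount="16"/>
<queryResultCache class="solr.LRUCache"
                  size="512"
                  initialSize="512"
                  autowarmCount="16"/>
```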

If you have the autoCommit config I recommended, the "last" commit
should be very fast, because those auto commits will flush data to disk
as you index, and the final manual commit should only need to deal with
data that has not yet been flushed.
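
For reference, an autoCommit setup along the lines discussed goes in
the updateHandler section of solrconfig.xml.  The maxTime value below
is a commonly used starting point, not necessarily the exact number
recommended earlier in this thread:

```xml
<!-- Hard commit every 15 seconds (example value) to flush data and
     rotate transaction logs.  openSearcher=false keeps these commits
     cheap by skipping searcher reopening and autowarming. -->
<autoCommit>
  <maxTime>15000</maxTime>
  <openSearcher>false</openSearcher>
</autoCommit>
```

Because openSearcher is false, these automatic commits make data
durable without making it visible; visibility still comes from your
manual commit (or a softCommit) at the end.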

More info than you wanted (TL;DR):  Even if you don't do the autoCommit,
you'll find that indexing tons of data without any commit at all *will*
cause older segments to be flushed to disk ... but the transaction logs
won't be rotated, and that's a whole separate problem.

Thanks,
Shawn
