Currently IndexWriter.flushRamSegments() always merges all ram segments to
disk. Later it may merge more, depending on the maybe-merge algorithm. This
happens when the index is closed and when the number of (1 doc) ram
segments exceeds max-buffered-docs.

Can there be a performance penalty for always merging to disk first?

Assume the following merges take place:
  merging segments _ram_0 (1 docs) _ram_1 (1 docs) ... _ram_N (1 docs) into
_a (N docs)
  merging segments _a (N docs) _6 (M docs) _7 (K docs) _8 (L docs) into _b
(N+M+K+L docs)

Alternatively, we could tell (compute) that this is going to happen, and
have a single merge:
  merging segments _ram_0 (1 docs) _ram_1 (1 docs) ... _ram_N (1 docs)
                   _6 (M docs) _7 (K docs) _8 (L docs) into _b (N+M+K+L
docs)

This would save writing the segment of size N to disk and reading it back
again. For large enough N, is there a real potential saving here?
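To make the idea concrete, here is a minimal sketch (not Lucene's actual
API; the class, method names, and the simplistic merge test are all
hypothetical) of "computing that this is going to happen": before flushing
the ram segments as a new segment of N docs, check whether the maybe-merge
step would immediately pick that segment up together with trailing disk
segments, and if so plan one combined merge instead of two.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch: plan a single combined merge when flushing the
// ram segments would immediately trigger another merge anyway.
public class CombinedMergePlan {

    // ramDocs: number of buffered (1 doc) ram segments to flush (= N).
    // diskSegs: doc counts of the trailing on-disk segments, oldest first.
    // Returns the doc counts covered by one combined merge: the disk
    // segments the maybe-merge step would have re-merged, plus the ram docs.
    static List<Integer> planMerge(int ramDocs, List<Integer> diskSegs,
                                   int mergeFactor) {
        List<Integer> toMerge = new ArrayList<>();
        // Simplistic stand-in for the maybe-merge test: if the flushed
        // segment would bring the tail up to mergeFactor segments, those
        // disk segments would be merged again right away -- so fold them
        // into the same merge and skip writing the intermediate segment.
        if (diskSegs.size() + 1 >= mergeFactor) {
            toMerge.addAll(diskSegs);
        }
        toMerge.add(ramDocs); // the ram docs are written in any case
        return toMerge;
    }

    public static void main(String[] args) {
        // _6 (10 docs) _7 (20 docs) _8 (30 docs) on disk, N=5 buffered
        // docs, mergeFactor 4: flushing alone would produce _a (5 docs)
        // and then immediately merge _6 _7 _8 _a into _b (65 docs), so
        // plan one combined merge and never write _a.
        List<Integer> plan = planMerge(5, List.of(10, 20, 30), 4);
        System.out.println("combined merge covers " + plan);
    }
}
```

With a plan like this, the intermediate segment _a of size N is never
written or re-read; the larger N is, the more I/O that avoids.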

