On 8/16/06, Doron Cohen <[EMAIL PROTECTED]> wrote:
Under-merging would hurt search, unless optimize is called explicitly, but
the index should "behave" without requiring the user to call optimize. 388
deals with this.

Depends on what you mean by "behave" :-)
More segments than expected can cause failure because of file
descriptor exhaustion.  It's nice to have a calculable cap on the
number of segments. It also depends on exactly what one thinks the
index invariants should be w.r.t. mergeFactor.
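That cap can be roughed out from mergeFactor and maxBufferedDocs: at most
mergeFactor segments at each level, and roughly
1 + log_mergeFactor(numDocs / maxBufferedDocs) levels. A back-of-the-envelope
sketch (illustrative arithmetic only; segmentCap is not a Lucene method):

```java
public class SegmentCap {
    // Rough upper bound on total segment count, assuming at most
    // mergeFactor segments per level (illustrative, not Lucene code).
    static int segmentCap(long numDocs, int maxBufferedDocs, int mergeFactor) {
        int levels = 1;
        long levelSize = maxBufferedDocs;
        while (levelSize * mergeFactor <= numDocs) {
            levelSize *= mergeFactor;
            levels++;
        }
        return levels * mergeFactor;
    }

    public static void main(String[] args) {
        // e.g. 1M docs, maxBufferedDocs=10, mergeFactor=10: 6 levels, cap of 60
        System.out.println(segmentCap(1000000L, 10, 10)); // prints 60
    }
}
```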

Over-merging - in the current flushRamSegments() code - would merge at most
merge-factor documents prematurely.

Right.

 Since merge-factor is usually not very
large, this might be a minor issue - but still, if an index grows in
small increments, does it make sense to re-merge with the last disk segment each
time the index is closed? Why not let it be simply controlled by
maybeMergeSegments?

I personally see mergeFactor as the maximum number of segments at any
level in the index, with level defined by
docsInSegment/maxBufferedDocs.
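Concretely, that level function might be sketched like this (names and
boundaries are mine for illustration, not Lucene's actual code):

```java
public class SegmentLevel {
    // Level 0 holds segments up to maxBufferedDocs docs, level 1 up to
    // maxBufferedDocs * mergeFactor, and so on (illustrative only).
    static int level(int docsInSegment, int maxBufferedDocs, int mergeFactor) {
        int level = 0;
        long sizeCeiling = (long) maxBufferedDocs * mergeFactor;
        while (docsInSegment >= sizeCeiling) {
            sizeCeiling *= mergeFactor;
            level++;
        }
        return level;
    }

    public static void main(String[] args) {
        System.out.println(level(10, 10, 10));    // prints 0
        System.out.println(level(100, 10, 10));   // prints 1
        System.out.println(level(10000, 10, 10)); // prints 3
    }
}
```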

maybeMergeSegments doesn't enforce this in the presence of partially
filled segments because it counts documents and not segments.  Since
partially filled segments aren't written in a single IndexWriter
session, this only needs to be checked for on a close().
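A toy illustration of the gap, assuming a trigger that compares a document
count against a target rather than counting segments (simplified, not the
actual maybeMergeSegments logic): many sessions that each close after
buffering only a few docs can pile up far more than mergeFactor partial
segments before the document count ever reaches the threshold.

```java
import java.util.Collections;
import java.util.List;

public class PartialSegments {
    // Document-count trigger: fires once the docs at a level reach
    // maxBufferedDocs (simplified stand-in for the real logic).
    static boolean docCountTrigger(List<Integer> segmentDocCounts, int maxBufferedDocs) {
        int total = 0;
        for (int d : segmentDocCounts) total += d;
        return total >= maxBufferedDocs;
    }

    // Segment-count trigger: fires once mergeFactor segments accumulate.
    static boolean segmentCountTrigger(List<Integer> segmentDocCounts, int mergeFactor) {
        return segmentDocCounts.size() >= mergeFactor;
    }

    public static void main(String[] args) {
        // 100 sessions, each flushing a single-doc partial segment on close():
        List<Integer> segs = Collections.nCopies(100, 1);
        System.out.println(docCountTrigger(segs, 1000));   // prints false
        System.out.println(segmentCountTrigger(segs, 10)); // prints true
    }
}
```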

-Yonik
http://incubator.apache.org/solr Solr, the open-source Lucene search server
