On 8/16/06, Doron Cohen <[EMAIL PROTECTED]> wrote:
> Under-merging would hurt search, unless optimize is called explicitly, but the index should "behave" without requiring the user to call optimize. 388 deals with this.
Depends on what you mean by "behave" :-) More segments than expected can cause failure because of file descriptor exhaustion. It's nice to have a calculable cap on the number of segments. It also depends on exactly what one thinks the index invariants should be w.r.t. mergeFactor.
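That cap can be made concrete with a small sketch. This is illustrative, not Lucene code: the class name `SegmentCap` and the idealized assumption that each level holds at most mergeFactor - 1 segments (the mergeFactor-th triggers a merge) are mine.

```java
// Illustrative sketch, not Lucene's actual code.
// Under the idealized invariant that each "level" holds at most
// mergeFactor - 1 segments, the total segment count is bounded by
// (mergeFactor - 1) * numLevels, and the number of levels grows only
// logarithmically with index size.
public class SegmentCap {
    static int segmentCap(long numDocs, int maxBufferedDocs, int mergeFactor) {
        int levels = 1;                  // level 0 always exists
        long levelSize = maxBufferedDocs;
        while (levelSize < numDocs) {    // each level covers mergeFactor x more docs
            levelSize *= mergeFactor;
            levels++;
        }
        return (mergeFactor - 1) * levels;
    }
}
```

With mergeFactor=10 and maxBufferedDocs=10, a million-document index would stay under roughly 54 segments in this model, which is why a calculable cap matters for file descriptor budgeting.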
> Over-merging - in current flushRamSegments() code - would merge at most merge-factor documents prematurely.
Right.
> Since merge-factor is usually not very large, this might be a minor issue - but still, if an index is growing in small doses, does it make sense to re-merge with the last disk segment each time the index is closed? Why not let it be controlled simply by maybeMergeSegments?
I personally see mergeFactor as the maximum number of segments at any level in the index, with level defined by docsInSegment/maxBufferedDocs. maybeMergeSegments doesn't enforce this in the presence of partially filled segments because it counts documents and not segments. Since partially filled segments aren't written in a single IndexWriter session, this only needs to be checked for on a close().

-Yonik
http://incubator.apache.org/solr Solr, the open-source Lucene search server
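The level definition Yonik describes (bucketing segments by docsInSegment relative to maxBufferedDocs, in powers of mergeFactor) could be sketched as follows. This is an illustrative reconstruction, not Lucene's actual maybeMergeSegments code; the class and method names are mine.

```java
public class SegmentLevel {
    // Illustrative sketch of the level definition discussed above:
    // level 0 holds segments of up to maxBufferedDocs docs, level 1 up to
    // maxBufferedDocs * mergeFactor docs, and so on. The invariant would
    // then be: at most mergeFactor segments at any one level.
    static int level(int docsInSegment, int maxBufferedDocs, int mergeFactor) {
        int lvl = 0;
        long upperBound = maxBufferedDocs;
        while (docsInSegment > upperBound) {
            upperBound *= mergeFactor;
            lvl++;
        }
        return lvl;
    }
}
```

Counting segments per level this way, rather than counting documents, is what would let a close()-time check catch partially filled segments that the document-count heuristic misses.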
