[ https://issues.apache.org/jira/browse/LUCENE-847?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Ning Li updated LUCENE-847:
---------------------------

    Attachment: concurrentMerge.patch

Here is a patch for concurrent merge as discussed in:
http://www.gossamer-threads.com/lists/lucene/java-dev/45651?search_string=concurrent%20merge;#45651

I put it under this issue because it helps design and verify a factored merge policy which would provide good support for concurrent merge.

As described before, a merge thread is started when a writer is created and stopped when the writer is closed. The merge process consists of three steps: first, create a merge task/spec; then, carry out the actual merge; finally, "commit" the merged segment (replace segments it merged in segmentInfos), but only after appropriate deletes are applied. The first and last steps are fast and synchronous. The second step is where concurrency is achieved. Does it make sense to capture them as separate steps in the factored merge policy?

As discussed in http://www.gossamer-threads.com/lists/lucene/java-dev/45651?search_string=concurrent%20merge;#45651: documents can be buffered while segments are merged, but no more than maxBufferedDocs can be buffered at any time. So this version provides limited concurrency. The main goal is to achieve short ingestion hiccups, especially when the ingestion rate is low. After the factored merge policy, we could provide different versions of concurrent merge policies which provide different levels of concurrency. :-)

All unit tests pass. If IndexWriter is replaced with IndexWriterConcurrentMerge, all unit tests pass except the following:

- TestAddIndexesNoOptimize and TestIndexWriter*
  This is because they check segment sizes expecting all merges are done. These tests pass if these checks are performed after the concurrent merges finish. The modified tests (with waits for concurrent merges to finish) are in TestIndexWriterConcurrentMerge*.
- testExactFieldNames in TestBackwardCompatibility and testDeleteLeftoverFiles in TestIndexFileDeleter
  In both cases, file name segments_a is expected, but the actual is segments_7. This is because with concurrent merge, if compound file is used, only the compound version is "committed" (added to segmentInfos), not the non-compound version, thus the lower segments generation number.

Cheers,
Ning

> Factor merge policy out of IndexWriter
> --------------------------------------
>
>                 Key: LUCENE-847
>                 URL: https://issues.apache.org/jira/browse/LUCENE-847
>             Project: Lucene - Java
>          Issue Type: Improvement
>            Reporter: Steven Parkes
>         Assigned To: Steven Parkes
>         Attachments: concurrentMerge.patch, LUCENE-847.txt
>
>
> If we factor the merge policy out of IndexWriter, we can make it pluggable,
> making it possible for apps to choose a custom merge policy and for easier
> experimenting with merge policy variants.
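
For illustration, the three-step merge process described in the comment above (create a merge task/spec, carry out the merge, then "commit" the merged segment) could be sketched roughly as follows. This is not the code in concurrentMerge.patch. The class ConcurrentMergeSketch and its methods createSpec, doMerge and commit are hypothetical names, segments are modelled as plain document counts, only one merge runs at a time, and the step of applying deletes before the commit is left as a comment.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical sketch of a three-step merge; not the actual Lucene patch. */
public class ConcurrentMergeSketch {

    /** A merge task: the segments (modelled as document counts) chosen for merging. */
    static final class MergeSpec {
        final List<Integer> sources;
        MergeSpec(List<Integer> sources) { this.sources = sources; }
    }

    // Stand-in for segmentInfos: each entry is one segment's document count.
    private final List<Integer> segmentInfos = new ArrayList<>();

    /** Simulates flushing a new segment; new segments are always appended. */
    synchronized void addSegment(int docCount) { segmentInfos.add(docCount); }

    /** Returns a snapshot of the current segment list. */
    synchronized List<Integer> segments() { return new ArrayList<>(segmentInfos); }

    /** Step 1 (fast, synchronous): pick the first `count` segments to merge. */
    synchronized MergeSpec createSpec(int count) {
        return new MergeSpec(new ArrayList<>(segmentInfos.subList(0, count)));
    }

    /** Step 2 (slow, holds no lock): the actual merge, here just a sum of doc counts. */
    static int doMerge(MergeSpec spec) {
        int total = 0;
        for (int docs : spec.sources) total += docs;
        return total;
    }

    /** Step 3 (fast, synchronous): replace the merged segments with the result. */
    synchronized void commit(MergeSpec spec, int mergedDocCount) {
        // A real implementation would apply pending deletes here first.
        // The sources are still at the front because new segments are only
        // appended and this sketch assumes a single merge in flight.
        segmentInfos.subList(0, spec.sources.size()).clear();
        segmentInfos.add(0, mergedDocCount);
    }

    public static void main(String[] args) throws InterruptedException {
        ConcurrentMergeSketch writer = new ConcurrentMergeSketch();
        writer.addSegment(10);
        writer.addSegment(10);
        writer.addSegment(10);

        MergeSpec spec = writer.createSpec(3);
        Thread mergeThread = new Thread(() -> writer.commit(spec, doMerge(spec)));
        mergeThread.start();

        writer.addSegment(5); // ingestion continues while the merge runs
        mergeThread.join();

        System.out.println(writer.segments()); // prints [30, 5]
    }
}
```

Only steps 1 and 3 take the lock in this sketch, so adding a segment is never blocked for the duration of the merge itself, which is the short-hiccup property the comment aims for.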