[ 
https://issues.apache.org/jira/browse/LUCENE-847?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ning Li updated LUCENE-847:
---------------------------

    Attachment: concurrentMerge.patch

Here is a patch for concurrent merge as discussed in:
http://www.gossamer-threads.com/lists/lucene/java-dev/45651?search_string=concurrent%20merge;#45651

I put it under this issue because it helps in designing and verifying a 
factored merge policy that provides good support for concurrent merge.

As described before, a merge thread is started when a writer is created and 
stopped when the writer is closed. The merge process consists of three steps: 
first, create a merge task/spec; then, carry out the actual merge; finally, 
"commit" the merged segment (replace the segments it merged in segmentInfos), 
but only after the appropriate deletes are applied. The first and last steps 
are fast and synchronous. The second step is where concurrency is achieved. 
Does it make sense to capture these as separate steps in the factored merge 
policy?
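
To make the three steps concrete, here is a minimal sketch. It is not taken 
from concurrentMerge.patch: the class and method names are hypothetical, 
segments are reduced to plain document counts, and "fast and synchronous" is 
modelled (as an assumption) by Java synchronized methods on the writer.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration only; none of these names come from the patch.
public class ThreeStepMergeSketch {
    // Stand-in for segmentInfos: each entry is one segment's document count.
    private final List<Integer> segmentInfos = new ArrayList<>();

    public synchronized void addSegment(int docCount) {
        segmentInfos.add(docCount);
    }

    public synchronized List<Integer> snapshot() {
        return new ArrayList<>(segmentInfos);
    }

    // Step 1 (fast, holds the lock): choose the segments to merge.
    private synchronized List<Integer> createMergeSpec(int mergeFactor) {
        if (segmentInfos.size() < mergeFactor) {
            return null;
        }
        return new ArrayList<>(segmentInfos.subList(0, mergeFactor));
    }

    // Step 2 (slow, no lock held): the actual merge. New segments can be
    // added to segmentInfos while this runs.
    private static int doMerge(List<Integer> spec) {
        int total = 0;
        for (int docs : spec) {
            total += docs;
        }
        return total;
    }

    // Step 3 (fast, holds the lock): replace the merged segments with the
    // result. A real writer would first apply pending deletes here.
    private synchronized void commit(List<Integer> spec, int mergedSegment) {
        // With a single merge thread and append-only adds, the merged
        // segments are still the first spec.size() entries.
        segmentInfos.subList(0, spec.size()).clear();
        segmentInfos.add(0, mergedSegment);
    }

    public boolean mergeOnce(int mergeFactor) {
        List<Integer> spec = createMergeSpec(mergeFactor);
        if (spec == null) {
            return false;
        }
        int merged = doMerge(spec);
        commit(spec, merged);
        return true;
    }

    public static void main(String[] args) throws InterruptedException {
        ThreeStepMergeSketch writer = new ThreeStepMergeSketch();
        for (int i = 0; i < 3; i++) {
            writer.addSegment(10);
        }
        Thread mergeThread = new Thread(() -> writer.mergeOnce(3));
        mergeThread.start();
        writer.addSegment(5); // ingestion continues while the merge runs
        mergeThread.join();
        System.out.println(writer.snapshot());
    }
}
```

Only steps 1 and 3 touch the shared segment list, so the lock is held briefly 
and the expensive step 2 can overlap with ingestion.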

As discussed in 
http://www.gossamer-threads.com/lists/lucene/java-dev/45651?search_string=concurrent%20merge;#45651 
(the same thread as above), documents can be buffered while segments are 
merged, but no more than maxBufferedDocs can be buffered at any time, so this 
version provides limited concurrency. The main goal is to keep ingestion 
hiccups short, especially when the ingestion rate is low. Once the merge 
policy is factored out, we could provide different concurrent merge policies 
that offer different levels of concurrency. :-)
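
The buffering limit can be pictured with a small sketch. This is only one 
reading of the behaviour, not code from the patch: it assumes that a full 
buffer cannot be flushed while a merge is in progress, and every name in it 
(BoundedBufferSketch, tryAddDocument, setMergeInProgress) is hypothetical.

```java
// Hypothetical illustration; assumes a full buffer must wait for the
// running merge before it can be flushed.
public class BoundedBufferSketch {
    private final int maxBufferedDocs;
    private int bufferedDocs = 0;
    private int flushedSegments = 0;
    private boolean mergeInProgress = false;

    public BoundedBufferSketch(int maxBufferedDocs) {
        this.maxBufferedDocs = maxBufferedDocs;
    }

    public synchronized void setMergeInProgress(boolean inProgress) {
        mergeInProgress = inProgress;
    }

    // Returns false when the caller has to wait: the buffer is full and the
    // merge has not finished yet. This is the "ingestion hiccup".
    public synchronized boolean tryAddDocument() {
        if (bufferedDocs == maxBufferedDocs) {
            if (mergeInProgress) {
                return false;
            }
            // No merge running: flush the buffer as a new segment.
            bufferedDocs = 0;
            flushedSegments++;
        }
        bufferedDocs++;
        return true;
    }

    public synchronized int bufferedDocs() {
        return bufferedDocs;
    }

    public synchronized int flushedSegments() {
        return flushedSegments;
    }
}
```

With a low ingestion rate the buffer rarely fills before the merge finishes, 
so tryAddDocument would seldom return false under this model.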

All unit tests pass. If IndexWriter is replaced with 
IndexWriterConcurrentMerge, all unit tests pass except the following:
  - TestAddIndexesNoOptimize and TestIndexWriter*
    These tests check segment sizes on the assumption that all merges are 
    done. They pass if the checks are performed after the concurrent merges 
    finish. The modified tests (which wait for concurrent merges to finish) 
    are in TestIndexWriterConcurrentMerge*.
  - testExactFieldNames in TestBackwardCompatibility and 
    testDeleteLeftoverFiles in TestIndexFileDeleter
    In both cases, the file name segments_a is expected, but the actual name 
    is segments_7. With concurrent merge, if the compound file format is 
    used, only the compound version is "committed" (added to segmentInfos), 
    not the non-compound version, hence the lower segments generation number.
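
For reference, the suffix of a segments_N file name is the generation number 
written in base 36, so the two names above differ by three generations. The 
helper below is a small standalone sketch (the class and method names are 
made up for illustration and are not Lucene API):

```java
// Hypothetical helper, not Lucene API: converts between a generation number
// and a segments_N file name using base 36 (Character.MAX_RADIX).
public class SegmentsGenerationSketch {
    private static final String PREFIX = "segments_";

    static String fileNameFor(long generation) {
        return PREFIX + Long.toString(generation, Character.MAX_RADIX);
    }

    static long generationOf(String fileName) {
        String suffix = fileName.substring(PREFIX.length());
        return Long.parseLong(suffix, Character.MAX_RADIX);
    }

    public static void main(String[] args) {
        System.out.println(generationOf("segments_a")); // prints 10
        System.out.println(generationOf("segments_7")); // prints 7
    }
}
```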

Cheers,
Ning


> Factor merge policy out of IndexWriter
> --------------------------------------
>
>                 Key: LUCENE-847
>                 URL: https://issues.apache.org/jira/browse/LUCENE-847
>             Project: Lucene - Java
>          Issue Type: Improvement
>            Reporter: Steven Parkes
>         Assigned To: Steven Parkes
>         Attachments: concurrentMerge.patch, LUCENE-847.txt
>
>
> If we factor the merge policy out of IndexWriter, we can make it pluggable, 
> making it possible for apps to choose a custom merge policy and for easier 
> experimenting with merge policy variants.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.