[ 
https://issues.apache.org/jira/browse/LUCENE-3855?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13226393#comment-13226393
 ] 

Michael McCandless commented on LUCENE-3855:
--------------------------------------------

bq. Mike, would it help if we dumped a linear sequence of each thread's ops on 
indexwriter/ segmentinfos, whatever else?

Thanks Dawid!

I actually know the root cause here:
{noformat}
Uncaught exception by thread: Thread[Lucene Merge Thread #72,6,main]
org.apache.lucene.index.MergePolicy$MergeException: java.lang.AssertionError
        at org.apache.lucene.index.ConcurrentMergeScheduler.handleMergeException(ConcurrentMergeScheduler.java:509)
        at org.apache.lucene.index.ConcurrentMergeScheduler$MergeThread.run(ConcurrentMergeScheduler.java:480)
Caused by: java.lang.AssertionError
        at org.apache.lucene.index.IndexWriter.commitMergedDeletes(IndexWriter.java:3028)
        at org.apache.lucene.index.IndexWriter.commitMerge(IndexWriter.java:3137)
        at org.apache.lucene.index.IndexWriter.mergeMiddle(IndexWriter.java:3718)
        at org.apache.lucene.index.IndexWriter.merge(IndexWriter.java:3257)
        at org.apache.lucene.index.ConcurrentMergeScheduler.doMerge(ConcurrentMergeScheduler.java:382)
        at org.apache.lucene.index.ConcurrentMergeScheduler$MergeThread.run(ConcurrentMergeScheduler.java:451)
{noformat}

Once that assert trips, all kinds of other exceptions can follow 
(unclosed files, not-live SegmentInfo, etc.).

After the merge finishes (which can take a long time), commitMergedDeletes 
revisits each source segment so it can "carry forward" any new deletions 
recorded against that segment onto the newly merged segment.  In an active NRT 
app there can be many deletes to carry forward...
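
As a rough illustration of the carry-forward step, here is a minimal sketch. It is not the actual IndexWriter code: the class CarryForwardDeletesSketch, the SourceSegment type, and the helper names are made up for this example, and it assumes the merged segment simply concatenates the documents that were live when the merge started.

```java
import java.util.Arrays;
import java.util.BitSet;
import java.util.List;

/** Hypothetical sketch; none of these names exist in Lucene. */
final class CarryForwardDeletesSketch {

  /** Stand-in for one source segment of a merge. */
  static final class SourceSegment {
    final int maxDoc;
    final BitSet liveAtMergeStart; // live docs snapshot the merge worked from
    final BitSet liveNow;          // live docs when the merge is committed

    SourceSegment(int maxDoc, BitSet liveAtMergeStart, BitSet liveNow) {
      this.maxDoc = maxDoc;
      this.liveAtMergeStart = liveAtMergeStart;
      this.liveNow = liveNow;
    }
  }

  /** Builds a live-docs bit set with all docs live except the given ones. */
  static BitSet liveDocs(int maxDoc, int... deleted) {
    BitSet live = new BitSet(maxDoc);
    live.set(0, maxDoc);
    for (int doc : deleted) {
      live.clear(doc);
    }
    return live;
  }

  /** Returns the merged-segment doc IDs that were deleted while the merge ran. */
  static BitSet carryForward(List<SourceSegment> sources) {
    BitSet mergedDeletes = new BitSet();
    int mergedDocId = 0;
    for (SourceSegment seg : sources) {
      for (int doc = 0; doc < seg.maxDoc; doc++) {
        if (!seg.liveAtMergeStart.get(doc)) {
          continue; // already deleted before the merge, so the merge dropped it
        }
        if (!seg.liveNow.get(doc)) {
          mergedDeletes.set(mergedDocId); // deleted during the merge: carry forward
        }
        mergedDocId++;
      }
    }
    return mergedDeletes;
  }

  public static void main(String[] args) {
    // Segment A: doc 1 deleted before the merge, doc 3 deleted during it.
    SourceSegment a = new SourceSegment(4, liveDocs(4, 1), liveDocs(4, 1, 3));
    // Segment B: doc 0 deleted during the merge.
    SourceSegment b = new SourceSegment(3, liveDocs(3), liveDocs(3, 0));
    System.out.println(carryForward(Arrays.asList(a, b))); // prints {2, 3}
  }
}
```

The point of the sketch is only the doc ID remapping: a delete that arrives mid-merge is recorded against the old segment's numbering and has to be translated to the merged segment's numbering at commit time.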

The assert that tripped verifies that the ReadersAndLiveDocs (RLD) is still 
present in IW's ReaderPool; it's supposed to remain present throughout merging 
because we had incRef'd the SegmentReader we opened for merging.

But it can in fact be dropped (the bug here): another thread can open a 
reader, apply deletes, and decRef the reader, all 1) after the merge thread 
acquired the RLD but 2) before it opened the mergeReader from it.

I (accidentally!!) caused this with LUCENE-3631, where we moved writeable 
deletes from SegmentReader into IndexWriter.  I suspect we need to add a 
separate refCount to RLD to fix this... I'm working on that.
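
To make the idea of a separate refCount concrete, here is a minimal sketch of a pool whose entries carry their own reference count, independent of any reader they hold. This is an assumption about the general shape of such a fix, not the actual patch: RefCountedPoolSketch, Entry, acquire, release, and contains are hypothetical names, not Lucene APIs.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical sketch of a pool with per-entry ref counts; not Lucene code. */
final class RefCountedPoolSketch {

  /** Stand-in for a pooled per-segment entry such as an RLD. */
  static final class Entry {
    final String segmentName;
    private int refCount; // guarded by the pool's monitor

    Entry(String segmentName) {
      this.segmentName = segmentName;
    }
  }

  private final Map<String, Entry> entries = new HashMap<>();

  /** Looks up or creates the entry and takes a reference in one atomic step. */
  synchronized Entry acquire(String segmentName) {
    Entry e = entries.computeIfAbsent(segmentName, Entry::new);
    e.refCount++;
    return e;
  }

  /** Drops one reference; the entry leaves the pool only when none remain. */
  synchronized void release(Entry e) {
    if (e.refCount <= 0) {
      throw new IllegalStateException("too many releases for " + e.segmentName);
    }
    if (--e.refCount == 0) {
      entries.remove(e.segmentName, e);
    }
  }

  synchronized boolean contains(String segmentName) {
    return entries.containsKey(segmentName);
  }

  public static void main(String[] args) {
    RefCountedPoolSketch pool = new RefCountedPoolSketch();
    Entry forMerge = pool.acquire("_ng"); // merge thread takes its reference
    Entry forNrt = pool.acquire("_ng");   // another thread opens a reader
    pool.release(forNrt);                 // ...applies deletes and lets go
    System.out.println(pool.contains("_ng")); // prints true: merge still holds it
    pool.release(forMerge);
    System.out.println(pool.contains("_ng")); // prints false
  }
}
```

Because the merge thread's reference is taken in the same synchronized step as the lookup, there is no window between acquiring the entry and pinning it in which another thread's release could drop it from the pool.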
                
> TestStressNRT failures (reproducible)
> -------------------------------------
>
>                 Key: LUCENE-3855
>                 URL: https://issues.apache.org/jira/browse/LUCENE-3855
>             Project: Lucene - Java
>          Issue Type: Bug
>            Reporter: Dawid Weiss
>            Assignee: Michael McCandless
>            Priority: Minor
>             Fix For: 4.0
>
>         Attachments: 
> hoss-r1298470-fixed-seed__TEST-org.apache.lucene.index.TestStressNRT.xml, 
> output1.log, output2.log, output3.log, output4.log
>
>
> Build server logs. Reproduces on at least two machines.
> {noformat}
>     [junit] ------------- Standard Error -----------------
>     [junit] NOTE: reproduce with: ant test -Dtestcase=TestStressNRT -Dtestmethod=test -Dtests.seed=69468941c1bbf693:19e66d58475da929:69e9d2f81769b6d0 -Dargs="-Dfile.encoding=UTF-8"
>     [junit] NOTE: test params are: codec=Lucene3x, sim=RandomSimilarityProvider(queryNorm=true,coord=false): {}, locale=ro, timezone=Etc/GMT+1
>     [junit] NOTE: all tests run in this JVM:
>     [junit] [TestStressNRT]
>     [junit] NOTE: Linux 3.0.0-16-generic amd64/Sun Microsystems Inc. 1.6.0_27 (64-bit)/cpus=2,threads=1,free=74960064,total=135987200
>     [junit] ------------- ---------------- ---------------
>     [junit] Testcase: test(org.apache.lucene.index.TestStressNRT):    Caused an ERROR
>     [junit] MockDirectoryWrapper: cannot close: there are still open files: {_ng.cfs=8}
>     [junit] java.lang.RuntimeException: MockDirectoryWrapper: cannot close: there are still open files: {_ng.cfs=8}
>     [junit]   at org.apache.lucene.store.MockDirectoryWrapper.close(MockDirectoryWrapper.java:555)
>     [junit]   at org.apache.lucene.index.TestStressNRT.test(TestStressNRT.java:385)
>     [junit]   at org.apache.lucene.util.LuceneTestCase$SubclassSetupTeardownRule$1.evaluate(LuceneTestCase.java:743)
>     [junit]   at org.apache.lucene.util.LuceneTestCase$InternalSetupTeardownRule$1.evaluate(LuceneTestCase.java:639)
>     [junit]   at org.apache.lucene.util.SystemPropertiesInvariantRule$1.evaluate(SystemPropertiesInvariantRule.java:22)
>     [junit]   at org.apache.lucene.util.LuceneTestCase$TestResultInterceptorRule$1.evaluate(LuceneTestCase.java:538)
>     [junit]   at org.apache.lucene.util.LuceneTestCase$RememberThreadRule$1.evaluate(LuceneTestCase.java:600)
>     [junit]   at org.apache.lucene.util.LuceneTestCaseRunner.runChild(LuceneTestCaseRunner.java:164)
>     [junit]   at org.apache.lucene.util.LuceneTestCaseRunner.runChild(LuceneTestCaseRunner.java:57)
>     [junit]   at org.apache.lucene.util.StoreClassNameRule$1.evaluate(StoreClassNameRule.java:21)
>     [junit]   at org.apache.lucene.util.SystemPropertiesInvariantRule$1.evaluate(SystemPropertiesInvariantRule.java:22)
>     [junit] Caused by: java.lang.RuntimeException: unclosed IndexInput: _ng.cfs
>     [junit]   at org.apache.lucene.store.MockDirectoryWrapper.addFileHandle(MockDirectoryWrapper.java:479)
>     [junit]   at org.apache.lucene.store.MockDirectoryWrapper$1.openSlice(MockDirectoryWrapper.java:777)
>     [junit]   at org.apache.lucene.store.CompoundFileDirectory.openInput(CompoundFileDirectory.java:221)
>     [junit]   at org.apache.lucene.codecs.lucene3x.TermInfosReader.<init>(TermInfosReader.java:112)
>     [junit]   at org.apache.lucene.codecs.lucene3x.Lucene3xFields.<init>(Lucene3xFields.java:84)
>     [junit]   at org.apache.lucene.codecs.lucene3x.PreFlexRWPostingsFormat$1.<init>(PreFlexRWPostingsFormat.java:51)
>     [junit]   at org.apache.lucene.codecs.lucene3x.PreFlexRWPostingsFormat.fieldsProducer(PreFlexRWPostingsFormat.java:51)
>     [junit]   at org.apache.lucene.index.SegmentCoreReaders.<init>(SegmentCoreReaders.java:108)
>     [junit]   at org.apache.lucene.index.SegmentReader.<init>(SegmentReader.java:51)
>     [junit]   at org.apache.lucene.index.IndexWriter$ReadersAndLiveDocs.getMergeReader(IndexWriter.java:521)
>     [junit]   at org.apache.lucene.index.IndexWriter.mergeMiddle(IndexWriter.java:3587)
>     [junit]   at org.apache.lucene.index.IndexWriter.merge(IndexWriter.java:3257)
>     [junit]   at org.apache.lucene.index.ConcurrentMergeScheduler.doMerge(ConcurrentMergeScheduler.java:382)
>     [junit]   at org.apache.lucene.index.ConcurrentMergeScheduler$MergeThread.run(ConcurrentMergeScheduler.java:451)
>     [junit] 
>     [junit] 
>     [junit] Test org.apache.lucene.index.TestStressNRT FAILED
> {noformat}
