[
https://issues.apache.org/jira/browse/LUCENE-3855?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13225085#comment-13225085
]
Dawid Weiss commented on LUCENE-3855:
-------------------------------------
Yep, there is something severely wrong in there, but I won't be able to figure
it out on my own; I don't understand the logic in IndexWriter well enough. I did
track one of the above exceptions to this scenario:
{noformat}
[junit] junit.framework.AssertionFailedError: info=_dm(4.0):cv6/4 isn't live
[junit] at org.apache.lucene.index.IndexWriter$ReaderPool.infoIsLive(IndexWriter.java:663)
[junit] at org.apache.lucene.index.IndexWriter$ReaderPool.dropAll(IndexWriter.java:717)
[junit] at org.apache.lucene.index.IndexWriter.closeInternal(IndexWriter.java:1136)
[junit] at org.apache.lucene.index.IndexWriter.close(IndexWriter.java:1069)
[junit] at org.apache.lucene.index.IndexWriter.close(IndexWriter.java:1033)
[junit] at org.apache.lucene.index.RandomIndexWriter.close(RandomIndexWriter.java:408)
[junit] at org.apache.lucene.index.TestStressNRT.test(TestStressNRT.java:380)
{noformat}
So the assertion that fails is this check in infoIsLive:
{noformat}
int idx = segmentInfos.indexOf(info);
assert idx != -1: "info=" + info + " isn't live";
{noformat}
I added tracing to segmentInfos that logs each segment as it is removed from
the underlying array. Running the test with that tracing gives this listing:
{noformat}
...
>>> Removing: _o1(4.0):Cv13/13
>>> Removing: _o0(4.0):Cv6/6
>>> Removing: _nu(4.0):C12/12
>>> Removing: _nw(4.0):Cv12/12
>>> Removing: _o9(4.0):c2/2
>>> Removing: _q2(4.0):C12/12
>>> Removing: _qc(4.0):Cv7/7
>>> Removing: _q6(4.0):C12/12
>>> Not found: _d9(4.0):Cv8/3
{noformat}
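For reference, tracing of this kind can be sketched roughly as below. This is a minimal stand-in, not the actual patch: TracingSegmentList, its String-based segments and the checkLive method are hypothetical names, and it assumes the "Not found" line is emitted by an indexOf-style lookup that misses.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for the segment list; this is not Lucene's SegmentInfos.
class TracingSegmentList {
    private final List<String> segments = new ArrayList<>();
    private final List<String> trace = new ArrayList<>();

    void add(String segment) {
        segments.add(segment);
    }

    // Logs a line only when the segment was actually present and removed.
    boolean remove(String segment) {
        boolean removed = segments.remove(segment);
        if (removed) {
            log(">>> Removing: " + segment);
        }
        return removed;
    }

    // Mirrors an indexOf-based liveness check; logs when the segment is absent.
    boolean checkLive(String segment) {
        boolean live = segments.indexOf(segment) != -1;
        if (!live) {
            log(">>> Not found: " + segment);
        }
        return live;
    }

    private void log(String line) {
        trace.add(line);
        System.out.println(line);
    }

    // Returns every trace line logged so far, in order.
    List<String> trace() {
        return trace;
    }
}
```

With this in place, a segment that shows up as "Not found" without an earlier "Removing" line was never in the list while tracing was active.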
But that last segment (the one reported as not found) is never on the list of
removed segments. As far as I can tell it was never added to segmentInfos in
the first place. The allocation stack for that segment is:
{noformat}
at java.lang.Thread.getStackTrace(Thread.java:1436)
at org.apache.lucene.index.SegmentInfo.<init>(SegmentInfo.java:130)
at org.apache.lucene.index.IndexWriter._mergeInit(IndexWriter.java:3418)
at org.apache.lucene.index.IndexWriter.mergeInit(IndexWriter.java:3382)
at org.apache.lucene.index.ConcurrentMergeScheduler.merge(ConcurrentMergeScheduler.java:346)
at org.apache.lucene.index.IndexWriter.maybeMerge(IndexWriter.java:1905)
at org.apache.lucene.index.IndexWriter.maybeMerge(IndexWriter.java:1899)
at org.apache.lucene.index.IndexWriter.prepareCommit(IndexWriter.java:2726)
at org.apache.lucene.index.IndexWriter.commitInternal(IndexWriter.java:2809)
at org.apache.lucene.index.IndexWriter.commit(IndexWriter.java:2791)
at org.apache.lucene.index.IndexWriter.commit(IndexWriter.java:2775)
at org.apache.lucene.index.RandomIndexWriter.commit(RandomIndexWriter.java:313)
at org.apache.lucene.index.TestStressNRT$1.run(TestStressNRT.java:156)
{noformat}
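An allocation stack like the one above can be recorded by capturing the current thread's stack in the constructor. A minimal sketch, assuming a hypothetical TrackedInfo class rather than the real SegmentInfo:

```java
// Hypothetical class for illustration; this is not Lucene's SegmentInfo.
class TrackedInfo {
    final String name;
    // Stack of the constructing thread, captured once at allocation time.
    final StackTraceElement[] allocationStack;

    TrackedInfo(String name) {
        this.name = name;
        this.allocationStack = Thread.currentThread().getStackTrace();
    }

    // Formats the captured frames one per line, in the usual "at ..." style.
    String describeAllocation() {
        StringBuilder sb = new StringBuilder();
        for (StackTraceElement frame : allocationStack) {
            sb.append("at ").append(frame).append('\n');
        }
        return sb.toString();
    }
}
```

Printing describeAllocation() when a liveness check fails shows which code path created the offending object. Capturing a stack on every allocation is expensive, so this is only suitable as temporary debugging code.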
So this looks like a race condition between closing the writer and a concurrent
merge scheduler?
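To make the suspected interleaving concrete, here is a deliberately simplified and purely hypothetical model. It assumes a merge allocates and pools its new segment before that segment is added to the live list; whether IndexWriter really behaves this way is exactly the open question. WriterModel and all its methods are invented for illustration only.

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Hypothetical model; names echo the discussion but this is not Lucene code.
class WriterModel {
    private final List<String> segmentInfos = new ArrayList<>();  // live segments
    private final Set<String> readerPool = new LinkedHashSet<>(); // pooled entries

    synchronized void addLiveSegment(String info) {
        segmentInfos.add(info);
        readerPool.add(info);
    }

    // Assumption: merge start pools the new segment before it is live.
    synchronized void mergeInit(String merged) {
        readerPool.add(merged);
    }

    // Merge commit makes the merged segment live.
    synchronized void mergeCommit(String merged) {
        segmentInfos.add(merged);
    }

    // Close path: returns pooled entries that fail the indexOf liveness check.
    synchronized List<String> dropAll() {
        List<String> notLive = new ArrayList<>();
        for (String info : readerPool) {
            if (segmentInfos.indexOf(info) == -1) {
                notLive.add(info);
            }
        }
        readerPool.clear();
        return notLive;
    }
}
```

In this model, calling dropAll() between mergeInit("_d9") and mergeCommit("_d9") reports "_d9" as not live, which is the shape of the failure in the trace above.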
Let me know if you need any further stack traces or tracing listings. This is
fairly easy to reproduce on my machine and I can add anything.
> TestStressNRT failures (reproducible)
> -------------------------------------
>
> Key: LUCENE-3855
> URL: https://issues.apache.org/jira/browse/LUCENE-3855
> Project: Lucene - Java
> Issue Type: Bug
> Reporter: Dawid Weiss
> Priority: Minor
> Fix For: 4.0
>
> Attachments: output1.log, output2.log, output3.log, output4.log
>
>
> Build server logs. Reproduces on at least two machines.
> {noformat}
> [junit] ------------- Standard Error -----------------
> [junit] NOTE: reproduce with: ant test -Dtestcase=TestStressNRT -Dtestmethod=test -Dtests.seed=69468941c1bbf693:19e66d58475da929:69e9d2f81769b6d0 -Dargs="-Dfile.encoding=UTF-8"
> [junit] NOTE: test params are: codec=Lucene3x, sim=RandomSimilarityProvider(queryNorm=true,coord=false): {}, locale=ro, timezone=Etc/GMT+1
> [junit] NOTE: all tests run in this JVM:
> [junit] [TestStressNRT]
> [junit] NOTE: Linux 3.0.0-16-generic amd64/Sun Microsystems Inc. 1.6.0_27 (64-bit)/cpus=2,threads=1,free=74960064,total=135987200
> [junit] ------------- ---------------- ---------------
> [junit] Testcase: test(org.apache.lucene.index.TestStressNRT): Caused an ERROR
> [junit] MockDirectoryWrapper: cannot close: there are still open files: {_ng.cfs=8}
> [junit] java.lang.RuntimeException: MockDirectoryWrapper: cannot close: there are still open files: {_ng.cfs=8}
> [junit] at org.apache.lucene.store.MockDirectoryWrapper.close(MockDirectoryWrapper.java:555)
> [junit] at org.apache.lucene.index.TestStressNRT.test(TestStressNRT.java:385)
> [junit] at org.apache.lucene.util.LuceneTestCase$SubclassSetupTeardownRule$1.evaluate(LuceneTestCase.java:743)
> [junit] at org.apache.lucene.util.LuceneTestCase$InternalSetupTeardownRule$1.evaluate(LuceneTestCase.java:639)
> [junit] at org.apache.lucene.util.SystemPropertiesInvariantRule$1.evaluate(SystemPropertiesInvariantRule.java:22)
> [junit] at org.apache.lucene.util.LuceneTestCase$TestResultInterceptorRule$1.evaluate(LuceneTestCase.java:538)
> [junit] at org.apache.lucene.util.LuceneTestCase$RememberThreadRule$1.evaluate(LuceneTestCase.java:600)
> [junit] at org.apache.lucene.util.LuceneTestCaseRunner.runChild(LuceneTestCaseRunner.java:164)
> [junit] at org.apache.lucene.util.LuceneTestCaseRunner.runChild(LuceneTestCaseRunner.java:57)
> [junit] at org.apache.lucene.util.StoreClassNameRule$1.evaluate(StoreClassNameRule.java:21)
> [junit] at org.apache.lucene.util.SystemPropertiesInvariantRule$1.evaluate(SystemPropertiesInvariantRule.java:22)
> [junit] Caused by: java.lang.RuntimeException: unclosed IndexInput: _ng.cfs
> [junit] at org.apache.lucene.store.MockDirectoryWrapper.addFileHandle(MockDirectoryWrapper.java:479)
> [junit] at org.apache.lucene.store.MockDirectoryWrapper$1.openSlice(MockDirectoryWrapper.java:777)
> [junit] at org.apache.lucene.store.CompoundFileDirectory.openInput(CompoundFileDirectory.java:221)
> [junit] at org.apache.lucene.codecs.lucene3x.TermInfosReader.<init>(TermInfosReader.java:112)
> [junit] at org.apache.lucene.codecs.lucene3x.Lucene3xFields.<init>(Lucene3xFields.java:84)
> [junit] at org.apache.lucene.codecs.lucene3x.PreFlexRWPostingsFormat$1.<init>(PreFlexRWPostingsFormat.java:51)
> [junit] at org.apache.lucene.codecs.lucene3x.PreFlexRWPostingsFormat.fieldsProducer(PreFlexRWPostingsFormat.java:51)
> [junit] at org.apache.lucene.index.SegmentCoreReaders.<init>(SegmentCoreReaders.java:108)
> [junit] at org.apache.lucene.index.SegmentReader.<init>(SegmentReader.java:51)
> [junit] at org.apache.lucene.index.IndexWriter$ReadersAndLiveDocs.getMergeReader(IndexWriter.java:521)
> [junit] at org.apache.lucene.index.IndexWriter.mergeMiddle(IndexWriter.java:3587)
> [junit] at org.apache.lucene.index.IndexWriter.merge(IndexWriter.java:3257)
> [junit] at org.apache.lucene.index.ConcurrentMergeScheduler.doMerge(ConcurrentMergeScheduler.java:382)
> [junit] at org.apache.lucene.index.ConcurrentMergeScheduler$MergeThread.run(ConcurrentMergeScheduler.java:451)
> [junit]
> [junit]
> [junit] Test org.apache.lucene.index.TestStressNRT FAILED
> {noformat}