[ https://issues.apache.org/jira/browse/LUCENE-9477?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Simon Willnauer resolved LUCENE-9477.
-------------------------------------
    Fix Version/s: 8.7
                   master (9.0)
       Resolution: Fixed

> IndexWriter might leave broken segments file behind on exception during rollback
> --------------------------------------------------------------------------------
>
>                 Key: LUCENE-9477
>                 URL: https://issues.apache.org/jira/browse/LUCENE-9477
>             Project: Lucene - Core
>          Issue Type: Bug
>            Reporter: Simon Willnauer
>            Priority: Major
>             Fix For: master (9.0), 8.7
>
>          Time Spent: 0.5h
>  Remaining Estimate: 0h
>
> Mike ran some beasty tests while I was working on LUCENE-8962. This test
> caused some headaches since it only rarely also fails on master:
> {noformat}
> org.apache.lucene.index.TestIndexWriterOnVMError > testUnknownError FAILED
>     org.apache.lucene.index.CorruptIndexException: Unexpected file read error while reading index. (resource=BufferedChecksumIndexInput(MockIndexInputWrapper((clone of) ByteBuffersIndexInput (file=pending_segments_2, buffers=258 bytes, block size: 1, blocks: 1, position: 0))))
>         at __randomizedtesting.SeedInfo.seed([587A104EFE0C57E1:B32CCFCEFC8BC1D1]:0)
>         at org.apache.lucene.index.SegmentInfos.readCommit(SegmentInfos.java:300)
>         at org.apache.lucene.index.CheckIndex.checkIndex(CheckIndex.java:521)
>         at org.apache.lucene.util.TestUtil.checkIndex(TestUtil.java:301)
>         at org.apache.lucene.store.MockDirectoryWrapper.close(MockDirectoryWrapper.java:836)
>         at org.apache.lucene.index.TestIndexWriterOnVMError.doTest(TestIndexWriterOnVMError.java:89)
>         at org.apache.lucene.index.TestIndexWriterOnVMError.testUnknownError(TestIndexWriterOnVMError.java:251)
>         at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>         at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>         at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>         at java.base/java.lang.reflect.Method.invoke(Method.java:566)
>         at com.carrotsearch.randomizedtesting.RandomizedRunner.invoke(RandomizedRunner.java:1754)
>         at com.carrotsearch.randomizedtesting.RandomizedRunner$8.evaluate(RandomizedRunner.java:942)
>         at com.carrotsearch.randomizedtesting.RandomizedRunner$9.evaluate(RandomizedRunner.java:978)
>         at com.carrotsearch.randomizedtesting.RandomizedRunner$10.evaluate(RandomizedRunner.java:992)
>         at org.apache.lucene.util.TestRuleSetupTeardownChained$1.evaluate(TestRuleSetupTeardownChained.java:49)
>         at org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:45)
>         at org.apache.lucene.util.TestRuleThreadAndTestName$1.evaluate(TestRuleThreadAndTestName.java:48)
>         at org.apache.lucene.util.TestRuleIgnoreAfterMaxFailures$1.evaluate(TestRuleIgnoreAfterMaxFailures.java:64)
>         at org.apache.lucene.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:47)
>         at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
>         at com.carrotsearch.randomizedtesting.ThreadLeakControl$StatementRunner.run(ThreadLeakControl.java:370)
>         at com.carrotsearch.randomizedtesting.ThreadLeakControl.forkTimeoutingTask(ThreadLeakControl.java:819)
>         at com.carrotsearch.randomizedtesting.ThreadLeakControl$3.evaluate(ThreadLeakControl.java:470)
>         at com.carrotsearch.randomizedtesting.RandomizedRunner.runSingleTest(RandomizedRunner.java:951)
>         at com.carrotsearch.randomizedtesting.RandomizedRunner$5.evaluate(RandomizedRunner.java:836)
>         at com.carrotsearch.randomizedtesting.RandomizedRunner$6.evaluate(RandomizedRunner.java:887)
>         at com.carrotsearch.randomizedtesting.RandomizedRunner$7.evaluate(RandomizedRunner.java:898)
>         at org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:45)
>         at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
>         at org.apache.lucene.util.TestRuleStoreClassName$1.evaluate(TestRuleStoreClassName.java:41)
>         at com.carrotsearch.randomizedtesting.rules.NoShadowingOrOverridesOnMethodsRule$1.evaluate(NoShadowingOrOverridesOnMethodsRule.java:40)
>         at com.carrotsearch.randomizedtesting.rules.NoShadowingOrOverridesOnMethodsRule$1.evaluate(NoShadowingOrOverridesOnMethodsRule.java:40)
>         at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
>         at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
>         at org.apache.lucene.util.TestRuleAssertionsRequired$1.evaluate(TestRuleAssertionsRequired.java:53)
>         at org.apache.lucene.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:47)
>         at org.apache.lucene.util.TestRuleIgnoreAfterMaxFailures$1.evaluate(TestRuleIgnoreAfterMaxFailures.java:64)
>         at org.apache.lucene.util.TestRuleIgnoreTestSuites$1.evaluate(TestRuleIgnoreTestSuites.java:54)
>         at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
>         at com.carrotsearch.randomizedtesting.ThreadLeakControl$StatementRunner.run(ThreadLeakControl.java:370)
>         at com.carrotsearch.randomizedtesting.ThreadLeakControl.lambda$forkTimeoutingTask$0(ThreadLeakControl.java:826)
>         at java.base/java.lang.Thread.run(Thread.java:834)
>
>         Caused by:
>         java.io.FileNotFoundException: _0.si in dir=ByteBuffersDirectory@1bae3fe1 lockFactory=org.apache.lucene.store.SingleInstanceLockFactory@38275f41
>             at org.apache.lucene.store.MockDirectoryWrapper.openInput(MockDirectoryWrapper.java:748)
>             at org.apache.lucene.store.Directory.openChecksumInput(Directory.java:157)
>             at org.apache.lucene.store.MockDirectoryWrapper.openChecksumInput(MockDirectoryWrapper.java:1044)
>             at org.apache.lucene.codecs.lucene86.Lucene86SegmentInfoFormat.read(Lucene86SegmentInfoFormat.java:91)
>             at org.apache.lucene.index.SegmentInfos.readCommit(SegmentInfos.java:364)
>             at org.apache.lucene.index.SegmentInfos.readCommit(SegmentInfos.java:298)
>             ... 41 more
> ....
>   2> NOTE: reproduce with: ant test -Dtestcase=TestIndexWriterOnVMError -Dtests.method=testUnknownError -Dtests.seed=587A104EFE0C57E1 -Dtests.nightly=true -Dtests.slow=true -Dtests.badapples=true -Dtests.linedocsfile=/l/simon/lucene/test-framework/src/resources/org/apache/lucene/util/2000mb.txt.gz -Dtests.locale=zh-CN -Dtests.timezone=SystemV/MST7MDT -Dtests.asserts=true -Dtests.file.encoding=UTF-8
>   2> NOTE: leaving temporary files on disk at: /l/simon/lucene/core/build/tmp/tests-tmp/lucene.index.TestIndexWriterOnVMError_587A104EFE0C57E1-003
>   2> NOTE: test params are: codec=Asserting(Lucene86): {text_payloads=BlockTreeOrds(blocksize=128), text_vectors=PostingsFormat(name=Asserting), text1=PostingsFormat(name=Asserting), id=BlockTreeOrds(blocksize=128)}, docValues:{dv3=DocValuesFormat(name=Lucene80), dv2=DocValuesFormat(name=Asserting), dv5=DocValuesFormat(name=Lucene80), dv=DocValuesFormat(name=Asserting), dv4=DocValuesFormat(name=Asserting)}, maxPointsInLeafNode=696, maxMBSortInHeap=6.040673619645681, sim=Asserting(RandomSimilarity(queryNorm=false): {text_payloads=IB SPL-DZ(0.3), text_vectors=DFR I(ne)L3(800.0), text1=org.apache.lucene.search.similarities.BooleanSimilarity@6f4329a1}), locale=zh-CN, timezone=SystemV/MST7MDT
>   2> NOTE: Linux 5.5.6-arch1-1 amd64/Oracle Corporation 11.0.6 (64-bit)/cpus=128,threads=1,free=241525696,total=268435456
>   2> NOTE: All tests run in this JVM: [TestIndexWriterOnVMError]
> {noformat}
> The test reproduces on master also without the huge line docs file using this:
> {noformat}
> ant test -Dtestcase=TestIndexWriterOnVMError -Dtests.method=testUnknownError -Dtests.seed=587A104EFE0C57E1 -Dtests.nightly=true -Dtests.slow=true -Dtests.badapples=true -Dtests.locale=zh-CN -Dtests.timezone=SystemV/MST7MDT -Dtests.asserts=true -Dtests.file.encoding=UTF-8
> {noformat}
> the reason is that we
> fail to delete the already renamed pending segments file when the metadata
> sync on the directory fails. The subsequent rollback also crashes while it is
> trying to delete unreferenced files, which causes subsequent CheckIndex calls
> to fail with FileNotFoundException, since the commit was written but not
> fully removed.
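
The failure mode described above can be sketched roughly as follows. This is only an illustration under stated assumptions, not Lucene's actual code: `CommitSketch`, `MetaDataSync`, and `finishCommit` are hypothetical names, plain `java.nio.file` calls stand in for Lucene's Directory API, and the generation is appended as a decimal number for simplicity. It shows the cleanup the description implies: once pending_segments_N has been renamed to segments_N, a failing directory metadata sync should presumably remove the renamed file again, so that no commit pointing at missing segment files is left behind.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

/** Hypothetical illustration of the rename-then-sync step; not Lucene's real classes. */
public class CommitSketch {

  /** Hypothetical stand-in for the directory metadata sync, which may fail. */
  public interface MetaDataSync {
    void sync(Path dir) throws IOException;
  }

  /**
   * Publishes pending_segments_N as segments_N. If the metadata sync fails
   * after the rename, the already renamed file is deleted again so that a
   * later reader does not find a half-published commit.
   */
  public static void finishCommit(Path dir, long gen, MetaDataSync sync) throws IOException {
    Path pending = dir.resolve("pending_segments_" + gen);
    Path committed = dir.resolve("segments_" + gen);

    // Step 1: the rename makes the commit visible under its final name.
    Files.move(pending, committed, StandardCopyOption.ATOMIC_MOVE);

    try {
      // Step 2: make the rename durable; this is the call that failed in the test.
      sync.sync(dir);
    } catch (Throwable t) {
      // Cleanup: remove the renamed file, keeping the original failure as the primary error.
      try {
        Files.deleteIfExists(committed);
      } catch (Throwable suppressed) {
        t.addSuppressed(suppressed);
      }
      throw t;
    }
  }
}
```

A caller could simulate the failing sync with `CommitSketch.finishCommit(dir, 2, d -> { throw new IOException("simulated"); })` and then check that `segments_2` no longer exists in `dir`.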