finnroblin opened a new pull request, #15553: URL: https://github.com/apache/lucene/pull/15553
### Description Fixes MergedByteVectorValues behavior as described in #14992. Currently in `MergedByteVectorValues::nextDoc()` the `lastOrd` value is not incremented when the iterator is advanced. If `nextDoc` is called several times and a vector is loaded then the bad state of `lastOrd` causes an exception. One case where this occurs is In OpenSearch k-NN where we split a list of vectors into multiple parts to upload the partitions in parallel. (Please see https://github.com/opensearch-project/k-NN/issues/2803 for more details about the use case this bugfix solves). This PR includes a unit test that fails without the bugfix. Thanks @0ctopus13prime for the original RFC and bugfix! Failed test output, pre-bug fix: (use the first commit in this PR to replicate. The second commit contains bugfix.) ``` TestMergedByteVectorValues > testSkipThenLoadByteVectorDuringMerge FAILED java.lang.AssertionError: Test failed during merge at __randomizedtesting.SeedInfo.seed([CDF9B6BFD3014176:37C8D3B5F864CA9F]:0) at org.apache.lucene.codecs.TestMergedByteVectorValues.testSkipThenLoadByteVectorDuringMerge(TestMergedByteVectorValues.java:201) at java.base/jdk.internal.reflect.DirectMethodHandleAccessor.invoke(DirectMethodHandleAccessor.java:104) at java.base/java.lang.reflect.Method.invoke(Method.java:565) at com.carrotsearch.randomizedtesting.RandomizedRunner.invoke(RandomizedRunner.java:1763) at com.carrotsearch.randomizedtesting.RandomizedRunner$8.evaluate(RandomizedRunner.java:946) at com.carrotsearch.randomizedtesting.RandomizedRunner$9.evaluate(RandomizedRunner.java:982) at com.carrotsearch.randomizedtesting.RandomizedRunner$10.evaluate(RandomizedRunner.java:996) at org.apache.lucene.tests.util.TestRuleSetupTeardownChained$1.evaluate(TestRuleSetupTeardownChained.java:48) at org.apache.lucene.tests.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:43) at org.apache.lucene.tests.util.TestRuleThreadAndTestName$1.evaluate(TestRuleThreadAndTestName.java:45) at org.apache.lucene.tests.util.TestRuleIgnoreAfterMaxFailures$1.evaluate(TestRuleIgnoreAfterMaxFailures.java:60) at org.apache.lucene.tests.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:44) at org.junit.rules.RunRules.evaluate(RunRules.java:20) at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36) at com.carrotsearch.randomizedtesting.ThreadLeakControl$StatementRunner.run(ThreadLeakControl.java:390) at com.carrotsearch.randomizedtesting.ThreadLeakControl.forkTimeoutingTask(ThreadLeakControl.java:843) at com.carrotsearch.randomizedtesting.ThreadLeakControl$3.evaluate(ThreadLeakControl.java:490) at com.carrotsearch.randomizedtesting.RandomizedRunner.runSingleTest(RandomizedRunner.java:955) at com.carrotsearch.randomizedtesting.RandomizedRunner$5.evaluate(RandomizedRunner.java:840) at com.carrotsearch.randomizedtesting.RandomizedRunner$6.evaluate(RandomizedRunner.java:891) at com.carrotsearch.randomizedtesting.RandomizedRunner$7.evaluate(RandomizedRunner.java:902) at org.apache.lucene.tests.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:43) at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36) at org.apache.lucene.tests.util.TestRuleStoreClassName$1.evaluate(TestRuleStoreClassName.java:38) at com.carrotsearch.randomizedtesting.rules.NoShadowingOrOverridesOnMethodsRule$1.evaluate(NoShadowingOrOverridesOnMethodsRule.java:40) at com.carrotsearch.randomizedtesting.rules.NoShadowingOrOverridesOnMethodsRule$1.evaluate(NoShadowingOrOverridesOnMethodsRule.java:40) at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36) at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36) at org.apache.lucene.tests.util.TestRuleAssertionsRequired$1.evaluate(TestRuleAssertionsRequired.java:52) at org.apache.lucene.tests.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:43) at org.apache.lucene.tests.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:44) at org.apache.lucene.tests.util.TestRuleIgnoreAfterMaxFailures$1.evaluate(TestRuleIgnoreAfterMaxFailures.java:60) at org.apache.lucene.tests.util.TestRuleIgnoreTestSuites$1.evaluate(TestRuleIgnoreTestSuites.java:47) at org.junit.rules.RunRules.evaluate(RunRules.java:20) at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36) at com.carrotsearch.randomizedtesting.ThreadLeakControl$StatementRunner.run(ThreadLeakControl.java:390) at com.carrotsearch.randomizedtesting.ThreadLeakControl.lambda$forkTimeoutingTask$0(ThreadLeakControl.java:850) at java.base/java.lang.Thread.run(Thread.java:1474) Caused by: java.lang.IllegalStateException: only supports forward iteration: ord=3, lastOrd=-1 at org.apache.lucene.codecs.KnnVectorsWriter$MergedVectorValues$MergedByteVectorValues.vectorValue(KnnVectorsWriter.java:418) at org.apache.lucene.codecs.TestMergedByteVectorValues$1$1.mergeOneField(TestMergedByteVectorValues.java:137) at org.apache.lucene.codecs.perfield.PerFieldKnnVectorsFormat$FieldsWriter.mergeOneField(PerFieldKnnVectorsFormat.java:128) at org.apache.lucene.codecs.KnnVectorsWriter.merge(KnnVectorsWriter.java:105) at org.apache.lucene.index.SegmentMerger.mergeVectorValues(SegmentMerger.java:272) at org.apache.lucene.index.SegmentMerger.mergeWithLogging(SegmentMerger.java:315) at org.apache.lucene.index.SegmentMerger.merge(SegmentMerger.java:159) at org.apache.lucene.index.IndexWriter.addIndexesReaderMerge(IndexWriter.java:3467) at org.apache.lucene.index.IndexWriter$AddIndexesMergeSource.merge(IndexWriter.java:3349) at org.apache.lucene.index.ConcurrentMergeScheduler.doMerge(ConcurrentMergeScheduler.java:661) at org.apache.lucene.index.ConcurrentMergeScheduler$MergeThread.run(ConcurrentMergeScheduler.java:720) :lucene:core:test (FAILURE): 1 test, 1 failure 1 test completed, 1 failed > Task :checkAnyTestIncludedAfterFiltering > Task :lucene:core:wipeTaskTemp UP-TO-DATE ERROR: 1 test has failed: - org.apache.lucene.codecs.TestMergedByteVectorValues.testSkipThenLoadByteVectorDuringMerge (:lucene:core) Test output: /Users/finnrobl/Documents/lucene/lucene/core/build/test-results/test/outputs/OUTPUT-org.apache.lucene.codecs.TestMergedByteVectorValues.txt Reproduce with: gradlew :lucene:core:test --tests "org.apache.lucene.codecs.TestMergedByteVectorValues.testSkipThenLoadByteVectorDuringMerge" -Ptests.asserts=true -Ptests.file.encoding=UTF-8 -Ptests.gui=false "-Ptests.jvmargs=-XX:TieredStopAtLevel=1 -XX:+UseParallelGC -XX:ActiveProcessorCount=1" -Ptests.jvms=6 -Ptests.seed=CDF9B6BFD3014176 -Ptests.vectorsize=default ``` -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: [email protected] For queries about this service, please contact Infrastructure at: [email protected] --------------------------------------------------------------------- To unsubscribe, e-mail: [email protected] For additional commands, e-mail: [email protected]
