finnroblin opened a new pull request, #15553:
URL: https://github.com/apache/lucene/pull/15553

   ### Description
   
   Fixes MergedByteVectorValues behavior as described in #14992. Currently in 
`MergedByteVectorValues::nextDoc()` the `lastOrd` value is not incremented when 
the iterator is advanced. If `nextDoc` is called several times and a vector is 
loaded then the bad state of `lastOrd` causes an exception. One case where this 
occurs is In OpenSearch k-NN where we split a list of vectors into multiple 
parts to upload the partitions in parallel. (Please see 
https://github.com/opensearch-project/k-NN/issues/2803 for more details about 
the use case this bugfix solves).
   
   This PR includes a unit test that fails without the bugfix.
   
   Thanks @0ctopus13prime for the original RFC and bugfix! 
   
   Failed test output, pre-bug fix: (use the first commit in this PR to 
replicate. The second commit contains bugfix.)
   ```
   TestMergedByteVectorValues > testSkipThenLoadByteVectorDuringMerge FAILED
       java.lang.AssertionError: Test failed during merge
           at 
__randomizedtesting.SeedInfo.seed([CDF9B6BFD3014176:37C8D3B5F864CA9F]:0)
           at 
org.apache.lucene.codecs.TestMergedByteVectorValues.testSkipThenLoadByteVectorDuringMerge(TestMergedByteVectorValues.java:201)
           at 
java.base/jdk.internal.reflect.DirectMethodHandleAccessor.invoke(DirectMethodHandleAccessor.java:104)
           at java.base/java.lang.reflect.Method.invoke(Method.java:565)
           at 
com.carrotsearch.randomizedtesting.RandomizedRunner.invoke(RandomizedRunner.java:1763)
           at 
com.carrotsearch.randomizedtesting.RandomizedRunner$8.evaluate(RandomizedRunner.java:946)
           at 
com.carrotsearch.randomizedtesting.RandomizedRunner$9.evaluate(RandomizedRunner.java:982)
           at 
com.carrotsearch.randomizedtesting.RandomizedRunner$10.evaluate(RandomizedRunner.java:996)
           at 
org.apache.lucene.tests.util.TestRuleSetupTeardownChained$1.evaluate(TestRuleSetupTeardownChained.java:48)
           at 
org.apache.lucene.tests.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:43)
           at 
org.apache.lucene.tests.util.TestRuleThreadAndTestName$1.evaluate(TestRuleThreadAndTestName.java:45)
           at 
org.apache.lucene.tests.util.TestRuleIgnoreAfterMaxFailures$1.evaluate(TestRuleIgnoreAfterMaxFailures.java:60)
           at 
org.apache.lucene.tests.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:44)
           at org.junit.rules.RunRules.evaluate(RunRules.java:20)
           at 
com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
           at 
com.carrotsearch.randomizedtesting.ThreadLeakControl$StatementRunner.run(ThreadLeakControl.java:390)
           at 
com.carrotsearch.randomizedtesting.ThreadLeakControl.forkTimeoutingTask(ThreadLeakControl.java:843)
           at 
com.carrotsearch.randomizedtesting.ThreadLeakControl$3.evaluate(ThreadLeakControl.java:490)
           at 
com.carrotsearch.randomizedtesting.RandomizedRunner.runSingleTest(RandomizedRunner.java:955)
           at 
com.carrotsearch.randomizedtesting.RandomizedRunner$5.evaluate(RandomizedRunner.java:840)
           at 
com.carrotsearch.randomizedtesting.RandomizedRunner$6.evaluate(RandomizedRunner.java:891)
           at 
com.carrotsearch.randomizedtesting.RandomizedRunner$7.evaluate(RandomizedRunner.java:902)
           at 
org.apache.lucene.tests.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:43)
           at 
com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
           at 
org.apache.lucene.tests.util.TestRuleStoreClassName$1.evaluate(TestRuleStoreClassName.java:38)
           at 
com.carrotsearch.randomizedtesting.rules.NoShadowingOrOverridesOnMethodsRule$1.evaluate(NoShadowingOrOverridesOnMethodsRule.java:40)
           at 
com.carrotsearch.randomizedtesting.rules.NoShadowingOrOverridesOnMethodsRule$1.evaluate(NoShadowingOrOverridesOnMethodsRule.java:40)
           at 
com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
           at 
com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
           at 
org.apache.lucene.tests.util.TestRuleAssertionsRequired$1.evaluate(TestRuleAssertionsRequired.java:52)
           at 
org.apache.lucene.tests.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:43)
           at 
org.apache.lucene.tests.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:44)
           at 
org.apache.lucene.tests.util.TestRuleIgnoreAfterMaxFailures$1.evaluate(TestRuleIgnoreAfterMaxFailures.java:60)
           at 
org.apache.lucene.tests.util.TestRuleIgnoreTestSuites$1.evaluate(TestRuleIgnoreTestSuites.java:47)
           at org.junit.rules.RunRules.evaluate(RunRules.java:20)
           at 
com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
           at 
com.carrotsearch.randomizedtesting.ThreadLeakControl$StatementRunner.run(ThreadLeakControl.java:390)
           at 
com.carrotsearch.randomizedtesting.ThreadLeakControl.lambda$forkTimeoutingTask$0(ThreadLeakControl.java:850)
           at java.base/java.lang.Thread.run(Thread.java:1474)
   
           Caused by:
           java.lang.IllegalStateException: only supports forward iteration: 
ord=3, lastOrd=-1
               at 
org.apache.lucene.codecs.KnnVectorsWriter$MergedVectorValues$MergedByteVectorValues.vectorValue(KnnVectorsWriter.java:418)
               at 
org.apache.lucene.codecs.TestMergedByteVectorValues$1$1.mergeOneField(TestMergedByteVectorValues.java:137)
               at 
org.apache.lucene.codecs.perfield.PerFieldKnnVectorsFormat$FieldsWriter.mergeOneField(PerFieldKnnVectorsFormat.java:128)
               at 
org.apache.lucene.codecs.KnnVectorsWriter.merge(KnnVectorsWriter.java:105)
               at 
org.apache.lucene.index.SegmentMerger.mergeVectorValues(SegmentMerger.java:272)
               at 
org.apache.lucene.index.SegmentMerger.mergeWithLogging(SegmentMerger.java:315)
               at 
org.apache.lucene.index.SegmentMerger.merge(SegmentMerger.java:159)
               at 
org.apache.lucene.index.IndexWriter.addIndexesReaderMerge(IndexWriter.java:3467)
               at 
org.apache.lucene.index.IndexWriter$AddIndexesMergeSource.merge(IndexWriter.java:3349)
               at 
org.apache.lucene.index.ConcurrentMergeScheduler.doMerge(ConcurrentMergeScheduler.java:661)
               at 
org.apache.lucene.index.ConcurrentMergeScheduler$MergeThread.run(ConcurrentMergeScheduler.java:720)
   
   
   :lucene:core:test (FAILURE): 1 test, 1  failure
   
   1 test completed, 1 failed
   
   > Task :checkAnyTestIncludedAfterFiltering
   > Task :lucene:core:wipeTaskTemp UP-TO-DATE
   
   ERROR: 1 test has failed:
   
     - 
org.apache.lucene.codecs.TestMergedByteVectorValues.testSkipThenLoadByteVectorDuringMerge
 (:lucene:core)
       Test output: 
/Users/finnrobl/Documents/lucene/lucene/core/build/test-results/test/outputs/OUTPUT-org.apache.lucene.codecs.TestMergedByteVectorValues.txt
       Reproduce with: gradlew :lucene:core:test --tests 
"org.apache.lucene.codecs.TestMergedByteVectorValues.testSkipThenLoadByteVectorDuringMerge"
 -Ptests.asserts=true -Ptests.file.encoding=UTF-8 -Ptests.gui=false 
"-Ptests.jvmargs=-XX:TieredStopAtLevel=1 -XX:+UseParallelGC 
-XX:ActiveProcessorCount=1" -Ptests.jvms=6 -Ptests.seed=CDF9B6BFD3014176 
-Ptests.vectorsize=default
   ```
   
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to