[ 
https://issues.apache.org/jira/browse/ARROW-7837?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17038338#comment-17038338
 ] 

Eyal Farago commented on ARROW-7837:
------------------------------------

[~fan_li_ya], yes it is a real bug, we've hit this in production.
one way toy solve this might be calling fillHoles, another might be considering 
the lastIndexSet member, read the corresponding offset and use that to decide 
if the current write requires a resize of the values buffer.

btw, what do you mean by non-consecutive slots? in out use-case the code 
populating the vector simply skipped null slots (probably relying on fillHoles) 
- is this what you mean?

asking again,
will you accept a pull request with a fix?

> [Java] bug in BaseVariableWidthVector.copyFromSafe results with an index out 
> of bounds exception
> ------------------------------------------------------------------------------------------------
>
>                 Key: ARROW-7837
>                 URL: https://issues.apache.org/jira/browse/ARROW-7837
>             Project: Apache Arrow
>          Issue Type: Bug
>          Components: Java
>    Affects Versions: 0.15.0
>            Reporter: Eyal Farago
>            Priority: Major
>
> There's a subtle bug in the copySafe method of BaseVariableWidthVector that 
> results with an index out of bounds exception.
> The issue is somewhere between the safeCopy and handleSafe methods,
> copySafe calls handleSafe in order to assure underlying buffers capacity 
> before appending a value to the vector, however the handleSafe method falsely 
> assumes all 'holes' have been field when checking the next write offset. as a 
> result it reads a stale offset (I believe it's 0 for freshly allocated 
> buffers but may be un-guaranteed when reusing a buffer) and fails to identify 
> the need to resize the values buffer.
>  
> the following (scala) test demonstrates the issue (by artificially shrinking 
> the values buffer). it was written after we've hit this in production:
> {code:java}
> test("try to reproduce Arrow issue"){
>     val charVector = new VarCharVector("stam", Allocator.get)
>     val srcCharVector = new VarCharVector("src", Allocator.get)
>     srcCharVector.setSafe(0, Array.tabulate(20)(_.toByte))
>     srcCharVector.setValueCount(2)
>     for( i <- 0 until 4){
>       charVector.copyFromSafe(0, i, srcCharVector)
>       charVector.setValueCount(i + 1)
>     }
>     val valBuff = charVector.getBuffers(false)(2)
>     valBuff.capacity(90)
>     charVector.copyFromSafe(0, 14, srcCharVector)
>     srcCharVector.close()
>     charVector.close()
>   }
> {code}
>  this test fails with the following exception:
>  
> {code:java}
> index: 80, length: 20 (expected: range(0, 90))
> java.lang.IndexOutOfBoundsException: index: 80, length: 20 (expected: 
> range(0, 90))
>       at io.netty.buffer.ArrowBuf.getBytes(ArrowBuf.java:929)
>       at 
> org.apache.arrow.vector.BaseVariableWidthVector.copyFromSafe(BaseVariableWidthVector.java:1345)
>       at 
> com.datorama.pluto.arrow.ArroStreamSerializationTest.$anonfun$new$33(ArroStreamSerializationTest.scala:454)
>       at 
> com.datorama.pluto.arrow.ArroStreamSerializationTest$$Lambda$129.00000000F78CFE20.apply$mcV$sp(Unknown
>  Source)
>       at 
> scala.runtime.java8.JFunction0$mcV$sp.apply(JFunction0$mcV$sp.java:12)
>       at org.scalatest.OutcomeOf.outcomeOf(OutcomeOf.scala:85)
>       at org.scalatest.OutcomeOf.outcomeOf$(OutcomeOf.scala:83)
>       at org.scalatest.OutcomeOf$.outcomeOf(OutcomeOf.scala:104)
>       at org.scalatest.Transformer.apply(Transformer.scala:22)
>       at org.scalatest.Transformer.apply(Transformer.scala:20)
>       at org.scalatest.FunSuiteLike$$anon$1.apply(FunSuiteLike.scala:186)
>       at org.scalatest.TestSuite.withFixture(TestSuite.scala:196)
>       at org.scalatest.TestSuite.withFixture$(TestSuite.scala:195)
>       at org.scalatest.FunSuite.withFixture(FunSuite.scala:1560)
>       at 
> org.scalatest.FunSuiteLike.invokeWithFixture$1(FunSuiteLike.scala:184)
>       at org.scalatest.FunSuiteLike.$anonfun$runTest$1(FunSuiteLike.scala:196)
>       at 
> org.scalatest.FunSuiteLike$$Lambda$367.00000000001B9220.apply(Unknown Source)
>       at org.scalatest.SuperEngine.runTestImpl(Engine.scala:289)
>       at org.scalatest.FunSuiteLike.runTest(FunSuiteLike.scala:196)
>       at org.scalatest.FunSuiteLike.runTest$(FunSuiteLike.scala:178)
>       at 
> com.datorama.pluto.arrow.ArroStreamSerializationTest.org$scalatest$BeforeAndAfterEachTestData$$super$runTest(ArroStreamSerializationTest.scala:32)
>       at 
> org.scalatest.BeforeAndAfterEachTestData.runTest(BeforeAndAfterEachTestData.scala:194)
>       at 
> org.scalatest.BeforeAndAfterEachTestData.runTest$(BeforeAndAfterEachTestData.scala:187)
>       at 
> com.datorama.pluto.arrow.ArroStreamSerializationTest.runTest(ArroStreamSerializationTest.scala:32)
>       at 
> org.scalatest.FunSuiteLike.$anonfun$runTests$1(FunSuiteLike.scala:229)
>       at 
> org.scalatest.FunSuiteLike$$Lambda$358.000000001AAC0020.apply(Unknown Source)
>       at 
> org.scalatest.SuperEngine.$anonfun$runTestsInBranch$1(Engine.scala:396)
>       at org.scalatest.SuperEngine$$Lambda$359.000000001AAC0820.apply(Unknown 
> Source)
>       at scala.collection.immutable.List.foreach(List.scala:388)
>       at org.scalatest.SuperEngine.traverseSubNodes$1(Engine.scala:384)
>       at org.scalatest.SuperEngine.runTestsInBranch(Engine.scala:379)
>       at org.scalatest.SuperEngine.runTestsImpl(Engine.scala:461)
>       at org.scalatest.FunSuiteLike.runTests(FunSuiteLike.scala:229)
>       at org.scalatest.FunSuiteLike.runTests$(FunSuiteLike.scala:228)
>       at org.scalatest.FunSuite.runTests(FunSuite.scala:1560)
>       at org.scalatest.Suite.run(Suite.scala:1147)
>       at org.scalatest.Suite.run$(Suite.scala:1129)
>       at 
> org.scalatest.FunSuite.org$scalatest$FunSuiteLike$$super$run(FunSuite.scala:1560)
>       at org.scalatest.FunSuiteLike.$anonfun$run$1(FunSuiteLike.scala:233)
>       at 
> org.scalatest.FunSuiteLike$$Lambda$352.0000000019149C20.apply(Unknown Source)
>       at org.scalatest.SuperEngine.runImpl(Engine.scala:521)
>       at org.scalatest.FunSuiteLike.run(FunSuiteLike.scala:233)
>       at org.scalatest.FunSuiteLike.run$(FunSuiteLike.scala:232)
>       at 
> com.datorama.pluto.arrow.ArroStreamSerializationTest.org$scalatest$BeforeAndAfterAll$$super$run(ArroStreamSerializationTest.scala:32)
>       at 
> org.scalatest.BeforeAndAfterAll.liftedTree1$1(BeforeAndAfterAll.scala:213)
>       at org.scalatest.BeforeAndAfterAll.run(BeforeAndAfterAll.scala:210)
>       at org.scalatest.BeforeAndAfterAll.run$(BeforeAndAfterAll.scala:208)
>       at 
> com.datorama.pluto.arrow.ArroStreamSerializationTest.run(ArroStreamSerializationTest.scala:32)
>       at org.scalatest.tools.SuiteRunner.run(SuiteRunner.scala:45)
>       at 
> org.scalatest.tools.Runner$.$anonfun$doRunRunRunDaDoRunRun$13(Runner.scala:1346)
>       at 
> org.scalatest.tools.Runner$.$anonfun$doRunRunRunDaDoRunRun$13$adapted(Runner.scala:1340)
>       at 
> org.scalatest.tools.Runner$$$Lambda$164.0000000017957020.apply(Unknown Source)
>       at scala.collection.immutable.List.foreach(List.scala:388)
>       at org.scalatest.tools.Runner$.doRunRunRunDaDoRunRun(Runner.scala:1340)
>       at 
> org.scalatest.tools.Runner$.$anonfun$runOptionallyWithPassFailReporter$24(Runner.scala:1031)
>       at 
> org.scalatest.tools.Runner$.$anonfun$runOptionallyWithPassFailReporter$24$adapted(Runner.scala:1010)
>       at 
> org.scalatest.tools.Runner$$$Lambda$78.000000001B0D5820.apply(Unknown Source)
>       at 
> org.scalatest.tools.Runner$.withClassLoaderAndDispatchReporter(Runner.scala:1506)
>       at 
> org.scalatest.tools.Runner$.runOptionallyWithPassFailReporter(Runner.scala:1010)
>       at org.scalatest.tools.Runner$.run(Runner.scala:850)
>       at org.scalatest.tools.Runner.run(Runner.scala)
>       at 
> org.jetbrains.plugins.scala.testingSupport.scalaTest.ScalaTestRunner.runScalaTest2(ScalaTestRunner.java:133)
>       at 
> org.jetbrains.plugins.scala.testingSupport.scalaTest.ScalaTestRunner.main(ScalaTestRunner.java:27)
> {code}
> I believe the root cause for this bugs is in [this 
> line|https://github.com/apache/arrow/blob/apache-arrow-0.15.0/java/vector/src/main/java/org/apache/arrow/vector/BaseVariableWidthVector.java#L1237]
>  in the handleSafe method:
> {code:java}
> final int startOffset = getStartOffset(index);
> {code}
> we've encountered this bug in dremio's HashJoinOperator, where a loop has two 
> cases: in one case it appends to one vector and in the other case it appends 
> to another, when there are 'holes' in this loop it ends up calling copySafe 
> with an index which is several slots away from the last update, in most cases 
> this goes well but it occasionally (quite rare, but happens) misses the need 
> to resize the values buffer.
> will you be willing to accept a pull request fixing this issue?



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

Reply via email to