[ 
https://issues.apache.org/jira/browse/ARROW-7837?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eyal Farago updated ARROW-7837:
-------------------------------
    Issue Type: Bug  (was: Improvement)

> [Java] bug in BaseVariableWidthVector.copyFromSafe results with an index out 
> of bounds exception
> ------------------------------------------------------------------------------------------------
>
>                 Key: ARROW-7837
>                 URL: https://issues.apache.org/jira/browse/ARROW-7837
>             Project: Apache Arrow
>          Issue Type: Bug
>          Components: Java
>    Affects Versions: 0.15.0
>            Reporter: Eyal Farago
>            Priority: Major
>
> There's a subtle bug in the copySafe method of BaseVariableWidthVector that 
> results with an index out of bounds exception.
> The issue is somewhere between the safeCopy and handleSafe methods,
> copySafe calls handleSafe in order to assure underlying buffers capacity 
> before appending a value to the vector, however the handleSafe method falsely 
> assumes all 'holes' have been field when checking the next write offset. as a 
> result it reads a stale offset (I believe it's 0 for freshly allocated 
> buffers but may be un-guaranteed when reusing a buffer) and fails to identify 
> the need to resize the values buffer.
>  
> the following (scala) test demonstrates the issue (by artificially shrinking 
> the values buffer). it was written after we've hit this in production:
> {code:java}
> test("try to reproduce Arrow issue"){
>     val charVector = new VarCharVector("stam", Allocator.get)
>     val srcCharVector = new VarCharVector("src", Allocator.get)
>     srcCharVector.setSafe(0, Array.tabulate(20)(_.toByte))
>     srcCharVector.setValueCount(2)
>     for( i <- 0 until 4){
>       charVector.copyFromSafe(0, i, srcCharVector)
>       charVector.setValueCount(i + 1)
>     }
>     val valBuff = charVector.getBuffers(false)(2)
>     valBuff.capacity(90)
>     charVector.copyFromSafe(0, 14, srcCharVector)
>     srcCharVector.close()
>     charVector.close()
>   }
> {code}
>  this test fails with the following exception:
>  
> {code:java}
> index: 80, length: 20 (expected: range(0, 90))
> java.lang.IndexOutOfBoundsException: index: 80, length: 20 (expected: 
> range(0, 90))
>       at io.netty.buffer.ArrowBuf.getBytes(ArrowBuf.java:929)
>       at 
> org.apache.arrow.vector.BaseVariableWidthVector.copyFromSafe(BaseVariableWidthVector.java:1345)
>       at 
> com.datorama.pluto.arrow.ArroStreamSerializationTest.$anonfun$new$33(ArroStreamSerializationTest.scala:454)
>       at 
> com.datorama.pluto.arrow.ArroStreamSerializationTest$$Lambda$129.00000000F78CFE20.apply$mcV$sp(Unknown
>  Source)
>       at 
> scala.runtime.java8.JFunction0$mcV$sp.apply(JFunction0$mcV$sp.java:12)
>       at org.scalatest.OutcomeOf.outcomeOf(OutcomeOf.scala:85)
>       at org.scalatest.OutcomeOf.outcomeOf$(OutcomeOf.scala:83)
>       at org.scalatest.OutcomeOf$.outcomeOf(OutcomeOf.scala:104)
>       at org.scalatest.Transformer.apply(Transformer.scala:22)
>       at org.scalatest.Transformer.apply(Transformer.scala:20)
>       at org.scalatest.FunSuiteLike$$anon$1.apply(FunSuiteLike.scala:186)
>       at org.scalatest.TestSuite.withFixture(TestSuite.scala:196)
>       at org.scalatest.TestSuite.withFixture$(TestSuite.scala:195)
>       at org.scalatest.FunSuite.withFixture(FunSuite.scala:1560)
>       at 
> org.scalatest.FunSuiteLike.invokeWithFixture$1(FunSuiteLike.scala:184)
>       at org.scalatest.FunSuiteLike.$anonfun$runTest$1(FunSuiteLike.scala:196)
>       at 
> org.scalatest.FunSuiteLike$$Lambda$367.00000000001B9220.apply(Unknown Source)
>       at org.scalatest.SuperEngine.runTestImpl(Engine.scala:289)
>       at org.scalatest.FunSuiteLike.runTest(FunSuiteLike.scala:196)
>       at org.scalatest.FunSuiteLike.runTest$(FunSuiteLike.scala:178)
>       at 
> com.datorama.pluto.arrow.ArroStreamSerializationTest.org$scalatest$BeforeAndAfterEachTestData$$super$runTest(ArroStreamSerializationTest.scala:32)
>       at 
> org.scalatest.BeforeAndAfterEachTestData.runTest(BeforeAndAfterEachTestData.scala:194)
>       at 
> org.scalatest.BeforeAndAfterEachTestData.runTest$(BeforeAndAfterEachTestData.scala:187)
>       at 
> com.datorama.pluto.arrow.ArroStreamSerializationTest.runTest(ArroStreamSerializationTest.scala:32)
>       at 
> org.scalatest.FunSuiteLike.$anonfun$runTests$1(FunSuiteLike.scala:229)
>       at 
> org.scalatest.FunSuiteLike$$Lambda$358.000000001AAC0020.apply(Unknown Source)
>       at 
> org.scalatest.SuperEngine.$anonfun$runTestsInBranch$1(Engine.scala:396)
>       at org.scalatest.SuperEngine$$Lambda$359.000000001AAC0820.apply(Unknown 
> Source)
>       at scala.collection.immutable.List.foreach(List.scala:388)
>       at org.scalatest.SuperEngine.traverseSubNodes$1(Engine.scala:384)
>       at org.scalatest.SuperEngine.runTestsInBranch(Engine.scala:379)
>       at org.scalatest.SuperEngine.runTestsImpl(Engine.scala:461)
>       at org.scalatest.FunSuiteLike.runTests(FunSuiteLike.scala:229)
>       at org.scalatest.FunSuiteLike.runTests$(FunSuiteLike.scala:228)
>       at org.scalatest.FunSuite.runTests(FunSuite.scala:1560)
>       at org.scalatest.Suite.run(Suite.scala:1147)
>       at org.scalatest.Suite.run$(Suite.scala:1129)
>       at 
> org.scalatest.FunSuite.org$scalatest$FunSuiteLike$$super$run(FunSuite.scala:1560)
>       at org.scalatest.FunSuiteLike.$anonfun$run$1(FunSuiteLike.scala:233)
>       at 
> org.scalatest.FunSuiteLike$$Lambda$352.0000000019149C20.apply(Unknown Source)
>       at org.scalatest.SuperEngine.runImpl(Engine.scala:521)
>       at org.scalatest.FunSuiteLike.run(FunSuiteLike.scala:233)
>       at org.scalatest.FunSuiteLike.run$(FunSuiteLike.scala:232)
>       at 
> com.datorama.pluto.arrow.ArroStreamSerializationTest.org$scalatest$BeforeAndAfterAll$$super$run(ArroStreamSerializationTest.scala:32)
>       at 
> org.scalatest.BeforeAndAfterAll.liftedTree1$1(BeforeAndAfterAll.scala:213)
>       at org.scalatest.BeforeAndAfterAll.run(BeforeAndAfterAll.scala:210)
>       at org.scalatest.BeforeAndAfterAll.run$(BeforeAndAfterAll.scala:208)
>       at 
> com.datorama.pluto.arrow.ArroStreamSerializationTest.run(ArroStreamSerializationTest.scala:32)
>       at org.scalatest.tools.SuiteRunner.run(SuiteRunner.scala:45)
>       at 
> org.scalatest.tools.Runner$.$anonfun$doRunRunRunDaDoRunRun$13(Runner.scala:1346)
>       at 
> org.scalatest.tools.Runner$.$anonfun$doRunRunRunDaDoRunRun$13$adapted(Runner.scala:1340)
>       at 
> org.scalatest.tools.Runner$$$Lambda$164.0000000017957020.apply(Unknown Source)
>       at scala.collection.immutable.List.foreach(List.scala:388)
>       at org.scalatest.tools.Runner$.doRunRunRunDaDoRunRun(Runner.scala:1340)
>       at 
> org.scalatest.tools.Runner$.$anonfun$runOptionallyWithPassFailReporter$24(Runner.scala:1031)
>       at 
> org.scalatest.tools.Runner$.$anonfun$runOptionallyWithPassFailReporter$24$adapted(Runner.scala:1010)
>       at 
> org.scalatest.tools.Runner$$$Lambda$78.000000001B0D5820.apply(Unknown Source)
>       at 
> org.scalatest.tools.Runner$.withClassLoaderAndDispatchReporter(Runner.scala:1506)
>       at 
> org.scalatest.tools.Runner$.runOptionallyWithPassFailReporter(Runner.scala:1010)
>       at org.scalatest.tools.Runner$.run(Runner.scala:850)
>       at org.scalatest.tools.Runner.run(Runner.scala)
>       at 
> org.jetbrains.plugins.scala.testingSupport.scalaTest.ScalaTestRunner.runScalaTest2(ScalaTestRunner.java:133)
>       at 
> org.jetbrains.plugins.scala.testingSupport.scalaTest.ScalaTestRunner.main(ScalaTestRunner.java:27)
> {code}
> I believe the root cause for this bugs is in [this 
> line|https://github.com/apache/arrow/blob/apache-arrow-0.15.0/java/vector/src/main/java/org/apache/arrow/vector/BaseVariableWidthVector.java#L1237]
>  in the handleSafe method:
> {code:java}
> final int startOffset = getStartOffset(index);
> {code}
> we've encountered this bug in dremio's HashJoinOperator, where a loop has two 
> cases: in one case it appends to one vector and in the other case it appends 
> to another, when there are 'holes' in this loop it ends up calling copySafe 
> with an index which is several slots away from the last update, in most cases 
> this goes well but it occasionally (quite rare, but happens) misses the need 
> to resize the values buffer.
> will you be willing to accept a pull request fixing this issue?



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

Reply via email to