Eyal Farago created ARROW-7837:
----------------------------------
Summary: bug in BaseVariableWidthVector.copyFromSafe results in
an index out of bounds exception
Key: ARROW-7837
URL: https://issues.apache.org/jira/browse/ARROW-7837
Project: Apache Arrow
Issue Type: Improvement
Components: Java
Affects Versions: 0.15.0
Reporter: Eyal Farago
There's a subtle bug in the copyFromSafe method of BaseVariableWidthVector that
results in an index out of bounds exception.
The issue lies in the interaction between the copyFromSafe and handleSafe
methods. copyFromSafe calls handleSafe to ensure the underlying buffers have
enough capacity before appending a value to the vector. However, handleSafe
wrongly assumes that all 'holes' have been filled when it reads the next write
offset. As a result it reads a stale offset (I believe it's 0 for freshly
allocated buffers, but this may not be guaranteed when a buffer is reused) and
fails to detect that the values buffer needs to be resized.
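To make the failure mode concrete, here is a minimal stand-alone model of the capacity check. It is a sketch, not Arrow's actual code: the class `StaleOffsetModel` and its methods are hypothetical, and a plain zero-filled `int[]` stands in for the offsets buffer. The numbers mirror the reproduction test in this report: four 20-byte values at indices 0 to 3, a 90-byte values buffer, and a fifth write at index 14.

```java
final class StaleOffsetModel {
    // offsets[i] is the start of value i; offsets[i + 1] is its end.
    final int[] offsets;
    final int valuesCapacity; // capacity of the modelled values buffer, in bytes

    StaleOffsetModel(int slots, int valuesCapacity) {
        this.offsets = new int[slots + 1]; // zero-filled, like a fresh buffer
        this.valuesCapacity = valuesCapacity;
    }

    // Record a value of `length` bytes at `index`.
    // Assumes `index` directly follows the previously written slot.
    void append(int index, int length) {
        offsets[index + 1] = offsets[index] + length;
    }

    // Capacity check that trusts offsets[index]. When the slots before
    // `index` were never written, that entry is stale (0 in this model).
    boolean needsResizeStale(int index, int length) {
        return offsets[index] + length > valuesCapacity;
    }

    public static void main(String[] args) {
        StaleOffsetModel m = new StaleOffsetModel(16, 90);
        for (int i = 0; i < 4; i++) {
            m.append(i, 20); // 4 * 20 = 80 bytes of data in use
        }
        // The real write position is 80, so 80 + 20 = 100 > 90 needs a resize,
        // but the check reads offsets[14] == 0 and sees 0 + 20 <= 90.
        System.out.println(m.needsResizeStale(14, 20)); // prints false
        System.out.println(m.offsets[4]);               // prints 80
    }
}
```

In this model the check reports that no resize is needed, while the data would actually be written at offset 80, which matches the `index: 80, length: 20` in the exception.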
The following (Scala) test demonstrates the issue by artificially shrinking
the values buffer. It was written after we hit this in production:
{code:java}
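test("try to reproduce Arrow issue") {
  val charVector = new VarCharVector("stam", Allocator.get)
  val srcCharVector = new VarCharVector("src", Allocator.get)
  // The source vector holds a single 20-byte value at index 0.
  srcCharVector.setSafe(0, Array.tabulate(20)(_.toByte))
  srcCharVector.setValueCount(2)
  // Copy it into slots 0 to 3 of the target: 4 * 20 = 80 bytes of data.
  for (i <- 0 until 4) {
    charVector.copyFromSafe(0, i, srcCharVector)
    charVector.setValueCount(i + 1)
  }
  // Artificially shrink the values buffer so that the next copy needs a resize.
  val valBuff = charVector.getBuffers(false)(2)
  valBuff.capacity(90)
  // Skip slots 4 to 13 and write at index 14; this call throws.
  charVector.copyFromSafe(0, 14, srcCharVector)
  srcCharVector.close()
  charVector.close()
}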
{code}
This test fails with the following exception:
{code:java}
index: 80, length: 20 (expected: range(0, 90))
java.lang.IndexOutOfBoundsException: index: 80, length: 20 (expected: range(0, 90))
    at io.netty.buffer.ArrowBuf.getBytes(ArrowBuf.java:929)
    at org.apache.arrow.vector.BaseVariableWidthVector.copyFromSafe(BaseVariableWidthVector.java:1345)
    at com.datorama.pluto.arrow.ArroStreamSerializationTest.$anonfun$new$33(ArroStreamSerializationTest.scala:454)
    at com.datorama.pluto.arrow.ArroStreamSerializationTest$$Lambda$129.00000000F78CFE20.apply$mcV$sp(Unknown Source)
    at scala.runtime.java8.JFunction0$mcV$sp.apply(JFunction0$mcV$sp.java:12)
    at org.scalatest.OutcomeOf.outcomeOf(OutcomeOf.scala:85)
    at org.scalatest.OutcomeOf.outcomeOf$(OutcomeOf.scala:83)
    at org.scalatest.OutcomeOf$.outcomeOf(OutcomeOf.scala:104)
    at org.scalatest.Transformer.apply(Transformer.scala:22)
    at org.scalatest.Transformer.apply(Transformer.scala:20)
    at org.scalatest.FunSuiteLike$$anon$1.apply(FunSuiteLike.scala:186)
    at org.scalatest.TestSuite.withFixture(TestSuite.scala:196)
    at org.scalatest.TestSuite.withFixture$(TestSuite.scala:195)
    at org.scalatest.FunSuite.withFixture(FunSuite.scala:1560)
    at org.scalatest.FunSuiteLike.invokeWithFixture$1(FunSuiteLike.scala:184)
    at org.scalatest.FunSuiteLike.$anonfun$runTest$1(FunSuiteLike.scala:196)
    at org.scalatest.FunSuiteLike$$Lambda$367.00000000001B9220.apply(Unknown Source)
    at org.scalatest.SuperEngine.runTestImpl(Engine.scala:289)
    at org.scalatest.FunSuiteLike.runTest(FunSuiteLike.scala:196)
    at org.scalatest.FunSuiteLike.runTest$(FunSuiteLike.scala:178)
    at com.datorama.pluto.arrow.ArroStreamSerializationTest.org$scalatest$BeforeAndAfterEachTestData$$super$runTest(ArroStreamSerializationTest.scala:32)
    at org.scalatest.BeforeAndAfterEachTestData.runTest(BeforeAndAfterEachTestData.scala:194)
    at org.scalatest.BeforeAndAfterEachTestData.runTest$(BeforeAndAfterEachTestData.scala:187)
    at com.datorama.pluto.arrow.ArroStreamSerializationTest.runTest(ArroStreamSerializationTest.scala:32)
    at org.scalatest.FunSuiteLike.$anonfun$runTests$1(FunSuiteLike.scala:229)
    at org.scalatest.FunSuiteLike$$Lambda$358.000000001AAC0020.apply(Unknown Source)
    at org.scalatest.SuperEngine.$anonfun$runTestsInBranch$1(Engine.scala:396)
    at org.scalatest.SuperEngine$$Lambda$359.000000001AAC0820.apply(Unknown Source)
    at scala.collection.immutable.List.foreach(List.scala:388)
    at org.scalatest.SuperEngine.traverseSubNodes$1(Engine.scala:384)
    at org.scalatest.SuperEngine.runTestsInBranch(Engine.scala:379)
    at org.scalatest.SuperEngine.runTestsImpl(Engine.scala:461)
    at org.scalatest.FunSuiteLike.runTests(FunSuiteLike.scala:229)
    at org.scalatest.FunSuiteLike.runTests$(FunSuiteLike.scala:228)
    at org.scalatest.FunSuite.runTests(FunSuite.scala:1560)
    at org.scalatest.Suite.run(Suite.scala:1147)
    at org.scalatest.Suite.run$(Suite.scala:1129)
    at org.scalatest.FunSuite.org$scalatest$FunSuiteLike$$super$run(FunSuite.scala:1560)
    at org.scalatest.FunSuiteLike.$anonfun$run$1(FunSuiteLike.scala:233)
    at org.scalatest.FunSuiteLike$$Lambda$352.0000000019149C20.apply(Unknown Source)
    at org.scalatest.SuperEngine.runImpl(Engine.scala:521)
    at org.scalatest.FunSuiteLike.run(FunSuiteLike.scala:233)
    at org.scalatest.FunSuiteLike.run$(FunSuiteLike.scala:232)
    at com.datorama.pluto.arrow.ArroStreamSerializationTest.org$scalatest$BeforeAndAfterAll$$super$run(ArroStreamSerializationTest.scala:32)
    at org.scalatest.BeforeAndAfterAll.liftedTree1$1(BeforeAndAfterAll.scala:213)
    at org.scalatest.BeforeAndAfterAll.run(BeforeAndAfterAll.scala:210)
    at org.scalatest.BeforeAndAfterAll.run$(BeforeAndAfterAll.scala:208)
    at com.datorama.pluto.arrow.ArroStreamSerializationTest.run(ArroStreamSerializationTest.scala:32)
    at org.scalatest.tools.SuiteRunner.run(SuiteRunner.scala:45)
    at org.scalatest.tools.Runner$.$anonfun$doRunRunRunDaDoRunRun$13(Runner.scala:1346)
    at org.scalatest.tools.Runner$.$anonfun$doRunRunRunDaDoRunRun$13$adapted(Runner.scala:1340)
    at org.scalatest.tools.Runner$$$Lambda$164.0000000017957020.apply(Unknown Source)
    at scala.collection.immutable.List.foreach(List.scala:388)
    at org.scalatest.tools.Runner$.doRunRunRunDaDoRunRun(Runner.scala:1340)
    at org.scalatest.tools.Runner$.$anonfun$runOptionallyWithPassFailReporter$24(Runner.scala:1031)
    at org.scalatest.tools.Runner$.$anonfun$runOptionallyWithPassFailReporter$24$adapted(Runner.scala:1010)
    at org.scalatest.tools.Runner$$$Lambda$78.000000001B0D5820.apply(Unknown Source)
    at org.scalatest.tools.Runner$.withClassLoaderAndDispatchReporter(Runner.scala:1506)
    at org.scalatest.tools.Runner$.runOptionallyWithPassFailReporter(Runner.scala:1010)
    at org.scalatest.tools.Runner$.run(Runner.scala:850)
    at org.scalatest.tools.Runner.run(Runner.scala)
    at org.jetbrains.plugins.scala.testingSupport.scalaTest.ScalaTestRunner.runScalaTest2(ScalaTestRunner.java:133)
    at org.jetbrains.plugins.scala.testingSupport.scalaTest.ScalaTestRunner.main(ScalaTestRunner.java:27)
{code}
I believe the root cause of this bug is [this
line|https://github.com/apache/arrow/blob/apache-arrow-0.15.0/java/vector/src/main/java/org/apache/arrow/vector/BaseVariableWidthVector.java#L1237]
in the handleSafe method:
{code:java}
final int startOffset = getStartOffset(index);
{code}
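One possible remedy, sketched here purely as an assumption and not as the fix Arrow should necessarily adopt, is to fill the holes before reading the start offset, so that the entry for `index` reflects the end of the last written value. The class `HoleFillingSketch`, its methods, and the plain `int[]` offsets array are hypothetical stand-ins for the vector's internal state.

```java
final class HoleFillingSketch {
    // Propagate the end offset of the last written value into every
    // unwritten slot before `index`, so offsets[index] is no longer stale.
    // Assumes offsets has at least index + 1 entries.
    // Returns the updated "last set" slot.
    static int fillHoles(int[] offsets, int lastSet, int index) {
        for (int i = lastSet + 1; i < index; i++) {
            offsets[i + 1] = offsets[i]; // empty value: end equals start
        }
        return Math.max(lastSet, index - 1);
    }

    // Capacity check that fills holes first (mutating offsets),
    // then reads the start offset for `index`.
    static boolean needsResize(int[] offsets, int lastSet, int index,
                               int length, int valuesCapacity) {
        fillHoles(offsets, lastSet, index);
        int startOffset = offsets[index];
        return startOffset + length > valuesCapacity;
    }

    public static void main(String[] args) {
        int[] offsets = new int[17];
        for (int i = 0; i < 4; i++) {
            offsets[i + 1] = offsets[i] + 20; // four 20-byte values, slots 0 to 3
        }
        // With holes filled, offsets[14] is 80, and 80 + 20 > 90.
        System.out.println(needsResize(offsets, 3, 14, 20, 90)); // prints true
    }
}
```

An alternative with the same effect would be to compute the start offset from the slot after the last written value instead of from `index`; either way the check no longer depends on unwritten offset entries.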
We've encountered this bug in Dremio's HashJoinOperator, where a loop has two
cases: in one case it appends to one vector, and in the other it appends to
another. When this leaves 'holes' in a vector, the loop ends up calling
copyFromSafe with an index that is several slots away from the last update. In
most cases this goes well, but occasionally (it's quite rare, but it happens)
it misses the need to resize the values buffer.
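The access pattern can be illustrated with a small stand-alone sketch. It is not Dremio's code: the class `InterleavedAppendSketch`, the `goesLeft` flags, and the "left" vector are hypothetical, and the only assumption is that each row writes to exactly one of two vectors at that row's index.

```java
import java.util.ArrayList;
import java.util.List;

final class InterleavedAppendSketch {
    // Each row writes to exactly one of two vectors at the row's index.
    // Returns the indices that end up written in the "left" vector.
    static List<Integer> leftIndices(boolean[] goesLeft) {
        List<Integer> written = new ArrayList<>();
        for (int row = 0; row < goesLeft.length; row++) {
            if (goesLeft[row]) {
                written.add(row);
            }
        }
        return written;
    }

    // Largest run of unwritten slots skipped between two consecutive writes.
    static int largestHole(List<Integer> written) {
        int largest = 0;
        int previous = -1;
        for (int index : written) {
            largest = Math.max(largest, index - previous - 1);
            previous = index;
        }
        return largest;
    }

    public static void main(String[] args) {
        boolean[] goesLeft = new boolean[15];
        for (int i = 0; i < 4; i++) {
            goesLeft[i] = true; // rows 0 to 3 go to the left vector
        }
        goesLeft[14] = true;    // rows 4 to 13 go to the other vector
        List<Integer> written = leftIndices(goesLeft);
        System.out.println(written);              // prints [0, 1, 2, 3, 14]
        System.out.println(largestHole(written)); // prints 10
    }
}
```

Any hole larger than zero means the next copy targets an index whose offset entry was never written, which is the precondition for the missed resize.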
Would you be willing to accept a pull request fixing this issue?
--
This message was sent by Atlassian Jira
(v8.3.4#803005)