Repository: spark
Updated Branches:
  refs/heads/master ee13f3e3d -> 718bbc939

[SPARK-22067][SQL] ArrowWriter should use position when setting UTF8String ByteBuffer

## What changes were proposed in this pull request?

The ArrowWriter StringWriter was setting Arrow data using a position of 0 instead of the actual position in the ByteBuffer. This worked only because of a bug, ARROW-1443, which has been fixed as of Arrow 0.7.0. Testing with that version revealed the error in the ArrowConvertersSuite string conversion test.

## How was this patch tested?

Existing tests, manually verified working with Arrow 0.7.0

Author: Bryan Cutler <cutl...@gmail.com>

Closes #19284 from BryanCutler/arrow-ArrowWriter-StringWriter-position-SPARK-22067.


Project: http://git-wip-us.apache.org/repos/asf/spark/repo
Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/718bbc93
Tree: http://git-wip-us.apache.org/repos/asf/spark/tree/718bbc93
Diff: http://git-wip-us.apache.org/repos/asf/spark/diff/718bbc93

Branch: refs/heads/master
Commit: 718bbc939037929ef5b8f4b4fe10aadfbab4408e
Parents: ee13f3e
Author: Bryan Cutler <cutl...@gmail.com>
Authored: Wed Sep 20 10:51:00 2017 +0900
Committer: Takuya UESHIN <ues...@databricks.com>
Committed: Wed Sep 20 10:51:00 2017 +0900

----------------------------------------------------------------------
 .../scala/org/apache/spark/sql/execution/arrow/ArrowWriter.scala | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)
----------------------------------------------------------------------


http://git-wip-us.apache.org/repos/asf/spark/blob/718bbc93/sql/core/src/main/scala/org/apache/spark/sql/execution/arrow/ArrowWriter.scala
----------------------------------------------------------------------
diff --git a/sql/core/src/main/scala/org/apache/spark/sql/execution/arrow/ArrowWriter.scala b/sql/core/src/main/scala/org/apache/spark/sql/execution/arrow/ArrowWriter.scala
index 11ba04d..0b74073 100644
--- a/sql/core/src/main/scala/org/apache/spark/sql/execution/arrow/ArrowWriter.scala
+++ b/sql/core/src/main/scala/org/apache/spark/sql/execution/arrow/ArrowWriter.scala
@@ -234,8 +234,9 @@ private[arrow] class StringWriter(val valueVector: NullableVarCharVector) extend
 
   override def setValue(input: SpecializedGetters, ordinal: Int): Unit = {
     val utf8 = input.getUTF8String(ordinal)
+    val utf8ByteBuffer = utf8.getByteBuffer
     // todo: for off-heap UTF8String, how to pass in to arrow without copy?
-    valueMutator.setSafe(count, utf8.getByteBuffer, 0, utf8.numBytes())
+    valueMutator.setSafe(count, utf8ByteBuffer, utf8ByteBuffer.position(), utf8.numBytes())
   }
 }
 
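
The fix turns on the difference between absolute index 0 and a ByteBuffer's current position. The following is a minimal sketch using only `java.nio`, not Spark or Arrow code. It assumes the buffer handed to the writer can be a view into a larger backing array, as `ByteBuffer.wrap(array, offset, length)` produces. `copyRange` is a hypothetical stand-in for any consumer that reads `length` bytes starting at an absolute `start` index.

```scala
import java.nio.ByteBuffer
import java.nio.charset.StandardCharsets.UTF_8

object ByteBufferPositionSketch {
  // Hypothetical consumer: copies `length` bytes beginning at the absolute
  // index `start`. Absolute gets neither consult nor move the position.
  def copyRange(buf: ByteBuffer, start: Int, length: Int): String = {
    val out = new Array[Byte](length)
    var i = 0
    while (i < length) {
      out(i) = buf.get(start + i)
      i += 1
    }
    new String(out, UTF_8)
  }

  def main(args: Array[String]): Unit = {
    val backing = "hello world".getBytes(UTF_8)
    // View of the 5 bytes of "world": position = 6, limit = 11.
    val view = ByteBuffer.wrap(backing, 6, 5)

    // Starting at 0 reads from the beginning of the backing array.
    println(copyRange(view, 0, 5))               // prints "hello"
    // Starting at position() reads the bytes the view actually denotes.
    println(copyRange(view, view.position(), 5)) // prints "world"
  }
}
```

When the buffer's position happens to be 0, both calls return the same bytes, so a hard-coded 0 can go unnoticed until the buffer is a slice with a non-zero offset.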