GitHub user ala opened a pull request:
https://github.com/apache/spark/pull/19308
[SPARK-22092] Reallocation in OffHeapColumnVector.reserveInternal corrupts array data
## What changes were proposed in this pull request?
`OffHeapColumnVector.reserveInternal()` copies already inserted values
during reallocation only if `data != null`. This is incorrect for vectors
containing arrays, because those vectors do not use the `data` field at all.
We need to check `lengthData` or `offsetData` instead.
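The failure mode can be illustrated with a small toy model. This is only a sketch, not the actual Spark code: the class `ToyArrayVector`, its on-heap `int[]` fields, the `checkDataOnly` flag, and the `reallocate` helper are all hypothetical stand-ins for the off-heap buffers. Only the field names `data`, `lengthData`, and `offsetData` and the method name `reserveInternal` come from the description above.

```java
/** Toy model of an array-typed column vector; NOT Spark's OffHeapColumnVector. */
final class ToyArrayVector {
    int[] data;        // stays null here: array vectors do not use this field
    int[] offsetData;  // per-row start offset of each array
    int[] lengthData;  // per-row length of each array
    int capacity;

    // Hypothetical switch: true reproduces the `data != null` check,
    // false checks the buffers that array vectors actually use.
    private final boolean checkDataOnly;

    ToyArrayVector(int initialCapacity, boolean checkDataOnly) {
        this.checkDataOnly = checkDataOnly;
        reserveInternal(initialCapacity);
    }

    void putArray(int rowId, int offset, int length) {
        offsetData[rowId] = offset;
        lengthData[rowId] = length;
    }

    void reserveInternal(int newCapacity) {
        boolean alreadyAllocated = checkDataOnly
            ? data != null
            : (lengthData != null || offsetData != null);
        // If the vector looks unallocated, nothing is copied over.
        int oldCapacity = alreadyAllocated ? capacity : 0;
        offsetData = reallocate(offsetData, oldCapacity, newCapacity);
        lengthData = reallocate(lengthData, oldCapacity, newCapacity);
        capacity = newCapacity;
    }

    // Allocates a new buffer and copies only the first `oldCapacity` elements.
    private static int[] reallocate(int[] old, int oldCapacity, int newCapacity) {
        int[] fresh = new int[newCapacity];
        if (old != null && oldCapacity > 0) {
            System.arraycopy(old, 0, fresh, 0, Math.min(oldCapacity, newCapacity));
        }
        return fresh;
    }
}
```

In this model, a vector built with `checkDataOnly = true` loses the offset and length stored for row 0 as soon as `reserveInternal` grows it, while a vector built with `checkDataOnly = false` keeps them.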
## How was this patch tested?
Adds a new test to `ColumnVectorSuite` that reproduces the error.
You can merge this pull request into a Git repository by running:
$ git pull https://github.com/ala/spark vector-realloc
Alternatively you can review and apply these changes as the patch at:
https://github.com/apache/spark/pull/19308.patch
To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:
This closes #19308
----
commit 8ce2a68b2039581797f8468759ee459d30a7bee4
Author: Ala Luszczak <[email protected]>
Date: 2017-09-21T16:42:38Z
SPARK-22092
----