GitHub user cloud-fan opened a pull request:
https://github.com/apache/spark/pull/20395
[SPARK-23218][SQL] simplify ColumnVector.getArray
## What changes were proposed in this pull request?
`ColumnVector` is very flexible about how the array type is implemented. As a
result, `ColumnVector` has 3 abstract methods for the array type: `arrayData`,
`getArrayOffset`, and `getArrayLength`. For example, in `WritableColumnVector` we
use the first child vector as the array data vector, and store offsets and
lengths in 2 arrays in the parent vector. `ArrowColumnVector` has a different
implementation.
This PR simplifies `ColumnVector` by using only one abstract method for the
array type: `getArray`.
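
To illustrate the idea, here is a minimal, self-contained sketch of why a single abstract `getArray` can be enough. It is not the actual Spark code: `SimpleVector`, `IntArrayView`, `OffsetLengthVector`, and `EndOffsetVector` are hypothetical, simplified stand-ins, and the two layouts are assumptions chosen only to show that each implementation can keep its own storage details private.

```java
import java.util.Arrays;

// Hypothetical, simplified stand-ins; these are NOT the real Spark classes.

/** Read-only view of one array value, backed by a slice of a shared buffer. */
final class IntArrayView {
    private final int[] data;
    private final int offset;
    private final int length;

    IntArrayView(int[] data, int offset, int length) {
        this.data = data;
        this.offset = offset;
        this.length = length;
    }

    int length() { return length; }

    int get(int i) { return data[offset + i]; }

    /** Copies the slice into a fresh array. */
    int[] toIntArray() { return Arrays.copyOfRange(data, offset, offset + length); }
}

/** The only array-related abstract method is getArray. */
abstract class SimpleVector {
    abstract IntArrayView getArray(int rowId);
}

/** Layout A (assumed): flattened elements plus per-row offsets and lengths. */
final class OffsetLengthVector extends SimpleVector {
    private final int[] elements;
    private final int[] offsets;
    private final int[] lengths;

    OffsetLengthVector(int[] elements, int[] offsets, int[] lengths) {
        this.elements = elements;
        this.offsets = offsets;
        this.lengths = lengths;
    }

    @Override
    IntArrayView getArray(int rowId) {
        return new IntArrayView(elements, offsets[rowId], lengths[rowId]);
    }
}

/** Layout B (assumed): flattened elements plus exclusive end offsets only. */
final class EndOffsetVector extends SimpleVector {
    private final int[] elements;
    private final int[] ends;

    EndOffsetVector(int[] elements, int[] ends) {
        this.elements = elements;
        this.ends = ends;
    }

    @Override
    IntArrayView getArray(int rowId) {
        // A row starts where the previous row ended.
        int start = rowId == 0 ? 0 : ends[rowId - 1];
        return new IntArrayView(elements, start, ends[rowId] - start);
    }
}
```

In this sketch, callers depend only on `getArray(rowId)`, so neither layout has to expose a data vector, an offset, or a length through the base class.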
## How was this patch tested?
Existing tests.
You can merge this pull request into a Git repository by running:
$ git pull https://github.com/cloud-fan/spark vector
Alternatively you can review and apply these changes as the patch at:
https://github.com/apache/spark/pull/20395.patch
To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:
This closes #20395
----
commit f3ca6c73e86928d0a087fbfd36de968ae873bbe3
Author: Wenchen Fan <wenchen@...>
Date: 2018-01-25T14:30:57Z
simplify ColumnVector.getArray
----