GitHub user cloud-fan opened a pull request:

    https://github.com/apache/spark/pull/20395

    [SPARK-23218][SQL] simplify ColumnVector.getArray

    ## What changes were proposed in this pull request?
    
    `ColumnVector` is very flexible about how array types are implemented. As a 
result, `ColumnVector` has 3 abstract methods for array types: `arrayData`, 
`getArrayOffset`, and `getArrayLength`. For example, in `WritableColumnVector` we 
use the first child vector as the array data vector, and store offsets and 
lengths in 2 arrays in the parent vector. `ArrowColumnVector` has a different 
implementation.
    
    This PR simplifies `ColumnVector` by using only one abstract method for array 
types: `getArray`.
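    
    As a rough illustration of the single-method shape, here is a minimal Java 
sketch. `SimpleColumnVector`, `IntArrayView`, and `OffsetLengthVector` are 
hypothetical names invented for this example and are not Spark classes; only 
the idea of keeping offsets and lengths in the parent next to a child data 
vector is taken from the description above, and the real signatures in the 
patch may differ.

```java
import java.util.Arrays;

// Hypothetical, simplified stand-ins for illustration only; not Spark's API.

/** Read-only view over a slice of an int data vector. */
final class IntArrayView {
    private final int[] data;
    private final int offset;
    private final int length;

    IntArrayView(int[] data, int offset, int length) {
        if (offset < 0 || length < 0 || offset + length > data.length) {
            throw new IllegalArgumentException("slice out of bounds");
        }
        this.data = data;
        this.offset = offset;
        this.length = length;
    }

    int numElements() {
        return length;
    }

    int getInt(int i) {
        if (i < 0 || i >= length) {
            throw new IndexOutOfBoundsException("index " + i);
        }
        return data[offset + i];
    }

    /** Copies the slice into a fresh array. */
    int[] toIntArray() {
        return Arrays.copyOfRange(data, offset, offset + length);
    }
}

/** A single abstract accessor replaces separate data/offset/length methods. */
abstract class SimpleColumnVector {
    abstract IntArrayView getArray(int rowId);
}

/** One possible layout: a child data vector plus per-row offsets and lengths. */
final class OffsetLengthVector extends SimpleColumnVector {
    private final int[] childData;
    private final int[] offsets;
    private final int[] lengths;

    OffsetLengthVector(int[] childData, int[] offsets, int[] lengths) {
        if (offsets.length != lengths.length) {
            throw new IllegalArgumentException("offsets and lengths must match");
        }
        this.childData = childData;
        this.offsets = offsets;
        this.lengths = lengths;
    }

    @Override
    IntArrayView getArray(int rowId) {
        // The storage layout stays private to this implementation.
        return new IntArrayView(childData, offsets[rowId], lengths[rowId]);
    }
}
```

    In this sketch a caller only ever asks for `getArray(rowId)`, so another 
implementation is free to store its arrays differently without exposing 
offsets or lengths.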
    
    ## How was this patch tested?
    
    Existing tests.

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/cloud-fan/spark vector

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/spark/pull/20395.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #20395
    
----
commit f3ca6c73e86928d0a087fbfd36de968ae873bbe3
Author: Wenchen Fan <wenchen@...>
Date:   2018-01-25T14:30:57Z

    simplify ColumnVector.getArray

----

