[ 
https://issues.apache.org/jira/browse/SPARK-20783?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kazuaki Ishizaki updated SPARK-20783:
-------------------------------------
    Description: 
The current {{ColumnVector}} handles uncompressed data for Parquet.

To handle the table cache, this JIRA entry adds an {{OnHeapCachedBatch}} class 
that holds compressed data.
As the first step of this implementation, this JIRA entry supports primitive 
and string types.
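
To illustrate what a compressed representation of a primitive column can look like, here is a minimal sketch that uses run-length encoding. It is an assumption made for illustration only: the class {{RunLengthIntColumn}} and its methods are hypothetical, are not part of Spark, and do not describe how {{OnHeapCachedBatch}} is implemented or which compression scheme it uses.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/**
 * Hypothetical illustration only: a run-length-encoded int column.
 * This is NOT Spark's ColumnVector or OnHeapCachedBatch.
 */
final class RunLengthIntColumn {
    private final int[] values;   // the value stored by each run
    private final int[] runEnds;  // exclusive end row index of each run, strictly increasing

    private RunLengthIntColumn(int[] values, int[] runEnds) {
        this.values = values;
        this.runEnds = runEnds;
    }

    /** Compresses consecutive equal values into (value, end index) runs. */
    static RunLengthIntColumn encode(int[] rows) {
        List<Integer> vals = new ArrayList<>();
        List<Integer> ends = new ArrayList<>();
        for (int i = 0; i < rows.length; i++) {
            boolean lastOfRun = (i + 1 == rows.length) || rows[i + 1] != rows[i];
            if (lastOfRun) {
                vals.add(rows[i]);
                ends.add(i + 1);
            }
        }
        return new RunLengthIntColumn(
            vals.stream().mapToInt(Integer::intValue).toArray(),
            ends.stream().mapToInt(Integer::intValue).toArray());
    }

    /** Number of logical rows represented by the column. */
    int numRows() {
        return runEnds.length == 0 ? 0 : runEnds[runEnds.length - 1];
    }

    /** Number of runs actually stored. */
    int numRuns() {
        return values.length;
    }

    /** Reads one row without decompressing the whole column. */
    int getInt(int rowId) {
        if (rowId < 0 || rowId >= numRows()) {
            throw new IndexOutOfBoundsException("rowId: " + rowId);
        }
        // Find the first run whose exclusive end index is greater than rowId.
        int pos = Arrays.binarySearch(runEnds, rowId + 1);
        int run = pos >= 0 ? pos : -pos - 1;
        return values[run];
    }
}
```

In this sketch, a reader calls {{getInt(rowId)}} in the same way it would on an uncompressed column, while the storage holds one entry per run rather than one per row. A random read costs a binary search over the runs, which is the kind of trade-off a compressed column has to accept.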


  was:
The current {{ColumnVector}} accepts only a primitive-type Java array as input 
for an array. This works well for holding data from Parquet.

On the other hand, Spark internally uses {{UnsafeArrayData}} frequently to 
represent arrays, maps, and structs. To hold such data, this JIRA entry enhances 
{{ColumnVector}} to hold {{UnsafeArrayData}}.


> Enhance ColumnVector to support compressed representation
> ---------------------------------------------------------
>
>                 Key: SPARK-20783
>                 URL: https://issues.apache.org/jira/browse/SPARK-20783
>             Project: Spark
>          Issue Type: Sub-task
>          Components: SQL
>    Affects Versions: 2.3.0
>            Reporter: Kazuaki Ishizaki
>
> The current {{ColumnVector}} handles uncompressed data for Parquet.
> To handle the table cache, this JIRA entry adds an {{OnHeapCachedBatch}} class 
> that holds compressed data.
> As the first step of this implementation, this JIRA entry supports primitive 
> and string types.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

