[ https://issues.apache.org/jira/browse/SPARK-20783?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Kazuaki Ishizaki updated SPARK-20783:
-------------------------------------
    Description:
The current {{ColumnVector}} handles uncompressed data for Parquet.
To handle the table cache, this JIRA entry adds an {{OnHeapCachedBatch}} class that holds compressed data.
As a first step of this implementation, this JIRA entry supports primitive data and string types.

  was:
The current {{ColumnVector}} accepts only a primitive-type Java array as input for an array. This works well for holding data read from Parquet. On the other hand, inside Spark, {{UnsafeArrayData}} is frequently used to represent arrays, maps, and structs. To hold such data, this JIRA entry enhances {{ColumnVector}} to hold {{UnsafeArrayData}}.

> Enhance ColumnVector to support compressed representation
> ---------------------------------------------------------
>
>                 Key: SPARK-20783
>                 URL: https://issues.apache.org/jira/browse/SPARK-20783
>             Project: Spark
>          Issue Type: Sub-task
>          Components: SQL
>    Affects Versions: 2.3.0
>            Reporter: Kazuaki Ishizaki
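
As a rough illustration of what a compressed column representation with per-row access can look like, here is a minimal Java sketch using run-length encoding. The class name {{RunLengthIntColumn}} and its methods are hypothetical; they are not taken from Spark or from the patch for this issue, and the issue does not state which compression schemes {{OnHeapCachedBatch}} uses.

```java
import java.util.ArrayList;
import java.util.List;

/**
 * Hypothetical illustration only; not part of Spark's ColumnVector API.
 * Stores an int column as runs of equal values and answers getInt(rowId)
 * without decompressing the whole column.
 */
public class RunLengthIntColumn {
    private final int[] values;   // one value per run
    private final int[] runEnds;  // exclusive end row index of each run
    private final int numRows;

    public RunLengthIntColumn(int[] data) {
        List<Integer> vals = new ArrayList<>();
        List<Integer> ends = new ArrayList<>();
        for (int i = 0; i < data.length; i++) {
            if (i == 0 || data[i] != data[i - 1]) {
                // A new run starts at row i.
                vals.add(data[i]);
                ends.add(i + 1);
            } else {
                // Extend the current run by one row.
                ends.set(ends.size() - 1, i + 1);
            }
        }
        this.values = vals.stream().mapToInt(Integer::intValue).toArray();
        this.runEnds = ends.stream().mapToInt(Integer::intValue).toArray();
        this.numRows = data.length;
    }

    public int numRows() {
        return numRows;
    }

    public int numRuns() {
        return values.length;
    }

    /** Returns the value at rowId by binary-searching the run boundaries. */
    public int getInt(int rowId) {
        if (rowId < 0 || rowId >= numRows) {
            throw new IndexOutOfBoundsException("rowId: " + rowId);
        }
        int lo = 0;
        int hi = runEnds.length - 1;
        while (lo < hi) {
            int mid = (lo + hi) >>> 1;
            if (runEnds[mid] > rowId) {
                hi = mid;      // rowId falls in this run or an earlier one
            } else {
                lo = mid + 1;  // rowId falls in a later run
            }
        }
        return values[lo];
    }
}
```

With this layout, a column such as {{7, 7, 7, 3, 3, 9}} is held as three runs rather than six values, and a reader can still request any row by index. Other schemes (dictionary encoding, for example) would trade off space and lookup cost differently.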