[ https://issues.apache.org/jira/browse/HIVE-14815?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15512442#comment-15512442 ]
ASF GitHub Bot commented on HIVE-14815:
---------------------------------------

GitHub user winningsix opened a pull request:

    https://github.com/apache/hive/pull/104

    HIVE-14815: Support vectorization for Parquet

    This patch includes the following changes:
    1. Implement a vectorized page reader that supports dictionary and RLE encoding.
    2. Enable vectorization for the Parquet input format.
    3. Support several data types.

    This is a WIP jira.

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/winningsix/hive vectorization_parquet

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/hive/pull/104.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #104

----
commit a38c766e09bc1c3728fa413767b9fbaa19a4b005
Author: Ferdinand Xu <cheng.a...@intel.com>
Date:   2016-09-01T22:15:31Z

    HIVE-14815: Support vectorization for Parquet

----

> Support vectorization for Parquet
> ---------------------------------
>
>                 Key: HIVE-14815
>                 URL: https://issues.apache.org/jira/browse/HIVE-14815
>             Project: Hive
>          Issue Type: Bug
>            Reporter: Ferdinand Xu
>            Assignee: Ferdinand Xu
>
> Parquet doesn't provide a vectorized reader that Hive can use directly.
> In addition, a decimal column batch consists of HiveDecimal values, and
> HiveDecimal is a Hive type that is unknown to Parquet. To support the Hive
> vectorized execution engine, we have to implement the vectorized Parquet
> reader on the Hive side. To limit the performance impact, we need to
> implement a page-level vectorized reader.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
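
For readers unfamiliar with what a page-level vectorized reader for dictionary and RLE encoded data does, here is a minimal sketch of the idea. It is not the code from the pull request: the class name DictionaryRlePageSketch and the method decodePage are hypothetical, the page is modeled as plain (runLength, dictionaryId) pairs rather than Parquet's actual RLE/bit-packing hybrid, and a bare long[] stands in for a Hive column vector.

```java
import java.util.Arrays;

// Hypothetical sketch only: these are not Hive's or Parquet's reader classes.
final class DictionaryRlePageSketch {

    /**
     * Decodes one page of dictionary ids, given as (runLength, dictionaryId)
     * pairs, into a flat batch of values.
     *
     * Assumption: runs are plain repeats. Real Parquet pages mix RLE runs
     * with bit-packed groups, which this sketch does not model.
     */
    static long[] decodePage(long[] dictionary, int[][] runs) {
        // First pass: count how many values the page holds.
        int total = 0;
        for (int[] run : runs) {
            total += run[0];
        }

        // Second pass: fill the batch one run at a time instead of
        // materializing values row by row.
        long[] vector = new long[total];
        int pos = 0;
        for (int[] run : runs) {
            int runLength = run[0];
            long value = dictionary[run[1]]; // one dictionary lookup per run
            Arrays.fill(vector, pos, pos + runLength, value);
            pos += runLength;
        }
        return vector;
    }
}
```

With a dictionary of {10, 20, 30} and the runs (3, 1), (2, 0), (1, 2), decodePage returns a six-element batch: three copies of 20, two of 10, and one 30. The point of working per page and per run is that each dictionary entry is looked up once per run rather than once per row.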