ASF GitHub Bot commented on HIVE-14815:

GitHub user winningsix opened a pull request:


    HIVE-14815: Support vectorization for Parquet

    This patch includes the following changes:
    1. Implement a vectorized page reader that supports dictionary and RLE encodings.
    2. Enable vectorization for the Parquet input format.
    3. Support several data types.
    This is a WIP jira.

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/winningsix/hive vectorization_parquet

Alternatively you can review and apply these changes as the patch at:


To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #104

commit a38c766e09bc1c3728fa413767b9fbaa19a4b005
Author: Ferdinand Xu <cheng.a...@intel.com>
Date:   2016-09-01T22:15:31Z

    HIVE-14815: Support vectorization for Parquet


> Support vectorization for Parquet
> ---------------------------------
>                 Key: HIVE-14815
>                 URL: https://issues.apache.org/jira/browse/HIVE-14815
>             Project: Hive
>          Issue Type: Bug
>            Reporter: Ferdinand Xu
>            Assignee: Ferdinand Xu
> Parquet doesn't provide a vectorized reader that Hive can use directly. 
> In addition, a decimal column batch consists of HiveDecimal values, a Hive 
> type that Parquet does not know about. To support the Hive vectorized 
> execution engine, we have to implement the vectorized Parquet reader on the 
> Hive side. To limit the performance impact, we need to implement a 
> page-level vectorized reader.
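
To illustrate what a page-level vectorized reader for dictionary and RLE
data might look like, here is a minimal sketch. It is not the code from the
pull request. The class name RleDictionaryPageReader and the method
readBatch are hypothetical, and the page is modeled as plain arrays of
(dictionary id, run length) pairs rather than Parquet's actual RLE/bit-packed
hybrid encoding. The idea shown is that each call fills a whole batch array
instead of returning one value at a time.

```java
import java.util.Arrays;

/**
 * Hypothetical sketch: decodes run-length-encoded dictionary ids into
 * fixed-size batches of long values. Not Hive's or Parquet's real API.
 */
final class RleDictionaryPageReader {
    private final long[] dictionary; // dictionary id -> value
    private final int[] runIds;      // dictionary id of each run
    private final int[] runLengths;  // number of repeats in each run
    private int run = 0;             // index of the run being consumed
    private int usedInRun = 0;       // values already taken from that run

    RleDictionaryPageReader(long[] dictionary, int[] runIds, int[] runLengths) {
        if (runIds.length != runLengths.length) {
            throw new IllegalArgumentException("runIds and runLengths must match");
        }
        this.dictionary = dictionary;
        this.runIds = runIds;
        this.runLengths = runLengths;
    }

    /** Fills out[0..n) with decoded values and returns n (0 when the page is exhausted). */
    int readBatch(long[] out) {
        int n = 0;
        while (n < out.length && run < runIds.length) {
            int remaining = runLengths[run] - usedInRun;
            int take = Math.min(remaining, out.length - n);
            // One dictionary lookup per run, then a bulk fill of the batch.
            Arrays.fill(out, n, n + take, dictionary[runIds[run]]);
            n += take;
            usedInRun += take;
            if (usedInRun == runLengths[run]) { // run fully consumed
                run++;
                usedInRun = 0;
            }
        }
        return n;
    }
}
```

A caller would allocate one batch array and call readBatch repeatedly until
it returns 0. A real reader would also have to handle null values, other
column types such as decimals, and data that spans multiple pages.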

This message was sent by Atlassian JIRA
