Chao Sun created HIVE-15055:
-------------------------------

             Summary: Column pruning for nested fields in Parquet
                 Key: HIVE-15055
                 URL: https://issues.apache.org/jira/browse/HIVE-15055
             Project: Hive
          Issue Type: Improvement
          Components: Logical Optimizer, Physical Optimizer
            Reporter: Chao Sun
            Assignee: Chao Sun


Some columnar file formats, such as Parquet, also store the fields of a struct 
column by column, using the encoding described in the Google Dremel paper. It is 
very common in big data for data to be stored in structs while queries need only 
a subset of the fields in those structs. However, Hive currently reads the whole 
struct regardless of whether all of its fields are selected. Pruning unwanted 
sub-fields of structs (nested fields) at file-read time would therefore be a big 
performance boost in such scenarios.
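
As a rough illustration of the idea (not Hive's or Parquet's actual 
implementation), the sketch below prunes a nested schema down to the fields a 
query selects. The dict-based schema representation, the dotted path syntax, 
and the name prune_schema are all assumptions made for this example only.

```python
import copy


def prune_schema(schema, paths):
    """Return a new schema keeping only the fields named by dotted paths.

    `schema` is a hypothetical representation: a dict mapping field names to
    either a type name (a leaf) or another dict (a struct).
    """
    pruned = {}
    for path in paths:
        src, dst = schema, pruned
        parts = path.split(".")
        for i, name in enumerate(parts):
            if not isinstance(src, dict) or name not in src:
                raise KeyError(path)
            child = src[name]
            if i == len(parts) - 1:
                # Selecting a struct itself keeps all of its sub-fields.
                dst[name] = copy.deepcopy(child)
            else:
                # Descend, creating the intermediate struct on demand.
                dst = dst.setdefault(name, {})
                src = child
    return pruned


schema = {
    "id": "int",
    "user": {
        "name": "string",
        "age": "int",
        "address": {"city": "string", "zip": "string"},
    },
}

# A query such as SELECT id, user.address.city only needs two leaf columns.
requested = prune_schema(schema, ["id", "user.address.city"])
print(requested)
```

With a pruned schema like this, a reader would only need to touch the column 
chunks for id and user.address.city rather than every leaf under user.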




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
