[ https://issues.apache.org/jira/browse/HIVE-10252?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Dong Chen updated HIVE-10252:
-----------------------------
Parent Issue: HIVE-10666 (was: HIVE-8120)
> Make PPD work for Parquet in row group level
> --------------------------------------------
>
> Key: HIVE-10252
> URL: https://issues.apache.org/jira/browse/HIVE-10252
> Project: Hive
> Issue Type: Sub-task
> Reporter: Dong Chen
> Assignee: Dong Chen
> Fix For: 1.2.0
>
> Attachments: HIVE-10252.patch
>
>
> In Hive, predicate pushdown (PPD) extracts the search condition from the HQL
> query, serializes it, and pushes it to the file format. ORC can use the
> predicate to filter stripes. Similarly, Parquet should use the statistics
> saved in each row group to filter out row groups that cannot match. But
> this does not work.
> In {{ParquetRecordReaderWrapper}}, Hive gets splits with all row groups
> (client side) and pushes the filter to Parquet for further processing
> (Parquet side). But in {{ParquetRecordReader.initializeInternalReader()}},
> if the splits have already been selected on the client side, Parquet does
> not apply the filter again.
> We should make the behavior consistent in Hive. Maybe we could get the
> splits, filter them, and then pass them to Parquet. This means using the
> client-side strategy.
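To illustrate the client-side strategy proposed above, here is a minimal, self-contained sketch of pruning row groups by their min/max statistics before handing them to the reader. It is not taken from the attached patch and does not use the Hive or Parquet APIs: the class RowGroupFilterSketch, the RowGroupStats type, and the single "column > threshold" predicate are all hypothetical stand-ins chosen for illustration.

```java
import java.util.ArrayList;
import java.util.List;

public class RowGroupFilterSketch {

    /** Hypothetical stand-in for the statistics of one column in one row group. */
    static final class RowGroupStats {
        final int index; // position of the row group within the file
        final long min;  // smallest value of the column in this row group
        final long max;  // largest value of the column in this row group

        RowGroupStats(int index, long min, long max) {
            this.index = index;
            this.min = min;
            this.max = max;
        }
    }

    /**
     * Returns true if some row in the group could satisfy "column > threshold".
     * A group whose max is not above the threshold cannot contain a match.
     */
    static boolean mightMatchGreaterThan(RowGroupStats stats, long threshold) {
        return stats.max > threshold;
    }

    /** Client-side step: keep only the row groups that might contain matches. */
    static List<Integer> selectRowGroups(List<RowGroupStats> groups, long threshold) {
        List<Integer> kept = new ArrayList<>();
        for (RowGroupStats g : groups) {
            if (mightMatchGreaterThan(g, threshold)) {
                kept.add(g.index);
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        List<RowGroupStats> groups = new ArrayList<>();
        groups.add(new RowGroupStats(0, 1, 10));
        groups.add(new RowGroupStats(1, 11, 20));
        groups.add(new RowGroupStats(2, 21, 30));

        // Predicate "column > 15": row group 0 (max 10) is skipped.
        System.out.println(selectRowGroups(groups, 15)); // prints [1, 2]
    }
}
```

Only the surviving row groups would then be passed on in the split, so the reader never has to re-apply the filter at row group level.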