Chao Sun created HIVE-15131:
-------------------------------
Summary: Change Parquet reader to read metadata on the task side
Key: HIVE-15131
URL: https://issues.apache.org/jira/browse/HIVE-15131
Project: Hive
Issue Type: Bug
Components: Reader
Reporter: Chao Sun
Assignee: Chao Sun
Currently the {{ParquetRecordReaderWrapper}} still uses the {{readFooter}} API
without filtering, which means it needs to read metadata about all row groups
every time. This could cause issues when the input dataset is particularly big
and has many columns.
[PARQUET-84|https://issues.apache.org/jira/browse/PARQUET-84] introduced
another API that allows row group filtering to be done on the task side. Hive
should adopt this API.
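As a rough illustration of what task-side row group filtering means, the
sketch below keeps only the row groups whose midpoint falls inside the byte
range of the task's split. {{RowGroupInfo}} and {{filterByRange}} are
hypothetical names made up for this example, not parquet-mr or Hive classes,
and the midpoint rule is an assumption about how a range-based filter assigns
row groups to splits.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for one row group entry in a Parquet footer.
// This is NOT a parquet-mr class; it only carries what the sketch needs.
final class RowGroupInfo {
    final long startOffset;   // byte offset where the row group begins
    final long totalByteSize; // size of the row group in bytes

    RowGroupInfo(long startOffset, long totalByteSize) {
        this.startOffset = startOffset;
        this.totalByteSize = totalByteSize;
    }

    // Midpoint of the row group, used to decide which split owns it.
    long midpoint() {
        return startOffset + totalByteSize / 2;
    }
}

final class RowGroupFilterSketch {
    // Assumed rule: a row group belongs to a split when its midpoint lies in
    // [splitStart, splitEnd). Each row group then maps to exactly one split.
    static List<RowGroupInfo> filterByRange(List<RowGroupInfo> all,
                                            long splitStart, long splitEnd) {
        List<RowGroupInfo> kept = new ArrayList<>();
        for (RowGroupInfo rg : all) {
            long mid = rg.midpoint();
            if (mid >= splitStart && mid < splitEnd) {
                kept.add(rg);
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        // Three consecutive 100-byte row groups in one file.
        List<RowGroupInfo> footer = new ArrayList<>();
        footer.add(new RowGroupInfo(0, 100));
        footer.add(new RowGroupInfo(100, 100));
        footer.add(new RowGroupInfo(200, 100));

        // A task whose split covers bytes [0, 150) only needs the row groups
        // selected here, not the metadata of the whole file.
        List<RowGroupInfo> mine = filterByRange(footer, 0, 150);
        System.out.println("row groups for this task: " + mine.size());
    }
}
```

With three 100-byte row groups, a split covering bytes [0, 150) picks up only
the first one and a split covering [150, 300) picks up the other two, so no
task has to materialize metadata for row groups it will never read.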
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)