Gabor Szadovszky created PARQUET-1765:
-----------------------------------------
Summary: Invalid filteredRowCount in InternalParquetRecordReader
Key: PARQUET-1765
URL: https://issues.apache.org/jira/browse/PARQUET-1765
Project: Parquet
Issue Type: Bug
Components: parquet-mr
Affects Versions: 1.11.0
Reporter: Gabor Szadovszky
Assignee: Gabor Szadovszky
Fix For: 1.11.1
The [record count|https://github.com/apache/parquet-mr/blob/apache-parquet-1.11.0/parquet-hadoop/src/main/java/org/apache/parquet/hadoop/InternalParquetRecordReader.java#L185]
is retrieved before the [projection schema|https://github.com/apache/parquet-mr/blob/apache-parquet-1.11.0/parquet-hadoop/src/main/java/org/apache/parquet/hadoop/InternalParquetRecordReader.java#L188]
is set, so the value might be invalid if the projection impacts the filter.
In normal cases this does not produce wrong results, because the record filter
still filters correctly. The only effect is that records are filtered
one by one instead of the related pages being dropped.
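
The ordering problem can be illustrated with a small, self-contained toy. This is only a sketch under stated assumptions, not the parquet-mr code: ToyReader, Page, the fixed filter "x > 10" and the rule that a page can only be dropped while the filter column is part of the projection are all hypothetical stand-ins. Only the method names getFilteredRecordCount and setRequestedSchema are borrowed, to mirror the two calls at the linked lines. The toy shows nothing more than that a count read before the projection is set can differ from one read afterwards.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

public class ProjectionOrderSketch {

    /** Hypothetical page summary: row count plus the maximum value of column "x". */
    static final class Page {
        final long rows;
        final int maxX;

        Page(long rows, int maxX) {
            this.rows = rows;
            this.maxX = maxX;
        }
    }

    /** Toy reader, NOT the parquet-mr API. The filter is fixed to "x > 10". */
    static final class ToyReader {
        private final List<Page> pages;
        // Before any projection is set, all columns are visible.
        private Set<String> projected = new HashSet<>(Arrays.asList("x", "y"));

        ToyReader(List<Page> pages) {
            this.pages = pages;
        }

        void setRequestedSchema(Set<String> columns) {
            this.projected = columns;
        }

        long getFilteredRecordCount() {
            // Assumption of this toy: pages can only be dropped while the
            // filter column "x" is part of the projection.
            boolean canPrune = projected.contains("x");
            long count = 0;
            for (Page p : pages) {
                boolean dropped = canPrune && p.maxX <= 10;
                if (!dropped) {
                    count += p.rows;
                }
            }
            return count;
        }
    }

    static ToyReader newReader() {
        return new ToyReader(Arrays.asList(new Page(100, 5), new Page(50, 20)));
    }

    /** Order described in this issue: read the count, then set the projection. */
    static long countThenProject() {
        ToyReader reader = newReader();
        long total = reader.getFilteredRecordCount();
        reader.setRequestedSchema(new HashSet<>(Arrays.asList("y")));
        return total;
    }

    /** Swapped order: set the projection first, then read the count. */
    static long projectThenCount() {
        ToyReader reader = newReader();
        reader.setRequestedSchema(new HashSet<>(Arrays.asList("y")));
        return reader.getFilteredRecordCount();
    }

    public static void main(String[] args) {
        System.out.println("count, then project: " + countThenProject());
        System.out.println("project, then count: " + projectThenCount());
    }
}
```

In this toy the two orders return 50 and 150 rows respectively. The real pruning rules in parquet-mr may differ in direction and detail; the point is only that the count depends on when the projection is applied.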
--
This message was sent by Atlassian Jira
(v8.3.4#803005)