Hi Hyukjin,
I think the code you're looking for is generated by parquet-generator, so
that we have a version specific to each primitive type:
https://github.com/apache/parquet-mr/blob/master/parquet-generator/src/main/java/org/apache/parquet/filter2/IncrementallyUpdatedFilterPredicateGenerator.java
rb
On 09/16/2015 06:57 PM, Hyukjin Kwon wrote:
Hi all,
I am pretty new to Parquet and am trying to learn the Parquet structure.
I assume that statistics such as min and max have been stored in both
ColumnMetaData and DataPageHeader since 1.6.0 (
https://github.com/Parquet/parquet-mr/pull/338).
I see that the statistics in ColumnMetaData are used by filter2 to filter
blocks (or row groups) in RowGroupFilter by calling canDrop().
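To illustrate the kind of min/max check I mean, here is a minimal sketch of the idea. It is not the parquet-mr implementation: MinMaxDropSketch, IntStats, canDropEq and canDropGt are hypothetical names made up for this example, and it assumes a single int column with no nulls.

```java
// Hypothetical sketch only; none of these names exist in parquet-mr.
final class MinMaxDropSketch {

    /** Stand-in for the min/max statistics stored for one column chunk. */
    static final class IntStats {
        final int min;
        final int max;

        IntStats(int min, int max) {
            this.min = min;
            this.max = max;
        }
    }

    /** For "column == target": drop when target lies outside [min, max]. */
    static boolean canDropEq(IntStats stats, int target) {
        return target < stats.min || target > stats.max;
    }

    /** For "column > bound": drop when even the max is not greater than bound. */
    static boolean canDropGt(IntStats stats, int bound) {
        return stats.max <= bound;
    }

    public static void main(String[] args) {
        IntStats stats = new IntStats(10, 20);
        System.out.println(canDropEq(stats, 5));   // 5 is below min
        System.out.println(canDropEq(stats, 15));  // 15 is inside [10, 20]
        System.out.println(canDropGt(stats, 20));  // nothing exceeds max
    }
}
```

A row group for which such a check returns true cannot contain a matching value, so it can be skipped without reading its pages.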
I thought the statistics in DataPageHeader would similarly be used to skip
reading a page.
However, I could not find where the statistics in DataPageHeader are used,
for either filter1 or filter2.
Could you give me some comments on this, please?
--
Ryan Blue
Software Engineer
Cloudera, Inc.