emkornfield commented on PR #197:
URL: https://github.com/apache/parquet-format/pull/197#issuecomment-1492748667

   > Do we want to include these statistics at both row group (column chunk) 
and page level? For the latter I am not sure it is the right approach. We 
implemented column indexes so one would not need to read the page header to get 
the related statistics. We even stopped writing `Statistics` into page headers 
in parquet-mr. If we only want these for the column chunk level then I would 
suggest having it under `ColumnMetaData` directly.
   
   @gszadovsky
   Is there an argument against flexibility here? I believe parquet-cpp still 
writes `Statistics` into page headers. One argument for page-level statistics 
is that they give readers better incremental estimates of the memory needed as 
they progress (although an average size per cell taken at the column chunk 
level may be sufficient here).
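
To make the trade-off concrete, here is a minimal sketch of the two estimation strategies. It is an illustration only: `PageInfo`, `unencoded_bytes`, and `estimate_page_bytes` are hypothetical names, not fields or functions from the Parquet format or from any Parquet library. A page that carries its own size statistic uses it directly; a page without one falls back to the column chunk average size per cell.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class PageInfo:
    """Hypothetical view of one data page (not a real Parquet struct)."""
    num_values: int
    # Hypothetical page-level size statistic; None if the writer omitted it.
    unencoded_bytes: Optional[int] = None


def estimate_page_bytes(pages: List[PageInfo], chunk_unencoded_bytes: int) -> List[int]:
    """Estimate the decoded size of each page in a column chunk.

    Uses the page-level statistic when present, otherwise scales the
    column chunk average bytes per value by the page's value count.
    """
    total_values = sum(p.num_values for p in pages)
    avg_per_value = chunk_unencoded_bytes / total_values if total_values else 0.0
    return [
        p.unencoded_bytes
        if p.unencoded_bytes is not None
        else round(p.num_values * avg_per_value)
        for p in pages
    ]
```

With evenly sized values the two approaches agree; they diverge when value sizes are skewed across pages, which is the case where page-level statistics would help a reader budget memory incrementally.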


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.