loudongfeng commented on pull request #925:
URL: https://github.com/apache/parquet-mr/pull/925#issuecomment-907619845


   This patch fixes both the write path and the read path.
   Write path:
   Fix the currentDictionaryPageOffset reuse issue, so that RowGroup.file_offset 
in the Parquet file is correct.
   Read path:
   Support reading Parquet files with a wrong RowGroup.file_offset (by ignoring it).
   I'm not sure how the read path changes will affect encrypted files written 
by parquet 1.12.0.
   However, RowGroup.file_offset in those encrypted files is already set incorrectly.
   
   Another solution for the read path:
   only ignore file_offset when
   1. the file was written by parquet 1.12.0
   2. the file is not encrypted
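
   The alternative read-path rule could be sketched roughly as follows. This is
only an illustration, not the code in this patch: the class FileOffsetPolicy, its
method names, the assumed created_by prefix "parquet-mr version 1.12.0", and the
choice to fall back to the start of the first column chunk are all hypothetical.

```java
// Hypothetical helper; none of these names come from parquet-mr.
final class FileOffsetPolicy {
    // Assumption: files from the affected release carry a created_by string
    // that starts with this prefix, e.g. "parquet-mr version 1.12.0 (build ...)".
    private static final String AFFECTED_PREFIX = "parquet-mr version 1.12.0";

    private FileOffsetPolicy() {}

    // Returns true only when both conditions hold:
    // 1. the file was written by parquet 1.12.0, and
    // 2. the file is not encrypted.
    static boolean shouldIgnoreFileOffset(String createdBy, boolean encrypted) {
        if (encrypted || createdBy == null) {
            return false;
        }
        if (!createdBy.startsWith(AFFECTED_PREFIX)) {
            return false;
        }
        // Require the version token to end here, so that a longer version
        // such as "1.12.01" is not matched by accident.
        return createdBy.length() == AFFECTED_PREFIX.length()
            || createdBy.charAt(AFFECTED_PREFIX.length()) == ' ';
    }

    // Picks the row group start: the stored file_offset when it is trusted,
    // otherwise the start of the first column chunk (an assumed fallback).
    static long rowGroupStart(String createdBy, boolean encrypted,
                              long fileOffset, long firstColumnChunkStart) {
        return shouldIgnoreFileOffset(createdBy, encrypted)
            ? firstColumnChunkStart
            : fileOffset;
    }
}
```

   With this rule, an unencrypted file from 1.12.0 would use the fallback offset,
while encrypted files and files from other versions would keep using file_offset.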



