loudongfeng commented on pull request #925: URL: https://github.com/apache/parquet-mr/pull/925#issuecomment-907619845
This patch fixes both the write path and the read path.

Write path: fix the currentDictionaryPageOffset reuse issue, so that RowGroup.file_offset is written correctly to the Parquet file.

Read path: support reading Parquet files that have a wrong RowGroup.file_offset (by ignoring it).

I'm not sure how the read path changes will affect encrypted files written by parquet 1.12.0. However, RowGroup.file_offset in those encrypted files is already set incorrectly.

Another solution for the read path: only ignore file_offset when
1. the file version is parquet 1.12.0, and
2. the file is not encrypted.
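As a rough illustration of the alternative read-path rule proposed in this comment, the check could look something like the sketch below. It reads the two conditions as both being required. The class `FileOffsetPolicy`, the method `shouldIgnoreFileOffset`, and the assumed `created_by` prefix `parquet-mr version 1.12.0` are hypothetical and are not taken from the patch; how the reader learns whether a file is encrypted is left to the caller.

```java
// Hypothetical helper, not part of parquet-mr: decides whether a reader
// should distrust RowGroup.file_offset under the narrower rule above.
final class FileOffsetPolicy {
    // Assumption: files written by parquet-mr 1.12.0 carry a created_by
    // string that starts with this prefix.
    private static final String AFFECTED_WRITER = "parquet-mr version 1.12.0";

    private FileOffsetPolicy() {}

    /**
     * Returns true only when the file was written by the affected version
     * and is not encrypted; in every other case file_offset is trusted.
     */
    static boolean shouldIgnoreFileOffset(String createdBy, boolean encrypted) {
        if (createdBy == null || encrypted) {
            return false;
        }
        if (!createdBy.startsWith(AFFECTED_WRITER)) {
            return false;
        }
        // Require the version token to end here, so that a longer version
        // string sharing the same prefix is not matched by accident.
        int end = AFFECTED_WRITER.length();
        return createdBy.length() == end
                || Character.isWhitespace(createdBy.charAt(end));
    }
}
```

A reader would call `FileOffsetPolicy.shouldIgnoreFileOffset(createdBy, encrypted)` once per file and, when it returns true, locate row groups without relying on RowGroup.file_offset.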