loudongfeng commented on pull request #925:
URL: https://github.com/apache/parquet-mr/pull/925#issuecomment-909000698
FYI, maybe we can make use of this information:
RowGroup[n].file_offset = RowGroup[n-1].file_offset +
RowGroup[n-1].total_compressed_size
total_compressed_size always
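
A rough sketch of how that relation could be used to rebuild the offsets, assuming row groups are written back to back so the formula above holds. The class and method names here are hypothetical and are not part of parquet-mr; only the two field names come from the comment above.

```java
import java.util.Arrays;

// Hypothetical helper, not parquet-mr API: rebuilds RowGroup.file_offset values
// from the first row group's offset and each row group's total_compressed_size.
final class RowGroupOffsets {

    // Applies: file_offset[n] = file_offset[n-1] + total_compressed_size[n-1].
    // Assumption: row groups are contiguous in the file.
    static long[] recomputeOffsets(long firstFileOffset, long[] totalCompressedSizes) {
        long[] offsets = new long[totalCompressedSizes.length];
        long next = firstFileOffset;
        for (int n = 0; n < totalCompressedSizes.length; n++) {
            offsets[n] = next;
            // Math.addExact throws ArithmeticException on overflow instead of wrapping.
            next = Math.addExact(next, totalCompressedSizes[n]);
        }
        return offsets;
    }

    public static void main(String[] args) {
        // Illustrative values only: the first row group starts at offset 4,
        // right after the 4-byte "PAR1" magic at the start of the file.
        long[] sizes = {100L, 250L, 50L};
        System.out.println(Arrays.toString(recomputeOffsets(4L, sizes)));
    }
}
```

A reader could compare the recomputed values against the stored file_offset of each row group and fall back to the recomputed one when they disagree.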
loudongfeng commented on pull request #925:
URL: https://github.com/apache/parquet-mr/pull/925#issuecomment-907619845
This patch fixes both the write path and the read path.
Write path:
Fix the currentDictionaryPageOffset reuse issue; RowGroup.file_offset
in the Parquet file will then be