[GitHub] [parquet-mr] loudongfeng commented on pull request #925: PARQUET-2078: Failed to read parquet file after writing with the same …

2021-08-31 Thread GitBox
loudongfeng commented on pull request #925: URL: https://github.com/apache/parquet-mr/pull/925#issuecomment-909000698 FYI, maybe we can make use of this information: RowGroup[n].file_offset = RowGroup[n-1].file_offset + RowGroup[n-1].total_compressed_size total_compressed_size always
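
The relation quoted in the comment can be illustrated with a small sketch. This is not code from parquet-mr: `RowGroupMeta`, `expected_offsets`, and `suspicious_row_groups` are hypothetical names, the numbers are made up, and the sketch assumes row groups are laid out back to back and that the first row group's offset can be trusted.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class RowGroupMeta:
    # Hypothetical stand-in for the RowGroup footer metadata; only the two
    # fields named in the comment are modeled.
    file_offset: int
    total_compressed_size: int


def expected_offsets(row_groups: List[RowGroupMeta]) -> List[int]:
    """Recompute offsets with
    RowGroup[n].file_offset = RowGroup[n-1].file_offset + RowGroup[n-1].total_compressed_size.
    The first offset is taken as given; later ones are chained from the
    recomputed predecessor, not from the stored (possibly wrong) value."""
    if not row_groups:
        return []
    offsets = [row_groups[0].file_offset]
    for prev in row_groups[:-1]:
        offsets.append(offsets[-1] + prev.total_compressed_size)
    return offsets


def suspicious_row_groups(row_groups: List[RowGroupMeta]) -> List[int]:
    """Return indexes whose stored file_offset disagrees with the recomputed one."""
    expected = expected_offsets(row_groups)
    return [
        i
        for i, (rg, exp) in enumerate(zip(row_groups, expected))
        if rg.file_offset != exp
    ]


# Made-up example: the third row group carries a stale offset.
groups = [RowGroupMeta(4, 100), RowGroupMeta(104, 50), RowGroupMeta(4, 70)]
print(expected_offsets(groups))
print(suspicious_row_groups(groups))
```

Whether this relation holds for every writer is an assumption of the sketch, so a reader relying on it would presumably still want a fallback when the recomputed offsets look implausible.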

[GitHub] [parquet-mr] loudongfeng commented on pull request #925: PARQUET-2078: Failed to read parquet file after writing with the same …

2021-08-28 Thread GitBox
loudongfeng commented on pull request #925: URL: https://github.com/apache/parquet-mr/pull/925#issuecomment-907619845 This patch fixes both the write path and the read path. Write path: fix the currentDictionaryPageOffset reuse issue, then RowGroup.file_offset in the Parquet file will be