ggershinsky commented on pull request #925:
URL: https://github.com/apache/parquet-mr/pull/925#issuecomment-916116455


   > it seems there is also a bug in parquet-cpp which causes incorrect file 
offset to be written, see https://issues.apache.org/jira/browse/SPARK-36696, so 
we'll want to make sure the solution here works for that case as well.
   
   Yep, it does. I took the file posted in that Jira issue and read it with 
Spark using parquet 1.12.0; this indeed fails. After adding this fix to 
parquet, the read succeeds. This is because for regular files (and most 
encrypted files), the fix ignores the `RowGroup.offset` field and reverts the 
offset computation to the pre-1.12 behavior.
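
   To illustrate the idea, here is a minimal sketch, not the actual parquet-mr 
code. It assumes the pre-1.12 behavior derives a row group's start from its 
first column chunk (the dictionary page offset when one is set and precedes 
the data page, otherwise the data page offset). The class and function names 
are hypothetical stand-ins for the Thrift metadata structures.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class ColumnChunkMeta:
    # Hypothetical stand-in for the column chunk metadata in the footer.
    data_page_offset: int
    dictionary_page_offset: Optional[int] = None


@dataclass
class RowGroupMeta:
    # Hypothetical stand-in for the row group metadata in the footer.
    columns: List[ColumnChunkMeta]
    # Offset recorded by the writer; may be wrong in files from buggy writers.
    offset: Optional[int] = None


def column_start(col: ColumnChunkMeta) -> int:
    """Start of a column chunk: the dictionary page if it comes first."""
    start = col.data_page_offset
    dict_off = col.dictionary_page_offset
    if dict_off is not None and dict_off < start:
        start = dict_off
    return start


def row_group_start(rg: RowGroupMeta) -> int:
    """Assumed pre-1.12 style: ignore rg.offset, use the first column chunk."""
    return column_start(rg.columns[0])


# A row group whose recorded offset is bogus still resolves correctly.
rg = RowGroupMeta(
    columns=[ColumnChunkMeta(data_page_offset=120, dictionary_page_offset=4)],
    offset=0,
)
start = row_group_start(rg)
```

   Because the recorded row group offset is never consulted in this sketch, a 
file with an incorrect value in that field is read the same way as a correct 
one.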

