GeorgKreuzmayr opened a new pull request, #44670:
URL: https://github.com/apache/arrow/pull/44670

   ### Rationale for this change
   
   The pyarrow documentation of ColumnMetaData contradicts the C++ 
implementation.
   
   The pyarrow 
[documentation](https://arrow.apache.org/docs/dev/python/generated/pyarrow.parquet.ColumnChunkMetaData.html#pyarrow.parquet.ColumnChunkMetaData.data_page_offset)
 says:
   The data_page_offset and dictionary_page_offset are relative to the column 
chunk offset
   
   The C++ [comments in the 
code](https://github.com/apache/arrow/blob/df24a8225999896eb03db280354fbff42dfea0f5/cpp/src/generated/parquet_types.h#L2896)
 say:
   The offsets are byte offsets from the beginning of the file to first 
data_page / dictionary_page
   
   ### What changes are included in this PR?
   
   Update the comments to state that `data_page_offset` and 
`dictionary_page_offset` are relative to the start of the file
   
   ### Are these changes tested?
   
   Verified locally that the C++ code comments are correct
   
   ### Are there any user-facing changes?
   
   Documentation
   
   GitHub Issue: https://github.com/apache/arrow/issues/44668


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]
