This is an automated email from the ASF dual-hosted git repository.

maplefu pushed a commit to branch main
in repository https://gitbox.apache.org/repos/asf/arrow.git


The following commit(s) were added to refs/heads/main by this push:
     new 32de498ca7 GH-44668: [Docs] Fix ColumnChunkMetaData offset documentation in pyarrow (#44670)
32de498ca7 is described below

commit 32de498ca7dba5861f22eee5e4527446f6218b7a
Author: GeorgKreuzmayr <[email protected]>
AuthorDate: Thu Nov 7 10:10:51 2024 +0100

    GH-44668: [Docs] Fix ColumnChunkMetaData offset documentation in pyarrow (#44670)
    
    ### Rationale for this change
    
    The pyarrow documentation of ColumnChunkMetaData contradicts the C++ implementation.
    
    The pyarrow [documentation](https://arrow.apache.org/docs/dev/python/generated/pyarrow.parquet.ColumnChunkMetaData.html#pyarrow.parquet.ColumnChunkMetaData.data_page_offset) says:
    The data_page_offset and dictionary_page_offset are relative to the column chunk offset
    
    The C++ [comments in the code](https://github.com/apache/arrow/blob/df24a8225999896eb03db280354fbff42dfea0f5/cpp/src/generated/parquet_types.h#L2896) say:
    The offsets are byte offsets from the beginning of the file to first data_page / dictionary_page
    
    ### What changes are included in this PR?
    
    Update the docstrings to state that `data_page_offset` and `dictionary_page_offset` are relative to the start of the file
    
    ### Are these changes tested?
    
    Verified locally that the C++ code comments are correct
    
    ### Are there any user-facing changes?
    
    Documentation
    
    GitHub Issue: https://github.com/apache/arrow/issues/44668
    
    Authored-by: Georg Kreuzmayr <[email protected]>
    Signed-off-by: mwish <[email protected]>
---
 python/pyarrow/_parquet.pyx | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/python/pyarrow/_parquet.pyx b/python/pyarrow/_parquet.pyx
index 254bfe3b09..a3abf1865b 100644
--- a/python/pyarrow/_parquet.pyx
+++ b/python/pyarrow/_parquet.pyx
@@ -467,7 +467,7 @@ cdef class ColumnChunkMetaData(_Weakrefable):
 
     @property
     def dictionary_page_offset(self):
-        """Offset of dictionary page relative to column chunk offset (int)."""
+        """Offset of dictionary page relative to beginning of the file (int)."""
         if self.has_dictionary_page:
             return self.metadata.dictionary_page_offset()
         else:
@@ -475,7 +475,7 @@ cdef class ColumnChunkMetaData(_Weakrefable):
 
     @property
     def data_page_offset(self):
-        """Offset of data page relative to column chunk offset (int)."""
+        """Offset of data page relative to beginning of the file (int)."""
         return self.metadata.data_page_offset()
 
     @property
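
The corrected docstrings describe both offsets as absolute positions in the file. For readers who want the position of a page relative to its column chunk, the following is a minimal sketch of the conversion. The helper names `chunk_start_offset` and `offset_within_chunk` are hypothetical and not part of pyarrow. The sketch assumes that the dictionary page, when present, precedes the data pages, and that the caller passes `None` when the chunk has no dictionary page.

```python
from typing import Optional


def chunk_start_offset(data_page_offset: int,
                       dictionary_page_offset: Optional[int]) -> int:
    """Return the absolute file position where a column chunk's pages begin.

    Hypothetical helper: both arguments are byte offsets from the
    beginning of the file. Pass None for dictionary_page_offset when the
    chunk has no dictionary page (assumption).
    """
    if dictionary_page_offset is not None:
        # Assumption: the dictionary page, if any, comes before the data pages.
        return min(dictionary_page_offset, data_page_offset)
    return data_page_offset


def offset_within_chunk(page_offset: int, chunk_start: int) -> int:
    """Convert an absolute page offset into one relative to the chunk start."""
    return page_offset - chunk_start


# Example with made-up offsets: dictionary page at byte 4, first data page at byte 120.
start = chunk_start_offset(data_page_offset=120, dictionary_page_offset=4)
relative = offset_within_chunk(120, start)
```

In practice the two inputs would come from the `data_page_offset` and `dictionary_page_offset` properties of a `ColumnChunkMetaData` object, for example one obtained via `pyarrow.parquet.ParquetFile(path).metadata.row_group(i).column(j)`.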
