This is an automated email from the ASF dual-hosted git repository.
maplefu pushed a commit to branch main
in repository https://gitbox.apache.org/repos/asf/arrow.git
The following commit(s) were added to refs/heads/main by this push:
new 32de498ca7 GH-44668: [Docs] Fix ColumnChunkMetaData offset
documentation in pyarrow (#44670)
32de498ca7 is described below
commit 32de498ca7dba5861f22eee5e4527446f6218b7a
Author: GeorgKreuzmayr <[email protected]>
AuthorDate: Thu Nov 7 10:10:51 2024 +0100
GH-44668: [Docs] Fix ColumnChunkMetaData offset documentation in pyarrow
(#44670)
### Rationale for this change
The pyarrow documentation of ColumnChunkMetaData contradicts the C++
implementation.
The pyarrow
[documentation](https://arrow.apache.org/docs/dev/python/generated/pyarrow.parquet.ColumnChunkMetaData.html#pyarrow.parquet.ColumnChunkMetaData.data_page_offset)
says:
The data_page_offset and dictionary_page_offset are relative to the column
chunk offset
The C++ [comments in the
code](https://github.com/apache/arrow/blob/df24a8225999896eb03db280354fbff42dfea0f5/cpp/src/generated/parquet_types.h#L2896)
say:
The offsets are byte offsets from the beginning of the file to first
data_page / dictionary_page
### What changes are included in this PR?
Update the docstrings to state that `data_page_offset` and
`dictionary_page_offset` are relative to the start of the file
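A minimal sketch of why the distinction matters, assuming a deliberately simplified byte layout rather than a real Parquet file. The names `MAGIC`, `OTHER_CHUNK`, `DICT_PAGE`, `DATA_PAGE`, `read_at` and `chunk_start` are hypothetical and exist only for this illustration. A reader that treats the offset as relative to the column chunk seeks to the wrong position:

```python
import io

# Hypothetical, simplified layout: NOT a real Parquet file, only labelled
# byte ranges that mimic where pages could sit inside a file.
MAGIC = b"PAR1"
OTHER_CHUNK = b"x" * 16   # stand-in for an earlier column chunk
DICT_PAGE = b"DICT-PAGE"
DATA_PAGE = b"DATA-PAGE"

buf = io.BytesIO()
buf.write(MAGIC)
buf.write(OTHER_CHUNK)
dictionary_page_offset = buf.tell()  # counted from the beginning of the file
buf.write(DICT_PAGE)
data_page_offset = buf.tell()        # also counted from the beginning of the file
buf.write(DATA_PAGE)


def read_at(stream, offset, length):
    """Seek to an absolute position in the stream and read `length` bytes."""
    stream.seek(offset)
    return stream.read(length)


# In this sketch the column chunk starts at its dictionary page.
chunk_start = dictionary_page_offset

# Correct: use the offset directly as a position in the file.
correct = read_at(buf, data_page_offset, len(DATA_PAGE))

# Incorrect: treat the offset as relative to the column chunk start.
wrong = read_at(buf, chunk_start + data_page_offset, len(DATA_PAGE))

print(correct == DATA_PAGE)  # the absolute interpretation finds the page
print(wrong == DATA_PAGE)    # the chunk-relative interpretation does not
```

With the old wording, a caller could plausibly have added the column chunk start to these values, as in the `wrong` read here.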
### Are these changes tested?
Verified locally that the C++ code comments are correct.
### Are there any user-facing changes?
Documentation
GitHub Issue: https://github.com/apache/arrow/issues/44668
* GitHub Issue: #44668
Authored-by: Georg Kreuzmayr <[email protected]>
Signed-off-by: mwish <[email protected]>
---
python/pyarrow/_parquet.pyx | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/python/pyarrow/_parquet.pyx b/python/pyarrow/_parquet.pyx
index 254bfe3b09..a3abf1865b 100644
--- a/python/pyarrow/_parquet.pyx
+++ b/python/pyarrow/_parquet.pyx
@@ -467,7 +467,7 @@ cdef class ColumnChunkMetaData(_Weakrefable):
@property
def dictionary_page_offset(self):
- """Offset of dictionary page relative to column chunk offset (int)."""
+ """Offset of dictionary page relative to beginning of the file (int)."""
if self.has_dictionary_page:
return self.metadata.dictionary_page_offset()
else:
@@ -475,7 +475,7 @@ cdef class ColumnChunkMetaData(_Weakrefable):
@property
def data_page_offset(self):
- """Offset of data page relative to column chunk offset (int)."""
+ """Offset of data page relative to beginning of the file (int)."""
return self.metadata.data_page_offset()
@property