This is an automated email from the ASF dual-hosted git repository.
jorisvandenbossche pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/arrow.git
The following commit(s) were added to refs/heads/master by this push:
new 43670af02f ARROW-17583: [C++][Python] Changed datawidth of
WrittenFile.size to int64 to match C++ code (#14032)
43670af02f is described below
commit 43670af02f0913580fd20e26006fd550d6fdf2da
Author: Joost Hoozemans <[email protected]>
AuthorDate: Thu Sep 8 10:36:48 2022 +0200
ARROW-17583: [C++][Python] Changed datawidth of WrittenFile.size to int64
to match C++ code (#14032)
Fixes an exception raised while writing large Parquet files, where the file size in bytes exceeds the range of a C `int`:
```
Traceback (most recent call last):
File "pyarrow/_dataset_parquet.pyx", line 165, in pyarrow._dataset_parquet.ParquetFileFormat._finish_write
File "pyarrow/_dataset.pyx", line 2695, in pyarrow._dataset.WrittenFile.__init__
OverflowError: value too large to convert to int
Exception ignored in: 'pyarrow._dataset._filesystemdataset_write_visitor'
```
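The overflow can be illustrated outside of pyarrow with a minimal sketch. It assumes that the Cython `int` attribute maps to a 32-bit signed C `int`, as on common platforms. The helper `fits_in` and the 3 GiB example size are hypothetical and are not part of the Arrow code base.

```python
import struct

INT32_MAX = 2**31 - 1  # largest value of a 32-bit signed int
INT64_MAX = 2**63 - 1  # largest value of a 64-bit signed int


def fits_in(fmt, value):
    """Return True if `value` can be packed into the given struct format."""
    try:
        struct.pack(fmt, value)
    except struct.error:
        return False
    return True


# Hypothetical size of a large written file: 3 GiB, in bytes.
size = 3 * 1024**3

fits_int32 = fits_in("<i", size)  # "<i" is a 32-bit signed integer
fits_int64 = fits_in("<q", size)  # "<q" is a 64-bit signed integer
print(fits_int32, fits_int64)  # prints: False True
```

Under that assumption, any file larger than 2 GiB minus one byte cannot be stored in the old attribute, while the `int64_t` attribute matches the 64-bit size reported by the C++ code.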
Authored-by: Joost Hoozemans <[email protected]>
Signed-off-by: Joris Van den Bossche <[email protected]>
---
python/pyarrow/_dataset.pxd | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/python/pyarrow/_dataset.pxd b/python/pyarrow/_dataset.pxd
index 8e5501fa16..a512477d50 100644
--- a/python/pyarrow/_dataset.pxd
+++ b/python/pyarrow/_dataset.pxd
@@ -161,4 +161,4 @@ cdef class WrittenFile(_Weakrefable):
# the written file.
cdef public object metadata
# The size of the file in bytes
- cdef public int size
+ cdef public int64_t size