This is an automated email from the ASF dual-hosted git repository.

jorisvandenbossche pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/arrow.git


The following commit(s) were added to refs/heads/master by this push:
     new 43670af02f ARROW-17583: [C++][Python] Changed datawidth of WrittenFile.size to int64 to match C++ code (#14032)
43670af02f is described below

commit 43670af02f0913580fd20e26006fd550d6fdf2da
Author: Joost Hoozemans <[email protected]>
AuthorDate: Thu Sep 8 10:36:48 2022 +0200

    ARROW-17583: [C++][Python] Changed datawidth of WrittenFile.size to int64 to match C++ code (#14032)
    
    To fix an exception while writing large parquet files:
    ```
    Traceback (most recent call last):
      File "pyarrow/_dataset_parquet.pyx", line 165, in pyarrow._dataset_parquet.ParquetFileFormat._finish_write
      File "pyarrow/dataset.pyx", line 2695, in pyarrow._dataset.WrittenFile.init_
    OverflowError: value too large to convert to int
    Exception ignored in: 'pyarrow._dataset._filesystemdataset_write_visitor'
    ```
    
    Authored-by: Joost Hoozemans <[email protected]>
    Signed-off-by: Joris Van den Bossche <[email protected]>
---
 python/pyarrow/_dataset.pxd | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/python/pyarrow/_dataset.pxd b/python/pyarrow/_dataset.pxd
index 8e5501fa16..a512477d50 100644
--- a/python/pyarrow/_dataset.pxd
+++ b/python/pyarrow/_dataset.pxd
@@ -161,4 +161,4 @@ cdef class WrittenFile(_Weakrefable):
     # the written file.
     cdef public object metadata
     # The size of the file in bytes
-    cdef public int size
+    cdef public int64_t size
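
For readers unfamiliar with the failure mode, the range check behind the OverflowError can be sketched in plain Python. This is only an illustration, not pyarrow or Cython code: `fits_signed`, `store_size`, and the 3 GiB `file_size` are hypothetical names and values, and the sketch assumes the C `int` in the old declaration is 32 bits wide, as it is on common platforms.

```python
def fits_signed(value: int, bits: int) -> bool:
    """Return True if `value` is representable as a signed integer of `bits` bits."""
    return -(1 << (bits - 1)) <= value <= (1 << (bits - 1)) - 1


def store_size(value: int, bits: int) -> int:
    """Hypothetical stand-in for assigning to a typed `size` attribute."""
    if not fits_signed(value, bits):
        # Mirrors the message in the traceback quoted in the commit message.
        raise OverflowError("value too large to convert to int")
    return value


file_size = 3 * 1024**3  # assumed example: a 3 GiB file, in bytes

print(fits_signed(file_size, 32))  # old field width (assuming a 32-bit C int)
print(fits_signed(file_size, 64))  # new field width (int64_t)
```

Under these assumptions, any file larger than 2**31 - 1 bytes (just under 2 GiB) exceeds the old field width but fits comfortably in the 64-bit field introduced by the diff.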
