rok commented on PR #44070:
URL: https://github.com/apache/arrow/pull/44070#issuecomment-2428700815

   @pitrou
   > What's the plan for the Parquet `arrow_extensions_enabled` option?
   
   Perhaps we should open another issue for it? The current implementation seems to round-trip to Parquet fine.
   I'd propose something like this:
   
   ```diff
   diff --git a/python/pyarrow/_parquet.pxd b/python/pyarrow/_parquet.pxd
   index d6aebd8284..32e2618ecf 100644
   --- a/python/pyarrow/_parquet.pxd
   +++ b/python/pyarrow/_parquet.pxd
   @@ -405,6 +405,7 @@ cdef extern from "parquet/api/reader.h" namespace "parquet" nogil:
            CCacheOptions cache_options() const
            void set_coerce_int96_timestamp_unit(TimeUnit unit)
            TimeUnit coerce_int96_timestamp_unit() const
   +        void set_arrow_extensions_enabled(c_bool enabled)
    
        ArrowReaderProperties default_arrow_reader_properties()
    
   diff --git a/python/pyarrow/_parquet.pyx b/python/pyarrow/_parquet.pyx
   index 254bfe3b09..6ae1726c71 100644
   --- a/python/pyarrow/_parquet.pyx
   +++ b/python/pyarrow/_parquet.pyx
   @@ -1441,7 +1441,8 @@ cdef class ParquetReader(_Weakrefable):
                 FileDecryptionProperties decryption_properties=None,
                 thrift_string_size_limit=None,
                 thrift_container_size_limit=None,
   -             page_checksum_verification=False):
   +             page_checksum_verification=False,
   +             arrow_extensions_enabled=False):
            """
            Open a parquet file for reading.
    
   @@ -1458,6 +1459,7 @@ cdef class ParquetReader(_Weakrefable):
            thrift_string_size_limit : int, optional
            thrift_container_size_limit : int, optional
            page_checksum_verification : bool, default False
   +        arrow_extensions_enabled : bool, default False
            """
            cdef:
                shared_ptr[CFileMetaData] c_metadata
   @@ -1522,6 +1524,9 @@ cdef class ParquetReader(_Weakrefable):
            if read_dictionary is not None:
                self._set_read_dictionary(read_dictionary, &arrow_props)
    
   +        if arrow_extensions_enabled:
   +            arrow_props.set_arrow_extensions_enabled(<c_bool>True)
   +
            with nogil:
                check_status(builder.memory_pool(self.pool)
                             .properties(arrow_props)
   ```
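
The opt-in forwarding in the last hunk can be sketched in plain Python, without building pyarrow. This is only an illustration under stated assumptions: `FakeArrowReaderProperties` and `build_reader_properties` are hypothetical stand-ins, not pyarrow APIs. Only the `arrow_extensions_enabled` keyword and the `set_arrow_extensions_enabled` setter name come from the diff above.

```python
from dataclasses import dataclass


@dataclass
class FakeArrowReaderProperties:
    """Hypothetical stand-in for the C++ ArrowReaderProperties object."""

    # Assumed default: the proposed keyword defaults to False.
    arrow_extensions_enabled: bool = False

    def set_arrow_extensions_enabled(self, enabled: bool) -> None:
        # Mirrors the setter declared in the .pxd hunk of the diff.
        self.arrow_extensions_enabled = bool(enabled)


def build_reader_properties(
    arrow_extensions_enabled: bool = False,
) -> FakeArrowReaderProperties:
    """Hypothetical helper: forward the keyword only when it is truthy."""
    props = FakeArrowReaderProperties()
    if arrow_extensions_enabled:
        props.set_arrow_extensions_enabled(True)
    return props


if __name__ == "__main__":
    print(build_reader_properties())
    print(build_reader_properties(arrow_extensions_enabled=True))
```

Calling the setter only when the flag is truthy leaves the underlying default in place for existing callers, so the new keyword is strictly opt-in.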

