AlenkaF commented on code in PR #48255:
URL: https://github.com/apache/arrow/pull/48255#discussion_r2642540189
##########
python/pyarrow/parquet/core.py:
##########
@@ -2422,8 +2429,13 @@ def read_schema(where, memory_map=False,
decryption_properties=None,
with file_ctx:
file = ParquetFile(
- where, memory_map=memory_map,
- decryption_properties=decryption_properties)
+ where,
+ memory_map=memory_map,
+ decryption_properties=decryption_properties,
+ arrow_extensions_enabled=arrow_extensions_enabled,
+ )
+ if arrow_extensions_enabled:
+ return file.schema_arrow
Review Comment:
We might want to return `file.schema_arrow` in any case, not only when `arrow_extensions_enabled` is set?
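A minimal sketch of what the unconditional version could look like. `StubParquetFile` and `read_schema_sketch` are hypothetical stand-ins, not the real pyarrow code. The sketch assumes the file object already applies `arrow_extensions_enabled` when it builds `schema_arrow`, which is what would make the branch in `read_schema` unnecessary.

```python
class StubParquetFile:
    """Hypothetical stand-in for ParquetFile; not the real pyarrow class."""

    def __init__(self, where, arrow_extensions_enabled=True):
        self.where = where
        self.arrow_extensions_enabled = arrow_extensions_enabled

    @property
    def schema_arrow(self):
        # Assumption: the flag is already applied when the Arrow schema is
        # built, so a UUID column maps to an extension type only if enabled.
        if self.arrow_extensions_enabled:
            return {"id": "extension<arrow.uuid>"}
        return {"id": "fixed_size_binary[16]"}


def read_schema_sketch(where, arrow_extensions_enabled=True):
    file = StubParquetFile(
        where, arrow_extensions_enabled=arrow_extensions_enabled)
    # Single return path: no branch on arrow_extensions_enabled here.
    return file.schema_arrow
```

If that assumption holds for the real `ParquetFile`, the caller still controls the result through the flag, and `read_schema` keeps one code path.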
##########
python/pyarrow/parquet/core.py:
##########
@@ -2347,6 +2347,10 @@ def read_metadata(where, memory_map=False,
decryption_properties=None,
If nothing passed, will be inferred based on path.
Path will try to be found in the local on-disk filesystem otherwise
it will be parsed as an URI to determine the filesystem.
+ arrow_extensions_enabled : bool, default True
Review Comment:
If `arrow_extensions_enabled` is to be kept in `read_metadata`, it should be
added to the `read_metadata` signature above and passed to `ParquetFile` below.
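Roughly, the shape I have in mind is sketched here. `RecordingParquetFile` and `read_metadata_sketch` are hypothetical names used only for illustration, and the other `read_metadata` parameters (for example `filesystem`) are left out for brevity.

```python
calls = []  # records the keyword arguments seen by the stand-in constructor


class RecordingParquetFile:
    """Hypothetical stand-in for ParquetFile; not the real pyarrow class."""

    def __init__(self, where, **kwargs):
        calls.append(kwargs)
        self.metadata = {"path": where}  # placeholder for a FileMetaData object


def read_metadata_sketch(where, memory_map=False, decryption_properties=None,
                         arrow_extensions_enabled=True):
    # The documented keyword is part of the signature and is forwarded
    # unchanged to the file constructor.
    file = RecordingParquetFile(
        where,
        memory_map=memory_map,
        decryption_properties=decryption_properties,
        arrow_extensions_enabled=arrow_extensions_enabled,
    )
    return file.metadata
```

Without both the signature entry and the forwarding, the docstring would describe a parameter that callers cannot actually set.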
##########
python/pyarrow/tests/parquet/test_data_types.py:
##########
@@ -604,6 +604,25 @@ def test_uuid_extension_type():
store_schema=False)
+def test_read_schema_uuid_extension_type(tmp_path):
Review Comment:
I would suggest moving this test to `tests/parquet/test_metadata.py` and
also adding a check for `read_metadata`.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.