bretttully opened a new pull request, #44720:
URL: https://github.com/apache/arrow/pull/44720

   ### Rationale for this change
   
   This is a long-standing 
[ticket](https://github.com/pandas-dev/pandas/issues/53011) with some fairly 
horrible workarounds: complex Arrow types do not round-trip to pandas because 
the pandas metadata string cannot be parsed. `types_mapper` already had the 
highest priority, since it overrode whatever had been set before it, but it 
was only applied after the metadata had been parsed. 
   
   ### What changes are included in this PR?
   
   Switching the logical ordering so that `types_mapper` is consulted first 
means we no longer need to call `_pandas_api.pandas_dtype(dtype)` when using 
the pyarrow backend, which resolves the issue for complex `dtype`s involving 
`list` or `struct`. Loading will likely still fail with the numpy backend, but 
this at least gives a working solution rather than an inability to load files 
at all.
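
   To illustrate the reordering, here is a minimal, self-contained sketch. It is 
not the pyarrow implementation: `resolve_dtypes` and `parse_metadata_dtype` are 
hypothetical names, Arrow types are represented as plain strings, and the 
stand-in parser simply rejects nested types to mimic the failure described 
above.

```python
from typing import Callable, Dict, Optional


def parse_metadata_dtype(dtype_string: str) -> str:
    """Hypothetical stand-in for parsing a dtype out of the pandas metadata.

    It rejects nested types to mimic the parse failure for list/struct.
    """
    if "<" in dtype_string:
        raise TypeError(f"data type {dtype_string!r} not understood")
    return dtype_string


def resolve_dtypes(
    columns: Dict[str, str],
    types_mapper: Optional[Callable[[str], Optional[str]]] = None,
) -> Dict[str, str]:
    """Resolve a dtype per column, consulting types_mapper first."""
    resolved: Dict[str, str] = {}

    # Step 1: the user-supplied mapper has the highest priority, so apply it
    # before anything else.
    if types_mapper is not None:
        for name, arrow_type in columns.items():
            mapped = types_mapper(arrow_type)
            if mapped is not None:
                resolved[name] = mapped

    # Step 2: fall back to the metadata string only for columns the mapper
    # did not handle, so the fragile parser is never reached for them.
    for name, arrow_type in columns.items():
        if name not in resolved:
            resolved[name] = parse_metadata_dtype(arrow_type)

    return resolved


columns = {"a": "int64", "b": "list<item: int64>"}

# With a mapper that handles every type (as a pyarrow-backed mapper would),
# the nested column resolves without touching the metadata parser.
with_mapper = resolve_dtypes(columns, types_mapper=lambda t: f"ArrowDtype({t})")
```

   Calling `resolve_dtypes(columns)` without a mapper still raises `TypeError` 
for column `b` in this sketch, which mirrors the expectation that the numpy 
backend will likely keep failing.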
   
   ### Are these changes tested?
   
   Existing tests should pass unchanged, and a new test for the complex type 
has been added.
   
   ### Are there any user-facing changes?
   
   **This PR contains a "Critical Fix".**
   This makes `pd.read_parquet(..., dtype_backend="pyarrow")` work with complex 
data types, where the metadata added by pyarrow during `DataFrame.to_parquet` 
cannot be parsed and currently raises an exception. This issue currently 
prevents the use of pyarrow as the default backend for pandas.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.