omatthew98 commented on PR #45471:
URL: https://github.com/apache/arrow/pull/45471#issuecomment-2682970768

   > Just FYI, this might cause a backward-incompatibility issue because user-defined extension types do not expect `maps_as_pydicts` as an argument to `as_py()`. We are seeing this in our prerelease tests:
   > 
   > ```
   > _________________________ test_json_arrow_record_batch _________________________
   > 
   >     def test_json_arrow_record_batch():
   >         data = [
   >             json.dumps(value, sort_keys=True, separators=(",", ":"))
   >             for value in JSON_DATA.values()
   >         ]
   >         arr = pa.array(data, type=db_dtypes.JSONArrowType())
   >         batch = pa.RecordBatch.from_arrays([arr], ["json_col"])
   >         sink = pa.BufferOutputStream()
   >     
   >         with pa.RecordBatchStreamWriter(sink, batch.schema) as writer:
   >             writer.write_batch(batch)
   >     
   >         buf = sink.getvalue()
   >     
   >         with pa.ipc.open_stream(buf) as reader:
   >             result = reader.read_all()
   >     
   >         json_col = result.column("json_col")
   >         assert isinstance(json_col.type, db_dtypes.JSONArrowType)
   >     
   > >       s = json_col.to_pylist()
   > 
   > tests/unit/test_json.py:225: 
   > _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 
_ _ _ 
   > pyarrow/table.pxi:1380: in pyarrow.lib.ChunkedArray.to_pylist
   >     ???
   > _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 
_ _ _ 
   > 
   > >   ???
   > E   TypeError: JSONArrowScalar.as_py() got an unexpected keyword argument 'maps_as_pydicts'
   > 
   > pyarrow/array.pxi:1677: TypeError
   > - generated xml file: /home/runner/work/python-db-dtypes-pandas/python-db-dtypes-pandas/unit_prerelease_3.12_sponge_log.xml -
   > =========================== short test summary info ============================
   > FAILED tests/unit/test_json.py::test_json_arrow_to_pylist - TypeError: JSONArrowScalar.as_py() got an unexpected keyword argument 'maps_as_pydicts'
   > FAILED tests/unit/test_json.py::test_json_arrow_record_batch - TypeError: JSONArrowScalar.as_py() got an unexpected keyword argument 'maps_as_pydicts'
   > 2 failed, 298 passed in 1.81s
   > ```
   > 
   > (Link: https://github.com/googleapis/python-db-dtypes-pandas/actions/runs/13017353125/job/37736481135?pr=310)
   
   We (the Ray Data team) are also hitting this backward-compatibility issue in our tests against the pyarrow nightly build, with the same error as the one reported above:
   ```
   [2025-02-25T06:27:20Z] =================================== FAILURES ===================================
   ____________ test_convert_to_pyarrow_array_object_ext_type_fallback ____________

       def test_convert_to_pyarrow_array_object_ext_type_fallback():
           column_values = create_ragged_ndarray(
               [
                   "hi",
                   1,
                   None,
                   [[[[]]]],
                   {"a": [[{"b": 2, "c": UserObj(i=123)}]]},
                   UserObj(i=456),
               ]
           )
           column_name = "py_object_column"

           # First, assert that straightforward conversion into Arrow native types fails
           with pytest.raises(ArrowConversionError) as exc_info:
               _convert_to_pyarrow_native_array(column_values, column_name)

           assert (
               str(exc_info.value)
               == "Error converting data to Arrow: ['hi' 1 None list([[[[]]]]) {'a': [[{'b': 2, 'c': UserObj(i=123)}]]}\n UserObj(i=456)]"  # noqa: E501
           )

           # Subsequently, assert that fallback to `ArrowObjectExtensionType` succeeds
           pa_array = convert_to_pyarrow_array(column_values, column_name)

   >       assert pa_array.to_pylist() == column_values.tolist()

   python/ray/air/tests/test_arrow.py:121:
   _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _

   >   ???
   E   TypeError: as_py() got an unexpected keyword argument 'maps_as_pydicts'
   ```
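
Both failures have the same shape: a container method now forwards `maps_as_pydicts` to each scalar's `as_py()`, and an override written before the change does not accept it. The sketch below reproduces that shape in plain Python without importing pyarrow. The class names, the `to_pylist` helper, and the `**kwargs` workaround are all hypothetical stand-ins for illustration, not the actual pyarrow, db-dtypes, or Ray code.

```python
import json


class OldJSONScalar:
    """Hypothetical pre-change override: as_py() takes no keyword arguments."""

    def __init__(self, raw):
        self.raw = raw

    def as_py(self):
        return json.loads(self.raw)


class TolerantJSONScalar(OldJSONScalar):
    """Hypothetical workaround: accept (and ignore) keywords the caller passes."""

    def as_py(self, **kwargs):
        # Options such as maps_as_pydicts do not affect this conversion.
        return json.loads(self.raw)


def to_pylist(scalars, maps_as_pydicts=None):
    """Stand-in for a container method that forwards the new keyword."""
    return [s.as_py(maps_as_pydicts=maps_as_pydicts) for s in scalars]


# The old override fails as soon as the keyword is forwarded.
try:
    to_pylist([OldJSONScalar('{"a":1}')])
    old_error = None
except TypeError as exc:
    old_error = str(exc)

# The tolerant override keeps working with or without the keyword.
new_result = to_pylist([TolerantJSONScalar('{"a":1}')])
```

Accepting `**kwargs` in a downstream override is one possible way to stay compatible with both older and newer callers; whether the fix belongs downstream or in the forwarding code is the question this thread raises.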

