omatthew98 commented on PR #45471: URL: https://github.com/apache/arrow/pull/45471#issuecomment-2682970768
> Just FYI, this might cause a backward-incompatibility issue because user-defined extension types do not expect `maps_as_pydicts` as an argument to `as_py()`. We are seeing this in our prerelease tests:
>
> ```
> _________________________ test_json_arrow_record_batch _________________________
>
>     def test_json_arrow_record_batch():
>         data = [
>             json.dumps(value, sort_keys=True, separators=(",", ":"))
>             for value in JSON_DATA.values()
>         ]
>         arr = pa.array(data, type=db_dtypes.JSONArrowType())
>         batch = pa.RecordBatch.from_arrays([arr], ["json_col"])
>         sink = pa.BufferOutputStream()
>
>         with pa.RecordBatchStreamWriter(sink, batch.schema) as writer:
>             writer.write_batch(batch)
>
>         buf = sink.getvalue()
>
>         with pa.ipc.open_stream(buf) as reader:
>             result = reader.read_all()
>
>         json_col = result.column("json_col")
>         assert isinstance(json_col.type, db_dtypes.JSONArrowType)
>
> >       s = json_col.to_pylist()
>
> tests/unit/test_json.py:225:
> _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
> pyarrow/table.pxi:1380: in pyarrow.lib.ChunkedArray.to_pylist
>     ???
> _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
>
> >   ???
> E   TypeError: JSONArrowScalar.as_py() got an unexpected keyword argument 'maps_as_pydicts'
>
> pyarrow/array.pxi:1677: TypeError
> - generated xml file: /home/runner/work/python-db-dtypes-pandas/python-db-dtypes-pandas/unit_prerelease_3.12_sponge_log.xml -
> =========================== short test summary info ============================
> FAILED tests/unit/test_json.py::test_json_arrow_to_pylist - TypeError: JSONArrowScalar.as_py() got an unexpected keyword argument 'maps_as_pydicts'
> FAILED tests/unit/test_json.py::test_json_arrow_record_batch - TypeError: JSONArrowScalar.as_py() got an unexpected keyword argument 'maps_as_pydicts'
> 2 failed, 298 passed in 1.81s
> ```
>
> (Link: https://github.com/googleapis/python-db-dtypes-pandas/actions/runs/13017353125/job/37736481135?pr=310)

We (the Ray Data team) are also hitting this backward-compatibility issue in our tests against the pyarrow nightly build, with the same error as above:

```python
[2025-02-25T06:27:20Z] =================================== FAILURES ===================================
[2025-02-25T06:27:20Z] ____________ test_convert_to_pyarrow_array_object_ext_type_fallback ____________
[2025-02-25T06:27:20Z]
[2025-02-25T06:27:20Z]     def test_convert_to_pyarrow_array_object_ext_type_fallback():
[2025-02-25T06:27:20Z]         column_values = create_ragged_ndarray(
[2025-02-25T06:27:20Z]             [
[2025-02-25T06:27:20Z]                 "hi",
[2025-02-25T06:27:20Z]                 1,
[2025-02-25T06:27:20Z]                 None,
[2025-02-25T06:27:20Z]                 [[[[]]]],
[2025-02-25T06:27:20Z]                 {"a": [[{"b": 2, "c": UserObj(i=123)}]]},
[2025-02-25T06:27:20Z]                 UserObj(i=456),
[2025-02-25T06:27:20Z]             ]
[2025-02-25T06:27:20Z]         )
[2025-02-25T06:27:20Z]         column_name = "py_object_column"
[2025-02-25T06:27:20Z]
[2025-02-25T06:27:20Z]         # First, assert that straightforward conversion into Arrow native types fails
[2025-02-25T06:27:20Z]         with pytest.raises(ArrowConversionError) as exc_info:
[2025-02-25T06:27:20Z]             _convert_to_pyarrow_native_array(column_values, column_name)
[2025-02-25T06:27:20Z]
[2025-02-25T06:27:20Z]         assert (
[2025-02-25T06:27:20Z]             str(exc_info.value)
[2025-02-25T06:27:20Z]             == "Error converting data to Arrow: ['hi' 1 None list([[[[]]]]) {'a': [[{'b': 2, 'c': UserObj(i=123)}]]}\n UserObj(i=456)]"  # noqa: E501
[2025-02-25T06:27:20Z]         )
[2025-02-25T06:27:20Z]
[2025-02-25T06:27:20Z]         # Subsequently, assert that fallback to `ArrowObjectExtensionType` succeeds
[2025-02-25T06:27:20Z]         pa_array = convert_to_pyarrow_array(column_values, column_name)
[2025-02-25T06:27:20Z]
[2025-02-25T06:27:20Z] >       assert pa_array.to_pylist() == column_values.tolist()
[2025-02-25T06:27:20Z]
[2025-02-25T06:27:20Z] python/ray/air/tests/test_arrow.py:121:
[2025-02-25T06:27:20Z] _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
[2025-02-25T06:27:20Z]
[2025-02-25T06:27:20Z] >   ???
[2025-02-25T06:27:20Z] E   TypeError: as_py() got an unexpected keyword argument 'maps_as_pydicts'
```
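
For anyone else hitting this, here is a minimal sketch of the failure mode. It uses plain-Python stand-ins rather than pyarrow itself: `LegacyScalar`, `TolerantScalar`, and the free function `to_pylist` are hypothetical names for illustration, not pyarrow APIs. The assumption, based on the tracebacks above, is that the new `to_pylist()` forwards `maps_as_pydicts` as a keyword to each scalar's `as_py()`, which breaks any override that still has the old zero-argument signature.

```python
class LegacyScalar:
    """Hypothetical stand-in for a user-defined extension scalar
    written before `maps_as_pydicts` existed."""

    def __init__(self, value):
        self.value = value

    def as_py(self):
        # Old-style override: accepts no keyword arguments.
        return self.value


class TolerantScalar(LegacyScalar):
    """Same stand-in, but its override tolerates extra keyword options."""

    def as_py(self, **kwargs):
        # Accept and ignore options such as maps_as_pydicts, so the
        # override works whether or not the caller passes them.
        return self.value


def to_pylist(scalars, maps_as_pydicts=None):
    """Stand-in for the assumed new caller behaviour: forward the
    option to every scalar's as_py()."""
    return [s.as_py(maps_as_pydicts=maps_as_pydicts) for s in scalars]


# The legacy override fails once the caller starts passing the keyword.
try:
    to_pylist([LegacyScalar('{"a":1}')])
except TypeError as exc:
    legacy_error = str(exc)
else:
    legacy_error = None

# The tolerant override keeps working.
tolerant_result = to_pylist([TolerantScalar('{"a":1}')])

print(legacy_error)
print(tolerant_result)
```

Accepting `**kwargs` (or an explicit keyword-only `maps_as_pydicts=None`) in downstream overrides is one possible workaround on the user side. It does not address the compatibility break for extension types that are already released.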