rishav394 opened a new pull request, #4366: URL: https://github.com/apache/arrow-adbc/pull/4366
## What

Use-after-free in `_reader.pyx` causes a SIGSEGV when `pyarrow.RecordBatchReader._import_from_c` rejects the stream's schema (e.g. an unsupported format string such as Decimal32/64 on PyArrow < 15).

## Root cause

`_import_from_c` shallow-copies the `ArrowArrayStream` and passes the original to PyArrow. On failure, PyArrow calls `release()` on the stream (per the Arrow C Data Interface spec), setting `release = NULL`. Then `check_error(e)` dereferences the now-freed stream through the shallow copy, triggering a segfault.

## Fix

After `_import_from_c` raises, check whether `c_stream.release == NULL`. If so, PyArrow has already released the stream, so re-raise the original exception directly instead of calling `check_error` on dangling memory.

## Reproduction

Minimal: any ADBC driver returning a schema with a format string unsupported by the consumer's PyArrow version (e.g. Decimal32/64 from arrow-go based drivers consumed by PyArrow < 15).

```python
# Trino + adbc-driver-flightsql or adbc-driver-trino
# SELECT CAST(10.1 AS DECIMAL(10,4))
# -> segfault on PyArrow 11-14
```

## Test

Added `test_import_invalid_format_raises`, which poisons an exported stream's child format with an invalid string and verifies that `ArrowInvalid` is raised (not a crash).
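For reviewers who want the control flow of the fix in isolation, here is a rough pure-Python model of the guard described under "Fix". It is only a sketch, not the Cython code in `_reader.pyx`. `FakeStream`, `SchemaRejected`, `consumer_import`, and `import_guarded` are hypothetical stand-ins for the `ArrowArrayStream`, `ArrowInvalid`, PyArrow's import, and the patched wrapper. A Python exception stands in for what would be a read of freed memory in C.

```python
class FakeStream:
    """Hypothetical stand-in for an ArrowArrayStream; `release` is None once released."""

    def __init__(self):
        self.release = self._do_release
        self.last_error = "driver-side detail"

    def _do_release(self):
        # Releasing marks the structure as released (release == NULL in C).
        self.release = None
        self.last_error = None

    def get_last_error(self):
        if self.release is None:
            # In C this would touch freed memory, i.e. the SIGSEGV.
            raise RuntimeError("use-after-release")
        return self.last_error


class SchemaRejected(Exception):
    """Hypothetical stand-in for pyarrow.ArrowInvalid."""


def consumer_import(stream, fmt):
    """Hypothetical consumer that releases the stream when it rejects the schema."""
    if fmt not in {"i", "l", "u"}:  # int32, int64, utf8 format strings
        stream.release()
        raise SchemaRejected(f"unsupported format string: {fmt!r}")
    return stream


def import_guarded(stream, fmt):
    """Model of the fix: only query the stream for details if it is still alive."""
    try:
        return consumer_import(stream, fmt)
    except SchemaRejected as exc:
        if stream.release is None:
            # The consumer already released the stream: re-raise unchanged.
            raise
        # Stream still valid: safe to enrich the error with driver detail.
        raise SchemaRejected(f"{exc} ({stream.get_last_error()})") from exc
```

Calling `import_guarded(FakeStream(), "d:10,4,32")` in this model raises `SchemaRejected` without ever calling `get_last_error()` on the released stream, which is the behavior the new test checks for in the real code.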
