rishav394 opened a new issue, #4363:
URL: https://github.com/apache/arrow-adbc/issues/4363

   ## What happened?
   
   `fetchallarrow()` segfaults (exit 139) when the driver produces an 
`ArrowArrayStream` containing types the installed PyArrow doesn't recognize. No 
exception, no error message - just a process kill.
   
   This isn't specific to one type. Any time the Arrow spec adds a new type and 
a driver (built on a newer arrow-go/arrow-rs) exports it, older PyArrow 
consumers will crash at `_import_from_c`. The C Data Interface is meant for 
cross-version interop, so an unrecognized type should be a recoverable error, 
not a segfault.
   
   Concrete trigger: any `driverbase-go` driver (Trino, BigQuery, Redshift, 
MySQL, etc.) returning a DECIMAL column. `driverbase-go` uses 
`NarrowestDecimalType()`, which picks Decimal64 for a column such as 
DECIMAL(10,4) (format `d:10,4,64`). PyArrow < 15 doesn't recognize this format 
and crashes.
   
   Expected: `NotImplementedError` with a message like "Unsupported format 
string 'd:10,4,64'. Upgrade PyArrow to >= 15.0.0."
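
A minimal sketch of the kind of check that could produce this error, assuming the consumer can inspect a field's format string before importing it. The function name `check_decimal_format` and the `SUPPORTED_DECIMAL_BITWIDTHS` set are hypothetical and are not part of adbc_driver_manager or PyArrow. Only the decimal format-string layout (`d:P,S` for Decimal128, `d:P,S,N` for an explicit bit width) comes from the Arrow C Data Interface.

```python
import re

# Decimal format strings in the Arrow C Data Interface:
#   "d:P,S"    -> decimal128 with precision P and scale S
#   "d:P,S,N"  -> decimal with an explicit bit width N
_DECIMAL_FORMAT = re.compile(r"^d:(\d+),(-?\d+)(?:,(\d+))?$")

# Assumption: the bit widths the installed consumer can import.
SUPPORTED_DECIMAL_BITWIDTHS = frozenset({128, 256})


def check_decimal_format(fmt, supported=SUPPORTED_DECIMAL_BITWIDTHS):
    """Return (precision, scale, bitwidth) or raise for an unsupported width."""
    match = _DECIMAL_FORMAT.match(fmt)
    if match is None:
        raise ValueError(f"Not a decimal format string: {fmt!r}")
    precision, scale = int(match.group(1)), int(match.group(2))
    # A missing third field means the default width of 128 bits.
    bitwidth = int(match.group(3)) if match.group(3) else 128
    if bitwidth not in supported:
        raise NotImplementedError(
            f"Unsupported format string {fmt!r}. Upgrade PyArrow."
        )
    return precision, scale, bitwidth
```

With these assumptions, `check_decimal_format("d:10,4,64")` raises `NotImplementedError` instead of letting the import proceed, while `check_decimal_format("d:19,10")` passes.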
   
   ## Stack Trace
   
   ```
   Fatal Python error: Segmentation fault
   
   Current thread 0x00000001f064df00 (most recent call first):
     File ".../adbc_driver_manager/_reader.pyx", line 65 in _import_from_c
     File ".../adbc_driver_manager/dbapi.py", line 1346 in fetch_arrow_table
     File ".../adbc_driver_manager/dbapi.py", line 1179 in fetch_arrow_table
     File ".../adbc_driver_manager/dbapi.py", line 1162 in fetchallarrow
   ```
   
   Crash at `_reader.pyx:65`:
   ```python
   reader = pyarrow.RecordBatchReader._import_from_c(int(address))
   ```
   
   ## How can we reproduce the bug?
   
   No tables or data are needed. Any Trino instance (or any driverbase-go driver) will do:
   
   ```python
   import faulthandler
   import adbc_driver_manager.dbapi as adbc_manager
   
   faulthandler.enable()
   
   conn = adbc_manager.connect(
       driver="trino",
       db_kwargs={"uri": "https://user:pass@trino-host:443/catalog/schema"},
   )
   cur = conn.cursor()
   cur.execute("SELECT CAST(10.1 AS DECIMAL(10,4)) AS val")
   cur.fetchallarrow()  # SIGSEGV
   ```
   
   Not Decimal-specific. Any unknown format string in the ArrowSchema will 
trigger the same crash. Decimal32/64 is just the most common real-world trigger 
today.
   
   Workarounds:
   - Upgrade PyArrow to >= 15.0.0
   - `CAST(col AS DECIMAL(19, scale))` - precision 19 exceeds Decimal64's 
18-digit maximum, so this forces Decimal128 (universally supported)
   - `CAST(col AS DOUBLE)`
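
To illustrate why the `DECIMAL(19, scale)` workaround avoids the crash, here is a rough Python sketch of narrowest-width selection. It is an illustration only and is not the arrow-go `NarrowestDecimalType()` code. The function name `narrowest_decimal_bitwidth` is hypothetical. The per-width precision limits (9, 18, 38, 76 digits) are the standard maxima for Decimal32/64/128/256.

```python
def narrowest_decimal_bitwidth(precision):
    """Return the smallest decimal bit width that can hold `precision` digits."""
    if precision < 1:
        raise ValueError("precision must be at least 1")
    # (bit width, maximum number of decimal digits it can represent)
    limits = ((32, 9), (64, 18), (128, 38), (256, 76))
    for bitwidth, max_precision in limits:
        if precision <= max_precision:
            return bitwidth
    raise ValueError(f"precision {precision} exceeds the Decimal256 maximum of 76")
```

Under this sketch, precision 10 (as in the reproduction query) maps to 64 bits, while precision 19 maps to 128 bits.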
   
   ## Environment/Setup
   
   - adbc-driver-manager: 1.8.0 (pip)
   - PyArrow: 11.0.0 through 14.0.2 (crashes), 15.0.0+ (works)
   - Driver: adbc-driver-trino 0.3.1 via `dbc install trino`
   - driverbase-go: v0.0.0-20260423045143 (uses arrow-go v18.6.0)
   - Platform: macOS arm64, Python 3.9
   - Package manager: pip
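
Until the import path fails gracefully, an application could guard against the affected versions itself. The sketch below assumes the 15.0.0 threshold observed in this report. The helper `require_min_major` is hypothetical, and a caller would pass it `pyarrow.__version__`.

```python
def require_min_major(version, minimum_major=15):
    """Raise a readable error if `version` is older than `minimum_major`."""
    # Version strings look like "14.0.2"; only the major component is compared.
    major = int(version.split(".", 1)[0])
    if major < minimum_major:
        raise RuntimeError(
            f"PyArrow {version} may crash on Decimal32/64 results; "
            f"upgrade to >= {minimum_major}.0.0"
        )
    return major


# Usage (assumes pyarrow is installed):
#   import pyarrow
#   require_min_major(pyarrow.__version__)
```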


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.