jorisvandenbossche commented on PR #702:
URL: https://github.com/apache/arrow-adbc/pull/702#issuecomment-1582073523
The last commit I pushed contains an implementation of "wrap stream in a new
stream keeping reference to python object" (as far as I understood, largely
based on Dewey's example from nanoarrow-r).
What it can do is avoiding a segfault from trying to consume the stream
outside of the connection/cursor context (or after closing the cursor and
connection):
```python
import adbc_driver_sqlite.dbapi
import adbc_driver_manager
with adbc_driver_sqlite.dbapi.connect("test_db.sqlite") as conn:
with conn.cursor() as cursor:
# this still needs to be integrated in the DBAPI, eg as `capsule =
cursor.fetch_arrow_stream()`
cursor._prepare_execute("SELECT * FROM test_arrow")
capsule, _ = cursor._stmt.execute_query()
# without the next line (i.e. using the "raw" stream), this snippet
segfaults
capsule = adbc_driver_manager._lib.export_array_stream(capsule,
cursor._stmt)
import pyarrow as pa
table = pa.RecordBatchReader._import_from_c_capsule(capsule).read_all()
```
Without the "export array stream" with the wrapper, this example segfaults
(because it tries to get the schema and next batch of a stream that references
a closed transaction). With exporting the stream as a wrapped stream, the
example raises a proper python exception about trying to consume the stream
from a closed transaction.
Now, that exception only happens because in the wrapped ArrowArrayStream
methods, I added a check for a closed transaction. But I do wonder if that is
not something that the actual driver implementation could also do in their
implementation of get_schema/get_next (I don't know the driver's
implementations well enough, though), so I don't have to do this here in the
wrapper.
The wrapper implementation that I now added adds quite some complexity
(although it is nice that we can avoid the python snippet causing a segfault),
but wondering if we can't say it is the responsibility of the driver to return
an error in case of a closed transaction.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]