lupko opened a new issue, #1523:
URL: https://github.com/apache/arrow-adbc/issues/1523

   Hello,
   
   I'm trying to integrate ADBC into a Flight RPC service. The typical use 
case is that a call (GetFlightInfo) performs a query and the client then comes 
to pick up the stream of results via DoGet. So on GetFlightInfo the service 
creates a new cursor, executes the query, and sends back an appropriate 
FlightInfo so that a DoGet with the right ticket will pick up the data. For 
this, the code calls `cursor.fetch_record_batch()`. The resulting 
RecordBatchReader is then wrapped in a `pyarrow.flight.RecordBatchStream` and 
returned.
   
   Doing this crashes the entire server with SIGSEGV. You can find the 
reproducer in this gist: 
https://gist.github.com/lupko/8b6f165a6574ef830c531c8056b20957. The reproducer 
skips the GetFlightInfo step for the sake of brevity.
   
   Poking around the code of the ADBC Python wrappers, I _think_ this crash 
happens because `AdbcRecordBatchReader` is not ready for interop with PyArrow.
   
   PyArrow's `RecordBatchReader` (`pyarrow.lib.RecordBatchReader`) has a 
`reader` field that holds the actual C++ RecordBatchReader. PyArrow code 
usually grabs that `reader` as soon as possible and uses it for various 
purposes (like getting batches to send out via Flight RPC).
   
   `AdbcRecordBatchReader` does not extend `pyarrow.lib.RecordBatchReader`, 
so `reader` is just not there and everything comes crashing down.
   
   


