[GitHub] [arrow-adbc] prmoore77 opened a new issue, #968: Please provide `fetch_arrow_reader` method to ADBC Cursor class in DBAPI

via GitHub Mon, 07 Aug 2023 06:15:37 -0700


prmoore77 opened a new issue, #968:
URL: https://github.com/apache/arrow-adbc/issues/968


   Please provide a way to access the batch reader from an ADBC DBAPI cursor 
that doesn't require using an underscore method/attribute.
   
   Giving more direct access to the batch reader will allow folks to write out 
batches of records fetched from Flight SQL (or other ADBC sources), requiring 
less memory - as they don't have to first fetch the entire result set into 
memory.
   
   Currently, accessing the batch reader from a Flight SQL server through the 
Python ADBC Flight SQL driver (and DBAPI) requires a private underscore 
attribute (I believe), as demonstrated by this Python code:
   
   ```
   import os

   import adbc_driver_flightsql.dbapi as flight_sql
   import pyarrow.parquet as pq


   def main():
       with flight_sql.connect(
           uri="grpc+tls://localhost:31337",
           db_kwargs={
               "username": "flight_username",
               "password": os.environ["FLIGHT_PASSWORD"],
               "adbc.flight.sql.client_option.tls_skip_verify": "true",
           },
       ) as conn:
           with conn.cursor() as cur:
               cur.execute(operation="SELECT * FROM orders")
               # We have to use an underscore attribute here...
               reader = cur._results._reader
               total_rows: int = 0
               # The context manager closes the writer, which writes the Parquet footer.
               with pq.ParquetWriter(where="orders.parquet", schema=reader.schema) as writer:
                   # Stream one batch at a time instead of loading the full result set.
                   for batch in reader:
                       writer.write_batch(batch=batch)
                       total_rows += batch.num_rows
                       print(
                           f"Wrote batch of {batch.num_rows:,d} row(s) - "
                           f"total row(s) written thus far: {total_rows:,d}"
                       )

               print(f"Total number of rows written: {total_rows:,d}")


   if __name__ == "__main__":
       main()
   ```
   
   Please add a method called something like `fetch_arrow_reader` to 
`dbapi.Cursor` to allow a more direct way to get the batch reader.
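
To make the request concrete, here is a rough sketch of what such an accessor might do, written as a standalone helper rather than a `Cursor` method. The name `fetch_arrow_reader` is the one proposed in this issue and is hypothetical. The helper assumes the private `_results` and `_reader` attributes used in the example above, which are not a public API and may change.

```python
from typing import Any


def fetch_arrow_reader(cursor: Any) -> Any:
    """Hypothetical helper: return the batch reader behind an executed cursor.

    Assumes the private `_results` / `_reader` attributes shown in the
    example above; these are implementation details, not a documented API.
    """
    results = getattr(cursor, "_results", None)
    if results is None:
        raise RuntimeError("no result set available; call cursor.execute() first")
    reader = getattr(results, "_reader", None)
    if reader is None:
        raise RuntimeError("result set does not expose a batch reader")
    return reader
```

With a helper like this, the example above would call `reader = fetch_arrow_reader(cur)` instead of reaching into `cur._results._reader` directly. A real method on `dbapi.Cursor` would keep that detail inside the library.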
   
   Thank you. 


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]
