[I] feat(api): decouple reading partitions from AdbcConnection [arrow-adbc]

via GitHub Thu, 08 Feb 2024 23:12:24 -0800


jduo opened a new issue, #1537:
URL: https://github.com/apache/arrow-adbc/issues/1537


   ### What feature or improvement would you like to see?
   
   Currently, to read data from a partition (using Java as an example):
   1. The user must call AdbcStatement.executePartitioned() to get a PartitionedResult containing a list of PartitionDescriptors.
   2. For each PartitionDescriptor, the user must call AdbcConnection.readPartitioned().
   
   If the user intends to distribute the work across separate processes (or nodes in a distributed system), each process must create an AdbcDatabase, build up connection options, and then connect to a node to create a full-fledged AdbcConnection. This could be costly: for example, it may require re-running the auth workflow even though an auth token was already generated, or creating a new session object instead of reusing the session already established.
   
   One idea would be to have AdbcDatabase let the caller construct an AdbcPartitionedReader directly from a PartitionDescriptor instead of requiring the full connection process. The driver implementation can bake into the descriptor all the details of how to connect to and use the node holding the partition (including stateful information such as auth access tokens or session identifiers), so the reader can skip the potentially heavyweight connection process.
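
A minimal sketch of what the proposed shape might look like, assuming the driver packs everything a worker needs into the descriptor. All types here are hypothetical stand-ins defined locally for illustration, not the real ADBC classes: `openPartition`, `readRows`, `ToyDatabase`, and the `endpoint|token|partitionId` encoding are invented for this example, and a real driver would return Arrow data rather than strings.

```java
import java.nio.charset.StandardCharsets;
import java.util.List;

public class PartitionReaderSketch {

    /** Hypothetical descriptor: opaque bytes that only the driver knows how to decode. */
    static final class PartitionDescriptor {
        final byte[] bytes;

        PartitionDescriptor(byte[] bytes) {
            this.bytes = bytes.clone();
        }
    }

    /** Hypothetical reader from the proposal; not an existing ADBC type. */
    interface AdbcPartitionedReader extends AutoCloseable {
        List<String> readRows();

        @Override
        void close();
    }

    /** Hypothetical database interface with the proposed factory method. */
    interface AdbcDatabase {
        AdbcPartitionedReader openPartition(PartitionDescriptor descriptor);
    }

    /** Toy driver that packs "endpoint|token|partitionId" into the descriptor (assumed encoding). */
    static final class ToyDatabase implements AdbcDatabase {

        /** Coordinator side: runs once, after auth and session setup are already done. */
        static PartitionDescriptor describe(String endpoint, String token, int partitionId) {
            String packed = endpoint + "|" + token + "|" + partitionId;
            return new PartitionDescriptor(packed.getBytes(StandardCharsets.UTF_8));
        }

        /** Worker side: no AdbcConnection is created; the state comes from the descriptor. */
        @Override
        public AdbcPartitionedReader openPartition(PartitionDescriptor descriptor) {
            String[] parts = new String(descriptor.bytes, StandardCharsets.UTF_8).split("\\|");
            String endpoint = parts[0];
            String token = parts[1]; // a real driver would present this token to the node
            int partitionId = Integer.parseInt(parts[2]);
            return new AdbcPartitionedReader() {
                @Override
                public List<String> readRows() {
                    // Placeholder for a network read against `endpoint` using `token`.
                    return List.of("partition " + partitionId + " from " + endpoint);
                }

                @Override
                public void close() {
                    // Nothing to release in this toy implementation.
                }
            };
        }
    }

    public static void main(String[] args) {
        PartitionDescriptor descriptor = ToyDatabase.describe("node-1:1234", "tok", 0);
        try (AdbcPartitionedReader reader = new ToyDatabase().openPartition(descriptor)) {
            System.out.println(reader.readRows());
        }
    }
}
```

In this sketch the coordinator calls `describe` once and ships the descriptor bytes to each worker, which then calls `openPartition` without repeating the auth or session setup. One trade-off to weigh in a real design is that a descriptor carrying a token becomes sensitive data and would need to be transported accordingly.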


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]
