westonpace commented on issue #33986:
URL: https://github.com/apache/arrow/issues/33986#issuecomment-1416351103

   From an implementation perspective I suspect we can satisfy any of these 
proposed APIs.  If we need to come up with a new API then my preference is 
Substrait, but if the consensus heads in some other direction I'm fine with 
that too.
   
   @jorisvandenbossche, your proposal seems fine, but I don't see anything in 
there for filesystems.  I think this issue is more about on-disk data than 
purely in-memory data, though I believe your approach could be adapted to 
include filesystems.
   
   > Does object store rs work for this?
   
   Yes, I would assume that object store rs would be able to satisfy this, but 
I'm not familiar with its capabilities.  For example, my idea of how this would 
work in Substrait would be:
   
   ```
   // This would be usable as ReadRel::read_type
   message Dataset {
   
     // This is already defined in ReadRel and is basically a list of files
     // and a format object which defines things like delimiter (for CSV)
     LocalFiles files = 1;
     // Field names and numbers below are placeholders
     oneof filesystem {
       LocalFilesystem local = 2;
       S3Filesystem s3 = 3;
       ExtensionFilesystem extension = 4;
     }
   
     message LocalFilesystem {}
     message S3Filesystem {
       string region = 1;
       string client_id = 2;
       // could be omitted if credentials negotiated elsewhere
       string client_secret = 3;
     }
     message ExtensionFilesystem {
       google.protobuf.Any details = 1;
     }
   
   }
   ```
   
   The equivalent C interface would just be structifying those messages.
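   
   As a rough illustration of what "structifying" could look like, here is a 
minimal sketch.  Every name in it (`FilesystemKind`, `struct Dataset`, 
`make_s3_dataset`, the `file_uris` stand-in for `LocalFiles`) is hypothetical 
and not part of any existing Arrow or Substrait header; the `oneof` is 
modelled as a tagged union, which is only one of several possible layouts.
   
   ```c
   #include <stddef.h>
   
   /* Hypothetical sketch: none of these names exist in Arrow or Substrait. */
   
   /* Tag that plays the role of the protobuf `oneof filesystem`. */
   enum FilesystemKind {
     FILESYSTEM_LOCAL = 1,
     FILESYSTEM_S3 = 2,
     FILESYSTEM_EXTENSION = 3
   };
   
   struct S3Filesystem {
     const char* region;
     const char* client_id;
     const char* client_secret; /* NULL if credentials are negotiated elsewhere */
   };
   
   /* Opaque payload, mirroring the type_url/value pair of google.protobuf.Any. */
   struct ExtensionFilesystem {
     const char* type_url;
     const unsigned char* value;
     size_t value_len;
   };
   
   struct Dataset {
     /* Assumed stand-in for LocalFiles: a plain list of file URIs. */
     const char* const* file_uris;
     size_t n_files;
   
     /* LocalFilesystem carries no fields, so it needs no union member. */
     enum FilesystemKind kind;
     union {
       struct S3Filesystem s3;
       struct ExtensionFilesystem extension;
     } filesystem;
   };
   
   /* Build a Dataset that reads the given files from S3 in `region`,
    * leaving credentials unset. The caller keeps ownership of all pointers. */
   static struct Dataset make_s3_dataset(const char* const* file_uris,
                                         size_t n_files, const char* region) {
     struct Dataset d;
     d.file_uris = file_uris;
     d.n_files = n_files;
     d.kind = FILESYSTEM_S3;
     d.filesystem.s3.region = region;
     d.filesystem.s3.client_id = NULL;
     d.filesystem.s3.client_secret = NULL;
     return d;
   }
   ```
   
   A consumer would switch on `kind` before touching the union, in the same 
way a protobuf reader checks which `oneof` member is set.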

