Igosuki opened a new issue #925:
URL: https://github.com/apache/arrow-datafusion/issues/925


   **Is your feature request related to a problem or challenge? Please describe 
what you are trying to do.**
   Arrow's Python and R packages already support reading from alternative 
file systems such as S3 (not necessarily limited to AWS).
   
   **Describe the solution you'd like**
   Given a URI, DataFusion could use the URI scheme to determine the proper 
FileSystem implementation for reading the file, with 'file://' mapping to 
LocalFileSystem as the default.
   Optionally, file caching could be added, which would help limit the cost 
of repeated remote reads.
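
   As a rough illustration of the scheme-based dispatch described above, here 
is a minimal sketch in Rust. The names `FileSystemKind` and 
`resolve_file_system` are hypothetical and are not part of DataFusion's API; 
the sketch only classifies a URI and assumes a bare path with no scheme should 
be treated as local.

```rust
use std::path::PathBuf;

/// Hypothetical description of which backend should serve a URI.
/// This is an illustration only, not a DataFusion type.
#[derive(Debug, PartialEq)]
enum FileSystemKind {
    /// Local disk, the default when the scheme is `file` or absent.
    Local(PathBuf),
    /// An S3-compatible object store (not necessarily AWS).
    S3 { bucket: String, key: String },
}

/// Pick a backend from the URI scheme (assumed behavior, for illustration).
fn resolve_file_system(uri: &str) -> Result<FileSystemKind, String> {
    match uri.split_once("://") {
        // No scheme at all: treat the whole string as a local path.
        None => Ok(FileSystemKind::Local(PathBuf::from(uri))),
        // `file:///tmp/x` splits into ("file", "/tmp/x").
        Some(("file", path)) => Ok(FileSystemKind::Local(PathBuf::from(path))),
        // `s3://bucket/some/key` splits into a bucket and an object key.
        Some(("s3", rest)) => match rest.split_once('/') {
            Some((bucket, key)) if !bucket.is_empty() && !key.is_empty() => {
                Ok(FileSystemKind::S3 {
                    bucket: bucket.to_string(),
                    key: key.to_string(),
                })
            }
            _ => Err(format!("expected s3://<bucket>/<key>, got {}", uri)),
        },
        // Any other scheme is rejected rather than guessed at.
        Some((scheme, _)) => Err(format!("unsupported scheme: {}", scheme)),
    }
}
```

   With this sketch, `resolve_file_system("file:///tmp/data.parquet")` and a 
bare `data.csv` would both resolve to the local variant, while an unknown 
scheme returns an error instead of silently falling back to local disk.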
   
   **Describe alternatives you've considered**
   Copying files locally with a tool like rclone, or with a provider CLI such 
as the GCP or AWS CLI, before working on them.
   
   **Additional context**
   This would lower the barrier to entry for trying out arrow-datafusion. 
DataFusion could then also be used to manage a partition cache and prune 
unused data in tables.
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]
