Tom-Newton commented on issue #18014:
URL: https://github.com/apache/arrow/issues/18014#issuecomment-1700666053

   I think we're ready to start implementing the filesystem itself. Looking at 
how [GCS was 
done](https://github.com/apache/arrow/commits/main?before=9b6be29f431705ce1f85cc218c66d4d03698f06b+35&branch=main&path%5B%5D=cpp&path%5B%5D=src&path%5B%5D=arrow&path%5B%5D=filesystem&path%5B%5D=gcsfs.cc&qualified_name=refs%2Fheads%2Fmain)
 the next part was an implementation of `arrow::io::InputStream` and 
`OpenInputStream`. 
   
   However, it looks like `arrow::io::RandomAccessFile` is a superset of 
`arrow::io::InputStream`, so I think it makes sense to just implement 
`arrow::io::RandomAccessFile`. This is what 
https://github.com/apache/arrow/pull/12914 did. 
   
   I would propose we move forward with implementing 
`arrow::io::RandomAccessFile`, `OpenInputStream` and `OpenInputFile` as in 
https://github.com/apache/arrow/pull/12914. One change I would suggest is to 
avoid depending on whether the storage account has hierarchical namespace 
enabled. Hierarchical namespace matters if you want to make listing and 
renames faster, but I don't think it should matter for blob reads, and 
depending on it adds complexity. 
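
   To illustrate why a single implementation could be enough, here is a rough, 
self-contained sketch. The `InputStream` and `RandomAccessFile` classes below 
are simplified stand-ins that I made up for illustration, not the real Arrow 
declarations (those return `Result<>`/`Status` and have more methods). 
`BlobFile` is a hypothetical name, and an in-memory string stands in for the 
blob, so nothing here talks to Azure. 

```cpp
#include <algorithm>
#include <cstdint>
#include <cstring>
#include <memory>
#include <string>
#include <utility>

// Simplified stand-in for arrow::io::InputStream (assumption: the real
// interface reports errors via Result<>/Status; plain return values here).
class InputStream {
 public:
  virtual ~InputStream() = default;
  // Reads up to nbytes from the current position; returns bytes actually read.
  virtual int64_t Read(int64_t nbytes, void* out) = 0;
  virtual int64_t Tell() const = 0;
};

// Simplified stand-in for arrow::io::RandomAccessFile: everything an
// InputStream offers, plus seeking, size, and positional reads.
class RandomAccessFile : public InputStream {
 public:
  virtual void Seek(int64_t position) = 0;
  virtual int64_t GetSize() const = 0;
  // Reads at an explicit offset without moving the stream position.
  virtual int64_t ReadAt(int64_t position, int64_t nbytes, void* out) = 0;
};

// Hypothetical blob reader. The in-memory string is a placeholder for
// ranged reads against blob storage.
class BlobFile : public RandomAccessFile {
 public:
  explicit BlobFile(std::string content) : content_(std::move(content)) {}

  int64_t GetSize() const override {
    return static_cast<int64_t>(content_.size());
  }
  int64_t Tell() const override { return pos_; }
  void Seek(int64_t position) override {
    // Clamp the requested position into [0, size].
    pos_ = std::max<int64_t>(0, std::min<int64_t>(position, GetSize()));
  }

  int64_t ReadAt(int64_t position, int64_t nbytes, void* out) override {
    if (position < 0 || position >= GetSize() || nbytes <= 0) return 0;
    const int64_t n = std::min<int64_t>(nbytes, GetSize() - position);
    std::memcpy(out, content_.data() + position, static_cast<size_t>(n));
    return n;
  }

  // The sequential read is just a positional read that advances the cursor.
  int64_t Read(int64_t nbytes, void* out) override {
    const int64_t n = ReadAt(pos_, nbytes, out);
    pos_ += n;
    return n;
  }

 private:
  std::string content_;
  int64_t pos_ = 0;
};

// Both entry points share one implementation. The real filesystem methods
// would take a path; passing the content directly is a simplification.
std::shared_ptr<RandomAccessFile> OpenInputFile(const std::string& content) {
  return std::make_shared<BlobFile>(content);
}

std::shared_ptr<InputStream> OpenInputStream(const std::string& content) {
  return OpenInputFile(content);  // implicit upcast to the narrower interface
}
```

   The point of the sketch is only that `Read` falls out of `ReadAt` plus a 
cursor, so `OpenInputStream` can reuse whatever `OpenInputFile` returns. None 
of it needs to know whether hierarchical namespace is enabled. 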


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]
