alamb commented on issue #2445: URL: https://github.com/apache/arrow-datafusion/issues/2445#issuecomment-1119781793
> Are there examples of ObjectStore implementations?

The canonical example of `ObjectStore` is AWS's S3: https://aws.amazon.com/s3/ and then there are many distributed storage systems that present a similar interface, as @tustvold describes in https://github.com/apache/arrow-datafusion/issues/2445#issuecomment-1119739041

The idea of the "ObjectStore" interface in DataFusion was to provide API access to the lowest common denominator feature set across several storage implementations. For example, here are three implementations for S3, HDFS, and Azure specifically:

* https://github.com/datafusion-contrib/datafusion-objectstore-s3
* https://github.com/datafusion-contrib/datafusion-objectstore-hdfs
* https://github.com/datafusion-contrib/datafusion-objectstore-azure

In terms of "glob"ing, that is typically not a feature provided by object stores (e.g. there is no such thing in S3, which instead offers a much more restricted notion of `prefix`es). Thus, it seems to me if we want to support globbing for DataFusion when running on local files, it will have to be a special case somehow.

You can see another example of a Rust API to object storage in IOx: https://github.com/influxdata/influxdb_iox/blob/main/object_store
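To make the `prefix` versus glob distinction concrete, here is a rough sketch of how globbing could be layered on top of a store that only lists by prefix: send the literal part of the pattern as the prefix, then filter the returned keys on the client. Everything here is an assumption for illustration. `literal_prefix`, `list_with_prefix`, `glob_match`, and `glob_keys` are hypothetical helpers, not part of DataFusion's `ObjectStore` trait or of any S3 SDK, and a plain slice of keys stands in for the store.

```rust
/// Longest leading part of `pattern` that contains no wildcard.
/// This is the only part a prefix-only store could be asked to filter on.
fn literal_prefix(pattern: &str) -> &str {
    let end = pattern
        .find(|c: char| c == '*' || c == '?')
        .unwrap_or(pattern.len());
    &pattern[..end]
}

/// Stand-in for a store's "list by prefix" call (hypothetical; a real
/// store would make a network request here instead of scanning a slice).
fn list_with_prefix<'a>(keys: &[&'a str], prefix: &str) -> Vec<&'a str> {
    keys.iter().copied().filter(|k| k.starts_with(prefix)).collect()
}

/// Simplified glob match: `*` matches any run of characters and `?`
/// matches exactly one. Unlike shell globs, `*` here also crosses `/`.
fn glob_match(pattern: &str, text: &str) -> bool {
    let p: Vec<char> = pattern.chars().collect();
    let t: Vec<char> = text.chars().collect();
    let (mut pi, mut ti) = (0, 0);
    // (pattern index just after the last `*`, text index it is matched from)
    let mut star: Option<(usize, usize)> = None;
    while ti < t.len() {
        if pi < p.len() && p[pi] == '*' {
            star = Some((pi + 1, ti));
            pi += 1;
        } else if pi < p.len() && (p[pi] == '?' || p[pi] == t[ti]) {
            pi += 1;
            ti += 1;
        } else if let Some((sp, st)) = star {
            // Backtrack: let the last `*` absorb one more character.
            pi = sp;
            ti = st + 1;
            star = Some((sp, st + 1));
        } else {
            return false;
        }
    }
    // Any trailing `*` can match the empty string.
    while pi < p.len() && p[pi] == '*' {
        pi += 1;
    }
    pi == p.len()
}

/// Glob emulation: narrow by prefix on the "store" side, then apply the
/// full pattern on the client side.
fn glob_keys<'a>(keys: &[&'a str], pattern: &str) -> Vec<&'a str> {
    list_with_prefix(keys, literal_prefix(pattern))
        .into_iter()
        .filter(|k| glob_match(pattern, k))
        .collect()
}
```

With keys such as `data/2022/a.parquet` and `data/2022/b.csv`, the pattern `data/2022/*.parquet` would be sent to the store as the prefix `data/2022/`, and the `.csv` key would only be dropped after listing. The cost of this approach is that a pattern with an early wildcard degenerates into listing most of the bucket, which is part of why globbing does not map cleanly onto object stores.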
