wjones127 opened a new issue, #2489: URL: https://github.com/apache/arrow-datafusion/issues/2489
**Is your feature request related to a problem or challenge? Please describe what you are trying to do.**

In another issue @alamb and @tustvold [suggested](https://github.com/apache/arrow-datafusion/issues/2445#issuecomment-1119804996) we might want to use the [IOx ObjectStore implementation](https://github.com/influxdata/influxdb_iox/blob/main/object_store).

A few nice points I'll mention about the IOx one:

* They have some nice path utilities, including [a CloudPath struct](https://github.com/influxdata/influxdb_iox/blob/main/object_store/src/path/cloud.rs). That seems nicer than the current one with `&str` paths.
* Has implementations for S3, GCS, Azure Blob Storage included in the repo. There is no HDFS support yet.
* Has implementations of `put()` for writing. There doesn't seem to be streaming write support (multi-part upload).

There are a few differences in the API:

Current API: https://github.com/apache/arrow-datafusion/blob/dfdeb42d7d646cffcf3cff26beefcecffc6cbe62/data-access/src/object_store/mod.rs#L77

IOx API: https://github.com/influxdata/influxdb_iox/blob/94e9ac610acfb94870154d976f66a4d4111b5668/object_store/src/lib.rs#L74

* The IOx `list()` implementation evaluated prefixes on path segments: "Prefixes are evaluated on a path segment basis, i.e. `foo/bar/` is a prefix of `foo/bar/x` but not of `foo/bar_baz/x`."
* IOx doesn't have a synchronous read implementation.

There of course exist other repos that this has implications for:

* https://github.com/datafusion-contrib/datafusion-objectstore-s3
* https://github.com/datafusion-contrib/datafusion-objectstore-hdfs
* https://github.com/datafusion-contrib/datafusion-objectstore-azure

From what I've seen, it seems like we could reasonably shift to simply use the IOx ObjectStore. But if there's a good reason, we could also reuse useful parts of the implementation to keep the existing API.

cc @matthewmturner @kyotoYaho @roeap
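
To make the segment-based prefix rule quoted above concrete, here is a minimal sketch of how it differs from a plain string prefix check. This is not the IOx implementation: `is_segment_prefix` is a hypothetical helper written only for illustration, and it assumes `/` is the delimiter and that empty segments (such as the one after a trailing slash) are ignored.

```rust
/// Hypothetical helper (not part of IOx or DataFusion): returns true when
/// every segment of `prefix` matches the corresponding leading segment of
/// `path`. Assumes '/' as the delimiter and skips empty segments.
fn is_segment_prefix(prefix: &str, path: &str) -> bool {
    let mut path_segments = path.split('/').filter(|s| !s.is_empty());
    prefix
        .split('/')
        .filter(|s| !s.is_empty())
        // Each prefix segment must equal the next path segment exactly.
        .all(|p| path_segments.next() == Some(p))
}

fn main() {
    // The two cases from the quoted IOx documentation.
    assert!(is_segment_prefix("foo/bar/", "foo/bar/x"));
    assert!(!is_segment_prefix("foo/bar/", "foo/bar_baz/x"));

    // A raw string prefix check accepts a pair that segment matching rejects.
    assert!("foo/bar_baz/x".starts_with("foo/bar"));
    assert!(!is_segment_prefix("foo/bar", "foo/bar_baz/x"));
}
```

Any `&str`-based `list()` that relies on `starts_with` would behave like the third assertion, which is the kind of difference to watch for when comparing the two APIs.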
