thinkharderdev commented on code in PR #5543:
URL: https://github.com/apache/arrow-datafusion/pull/5543#discussion_r1133860499
##########
datafusion/execution/src/object_store.rs:
##########
@@ -89,6 +89,138 @@ pub trait ObjectStoreProvider: Send + Sync + 'static {
fn get_by_url(&self, url: &Url) -> Result<Arc<dyn ObjectStore>>;
}
+/// Provides a mechanism to get and put object stores.
+pub trait ObjectStoreManager: Send + Sync + std::fmt::Debug + 'static {
+ /// If a store with the same schema and host existed before, it is replaced and returned
+ fn register_store(
+ &self,
+ scheme: &str,
+ host: &str,
+ store: Arc<dyn ObjectStore>,
+ ) -> Option<Arc<dyn ObjectStore>>;
+
+ /// Get a suitable store for the provided URL. For example:
+ ///
+ /// - URL with scheme `file:///` or no schema will return the default LocalFS store
+ /// - URL with scheme `s3://bucket/` will return the S3 store
+ /// - URL with scheme `hdfs://hostname:port/` will return the hdfs store
+ fn get_by_url(&self, url: &Url) -> Result<Arc<dyn ObjectStore>>;
Review Comment:
`register_store` already exists on `ObjectStoreRegistry`, so this would
essentially give `ObjectStoreRegistry` and `ObjectStoreProvider` the same
interface, which I think is confusing. As I understand it, the role of
`ObjectStoreProvider` is only to allow lazy construction; manual registration
can already be done through the registry. It seems to me that we could
serve the same use case by holding an `Arc<dyn ObjectStoreRegistry>` in
whatever is managing the cache layer.
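
A minimal sketch of that suggestion, using only the standard library. Everything here is a stand-in invented for illustration, not DataFusion's actual API: the `ObjectStore` trait is reduced to a single `name` method, `ObjectStoreRegistry` is assumed to be a trait (as `Arc<dyn ObjectStoreRegistry>` implies), lookup takes a scheme and host instead of a `Url`, and `InMemoryRegistry`, `NamedStore`, and `CacheManager` are hypothetical names.

```rust
use std::collections::HashMap;
use std::fmt::Debug;
use std::sync::{Arc, RwLock};

/// Stand-in for `object_store::ObjectStore`; the real trait has many more methods.
pub trait ObjectStore: Send + Sync + Debug {
    fn name(&self) -> String;
}

/// Assumed trait form of the registry: manual registration plus lookup.
pub trait ObjectStoreRegistry: Send + Sync + Debug {
    /// Registers a store and returns the one it replaced, if any.
    fn register_store(
        &self,
        scheme: &str,
        host: &str,
        store: Arc<dyn ObjectStore>,
    ) -> Option<Arc<dyn ObjectStore>>;

    /// Simplified stand-in for `get_by_url`: looks up by scheme and host.
    fn get_store(&self, scheme: &str, host: &str) -> Option<Arc<dyn ObjectStore>>;
}

/// Hypothetical registry backed by a map keyed on "scheme://host".
#[derive(Debug, Default)]
pub struct InMemoryRegistry {
    stores: RwLock<HashMap<String, Arc<dyn ObjectStore>>>,
}

fn key(scheme: &str, host: &str) -> String {
    format!("{}://{}", scheme, host)
}

impl ObjectStoreRegistry for InMemoryRegistry {
    fn register_store(
        &self,
        scheme: &str,
        host: &str,
        store: Arc<dyn ObjectStore>,
    ) -> Option<Arc<dyn ObjectStore>> {
        self.stores.write().unwrap().insert(key(scheme, host), store)
    }

    fn get_store(&self, scheme: &str, host: &str) -> Option<Arc<dyn ObjectStore>> {
        self.stores.read().unwrap().get(&key(scheme, host)).cloned()
    }
}

/// Hypothetical store that only carries a label, used to make wrapping visible.
#[derive(Debug)]
pub struct NamedStore(pub String);

impl ObjectStore for NamedStore {
    fn name(&self) -> String {
        self.0.clone()
    }
}

/// Hypothetical cache layer: it holds the registry instead of
/// exposing a second trait with its own `register_store`.
#[derive(Debug)]
pub struct CacheManager {
    registry: Arc<dyn ObjectStoreRegistry>,
}

impl CacheManager {
    pub fn new(registry: Arc<dyn ObjectStoreRegistry>) -> Self {
        Self { registry }
    }

    /// Wraps an already registered store and re-registers the wrapper.
    /// Returns false when nothing is registered for the scheme and host.
    pub fn enable_cache(&self, scheme: &str, host: &str) -> bool {
        match self.registry.get_store(scheme, host) {
            Some(inner) => {
                let cached = Arc::new(NamedStore(format!("cached({})", inner.name())));
                self.registry.register_store(scheme, host, cached);
                true
            }
            None => false,
        }
    }
}
```

In this arrangement the registry remains the only place where stores are registered, and the cache layer is just another caller of it, so no second trait with an overlapping interface is needed.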
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.