waynr opened a new issue, #14804: URL: https://github.com/apache/datafusion/issues/14804
### Is your feature request related to a problem or challenge?

In https://github.com/apache/arrow-rs/issues/7155 I've described a general need for the `ObjectStore` trait to support passing contextual data to custom implementations. In https://github.com/apache/arrow-rs/pull/7160 I have implemented an approach to this by giving `GetOptions` the ability to store opaque values indexed by their `TypeId`, [similar to what is possible in datafusion with `SessionConfig`](https://docs.rs/datafusion/latest/datafusion/prelude/struct.SessionConfig.html#method.with_extension).

This issue is about incorporating this new `ObjectStore` behavior into datafusion, so that the custom data on a `SessionConfig` is passed on when creating `GetOptions` instances for retrieving files from an object store.

### Describe the solution you'd like

I think the simplest approach here would be one where we create a new `ObjectStore` implementation during query execution that looks something like:

```
struct ContextualizedObjectStore {
    inner: Arc<dyn ObjectStore>,
    extensions: object_store::Extensions,
}
```

We would then have a `get_opts` method in the `ObjectStore` trait impl that looks something like:

```
async fn get_opts(
    &self,
    location: &Path,
    mut options: GetOptions,
) -> object_store::Result<GetResult> {
    // Attach the session-derived extensions, then delegate to the wrapped store.
    options.extensions = self.extensions.clone();
    self.inner.get_opts(location, options).await
}
```

Initializing instances of this new type as a wrapper around whatever `Arc<dyn ObjectStore>` is available would look something like:

```
let object_store = context
    .runtime_env()
    .object_store(&self.object_store_url)
    .map(|inner: Arc<dyn ObjectStore>| -> Arc<dyn ObjectStore> {
        Arc::new(ContextualizedObjectStore::new(
            inner,
            context.session_config().clone_extensions_for_object_store(),
        ))
    });
```

With this approach, whenever the resulting `Arc<dyn ObjectStore>` is used to retrieve a file from object store, the underlying implementation would have access to the `object_store::Extensions` created from the `SessionConfig` extensions.

### Describe alternatives you've considered

This is covered in https://github.com/apache/arrow-rs/issues/7155 and https://github.com/apache/arrow-rs/issues/7135. Basically, there are two alternative directions:

* Update the `ObjectStore` API by providing optional trait methods that take an actual context type that can carry custom/extension data.
  * Considered by maintainers to be too heavy-handed.
* Don't do anything.
  * This means that, for my use case, we wouldn't be able to properly parent tracing spans for object store accesses that happen during query execution.

### Additional context

_No response_
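
For illustration only, the wrapper idea from the proposed solution can be sketched using nothing but the standard library. Everything in this sketch is a hypothetical stand-in and not the real `object_store` or datafusion API: `Extensions` models a `TypeId`-indexed map of opaque values, `Store` is a simplified synchronous substitute for the `ObjectStore` trait, and `ParentSpanId`, `TracingStore`, `ContextualizedStore`, and `read_with_span` are names invented for the example.

```rust
use std::any::{Any, TypeId};
use std::collections::HashMap;
use std::sync::Arc;

/// Hypothetical stand-in for `object_store::Extensions`:
/// opaque values indexed by their `TypeId`.
#[derive(Clone, Default)]
struct Extensions {
    map: HashMap<TypeId, Arc<dyn Any + Send + Sync>>,
}

impl Extensions {
    /// Stores `value`, replacing any previous value of the same type.
    fn insert<T: Any + Send + Sync>(&mut self, value: T) {
        self.map.insert(TypeId::of::<T>(), Arc::new(value));
    }

    /// Looks up the value of type `T`, if one was stored.
    fn get<T: Any + Send + Sync>(&self) -> Option<&T> {
        self.map
            .get(&TypeId::of::<T>())
            .and_then(|v| v.downcast_ref::<T>())
    }
}

/// Simplified stand-in for `GetOptions`, carrying only the extensions.
#[derive(Clone, Default)]
struct GetOptions {
    extensions: Extensions,
}

/// Simplified, synchronous stand-in for the `ObjectStore` trait.
trait Store: Send + Sync {
    fn get_opts(&self, location: &str, options: GetOptions) -> String;
}

/// Hypothetical context value, e.g. the id of a parent tracing span.
struct ParentSpanId(u64);

/// A custom store that reads its context from the request options.
struct TracingStore;

impl Store for TracingStore {
    fn get_opts(&self, location: &str, options: GetOptions) -> String {
        match options.extensions.get::<ParentSpanId>() {
            Some(ParentSpanId(id)) => format!("get {location} under span {id}"),
            None => format!("get {location} without span"),
        }
    }
}

/// Wrapper that attaches session-level extensions to every request
/// before delegating to the wrapped store.
struct ContextualizedStore {
    inner: Arc<dyn Store>,
    extensions: Extensions,
}

impl Store for ContextualizedStore {
    fn get_opts(&self, location: &str, mut options: GetOptions) -> String {
        // As in the proposal, this overwrites any extensions the caller set.
        options.extensions = self.extensions.clone();
        self.inner.get_opts(location, options)
    }
}

/// Wraps a `TracingStore` with the given span id and performs one read.
fn read_with_span(span_id: u64, location: &str) -> String {
    let mut extensions = Extensions::default();
    extensions.insert(ParentSpanId(span_id));
    let store: Arc<dyn Store> = Arc::new(ContextualizedStore {
        inner: Arc::new(TracingStore),
        extensions,
    });
    store.get_opts(location, GetOptions::default())
}
```

The caller passes default options and the inner store still sees the span id, which is the point of the wrapper: code that already holds an `Arc<dyn ObjectStore>` would not need to change. A real implementation would presumably also need to forward the remaining `ObjectStore` methods to `inner`, which this sketch leaves out.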