EmilyMatt commented on issue #16991: URL: https://github.com/apache/datafusion/issues/16991#issuecomment-3149437149
> Thank you for bringing this up [@EmilyMatt](https://github.com/EmilyMatt)
>
> I think there is currently an assumption in the `ListingTable` (and all the way down to the `DataSourceExec`) that all the files are read from the same underlying `ObjectStore` instance, which I think is the root cause of the challenge.
>
> It would be pretty disruptive (aka a big and complicated PR), I think, to try and wire in support for multiple object stores.
>
> Another idea that could work for you is to make a "virtual" `ObjectStore` wrapper.
>
> Something like this (a sketch, not compiling):
>
>     struct VirtualObjectStore {
>         // Maps the first element of each path to a different ObjectStore
>         stores: HashMap<String, Arc<dyn ObjectStore>>,
>     }
>
>     impl ObjectStore for VirtualObjectStore {
>         // delegates to the correct store
>         // for example,
>         //   get '/store1/my_data/1.parquet'
>         // would be mapped to a get call to `store1` at path `/my_data/1.parquet`
>         fn get(&self, path: Path) -> ... {
>             let store = path[0]; // first part of the path
>             let real_path = path[1..]; // remainder of the path
>             // delegate to the inner store
>             self.stores.get(store).get(real_path)
>         }
>         ...
>     }

Thank you for the suggestion, this is similar to what I'm working on :)
It just requires some micromanagement, so I thought I'd raise a flag on this.
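For anyone following along, here is a minimal compiling sketch of the prefix-routing idea from the quoted comment. It is only an illustration: the `Store` trait below is a hypothetical, synchronous stand-in that models a single `get` call, not the real `object_store::ObjectStore` trait (which is async and has many more methods). `MemStore`, `VirtualStore`, `register`, `split`, and `example_router` are likewise made-up names, and this is not necessarily how the actual workaround is implemented.

```rust
use std::collections::HashMap;
use std::sync::Arc;

/// Hypothetical, simplified stand-in for an object store (assumption:
/// the real `ObjectStore` trait is async and much larger than this).
pub trait Store: Send + Sync {
    fn get(&self, path: &str) -> Option<Vec<u8>>;
}

/// Toy in-memory store, used only to make the sketch testable.
#[derive(Default)]
pub struct MemStore {
    objects: HashMap<String, Vec<u8>>,
}

impl MemStore {
    pub fn with_object(mut self, path: &str, data: &[u8]) -> Self {
        self.objects.insert(path.to_string(), data.to_vec());
        self
    }
}

impl Store for MemStore {
    fn get(&self, path: &str) -> Option<Vec<u8>> {
        self.objects.get(path).cloned()
    }
}

/// Routes each request to an inner store chosen by the first path segment.
#[derive(Default)]
pub struct VirtualStore {
    stores: HashMap<String, Arc<dyn Store>>,
}

impl VirtualStore {
    pub fn register(&mut self, prefix: &str, store: Arc<dyn Store>) {
        self.stores.insert(prefix.to_string(), store);
    }

    /// Splits "/store1/my_data/1.parquet" into ("store1", "/my_data/1.parquet").
    /// Returns None when there is no prefix or no remainder to route.
    fn split(path: &str) -> Option<(&str, &str)> {
        let trimmed = path.strip_prefix('/').unwrap_or(path);
        let idx = trimmed.find('/')?;
        let (prefix, rest) = trimmed.split_at(idx);
        if prefix.is_empty() {
            None
        } else {
            Some((prefix, rest))
        }
    }
}

impl Store for VirtualStore {
    fn get(&self, path: &str) -> Option<Vec<u8>> {
        let (prefix, rest) = Self::split(path)?;
        // Delegate to the inner store using the remainder of the path.
        self.stores.get(prefix)?.get(rest)
    }
}

/// Builds a router over two independent stores holding the same inner path.
pub fn example_router() -> VirtualStore {
    let mut router = VirtualStore::default();
    router.register(
        "store1",
        Arc::new(MemStore::default().with_object("/my_data/1.parquet", b"one")),
    );
    router.register(
        "store2",
        Arc::new(MemStore::default().with_object("/my_data/1.parquet", b"two")),
    );
    router
}
```

With `example_router()`, a `get` for `/store1/my_data/1.parquet` and one for `/store2/my_data/1.parquet` reach different inner stores even though the inner path is identical. The "micromanagement" part is presumably keeping the prefix map and the paths handed to the table in sync, which this sketch does not attempt to solve.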