kosiew opened a new pull request, #17084:
URL: https://github.com/apache/datafusion/pull/17084

   ## Which issue does this PR close?
   
   * Closes #16991
   
   ## Rationale for this change
   
   This PR introduces a virtual object store that routes each path to a 
different underlying `ObjectStore` implementation based on the path's prefix 
(see the [design 
discussion](https://github.com/apache/datafusion/issues/16991#issuecomment-3141439401)).
 This makes it possible to combine local, S3, in-memory, or custom stores 
behind a single interface, which is particularly useful in multi-source query 
environments and in tests.
   
   ## What changes are included in this PR?
   
   * Introduced `VirtualObjectStore` in 
`datafusion/execution/src/virtual_object_store.rs`
   * Added support for routing based on the first path segment (prefix)
   * Integrated optional `virtual_store` into `FileScanConfig` and its builder
   * Updated `FileScanConfig` to resolve the correct object store dynamically 
based on the presence of `virtual_store`
   * Added new dependencies: `async-trait` and `tokio`
   * Added comprehensive test coverage for the new virtual store 
implementation, including edge cases and multipart operations
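
The first-segment routing described above can be sketched roughly as follows. This is an illustration only and not the code in this PR: `Backend`, `NamedBackend`, `PrefixRouter`, and `RouteError` are hypothetical names, and the toy `Backend` trait stands in for the real `ObjectStore` trait so the example depends only on the standard library.

```rust
use std::collections::HashMap;
use std::sync::Arc;

/// Hypothetical stand-in for an object store backend.
/// This is NOT the `object_store` crate's `ObjectStore` trait.
pub trait Backend: Send + Sync {
    fn name(&self) -> &str;
}

/// Trivial backend that only carries a label, for illustration.
pub struct NamedBackend(pub &'static str);

impl Backend for NamedBackend {
    fn name(&self) -> &str {
        self.0
    }
}

/// Errors the router can return (hypothetical).
#[derive(Debug, PartialEq)]
pub enum RouteError {
    EmptyPath,
    UnknownPrefix(String),
}

/// Maps the first path segment to a backend.
#[derive(Default)]
pub struct PrefixRouter {
    stores: HashMap<String, Arc<dyn Backend>>,
}

impl PrefixRouter {
    /// Register a backend under a single-segment prefix such as "s3".
    pub fn register(&mut self, prefix: &str, store: Arc<dyn Backend>) {
        self.stores.insert(prefix.to_string(), store);
    }

    /// Split `path` at its first segment and return the matching backend
    /// together with the remainder of the path to pass on to it.
    pub fn resolve<'a>(
        &self,
        path: &'a str,
    ) -> Result<(Arc<dyn Backend>, &'a str), RouteError> {
        let trimmed = path.trim_start_matches('/');
        let (prefix, rest) = match trimmed.split_once('/') {
            Some((p, r)) => (p, r),
            None => (trimmed, ""),
        };
        if prefix.is_empty() {
            return Err(RouteError::EmptyPath);
        }
        let store = self
            .stores
            .get(prefix)
            .cloned()
            .ok_or_else(|| RouteError::UnknownPrefix(prefix.to_string()))?;
        Ok((store, rest))
    }
}
```

With backends registered under `s3` and `mem`, a path such as `s3/bucket/data.parquet` would resolve to the `s3` backend with `bucket/data.parquet` as the remaining path, while an unregistered prefix yields an error instead of silently falling back to another store.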
   
   ## Are these changes tested?
   
   Yes. A comprehensive suite of unit tests has been added in 
`virtual_object_store.rs`, covering:
   
   * Basic routing
   * Prefix resolution
   * Multipart upload and abort
   * List operations with/without delimiters
   * Error handling for missing prefixes
   
   ## Are there any user-facing changes?
   
   Yes:
   
   * Users of `FileScanConfigBuilder` can now optionally provide a 
`virtual_store` via the `with_virtual_store` method.
   * File access behavior may change when `virtual_store` is supplied, as it 
overrides the default `object_store_url`-based resolution.
   
   These changes are additive and backwards compatible.
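
As a rough illustration of why the change is additive, the override can be thought of as an optional field that falls back to the default store when unset. The sketch below uses hypothetical stand-in types (`Backend`, `Labeled`, `ScanConfig`); it only borrows the `with_virtual_store` name from this PR, and the real `FileScanConfigBuilder` signature may differ.

```rust
use std::sync::Arc;

/// Hypothetical stand-in for an object store; not the real `ObjectStore` trait.
pub trait Backend {
    fn name(&self) -> &str;
}

/// Trivial backend that only carries a label, for illustration.
pub struct Labeled(pub &'static str);

impl Backend for Labeled {
    fn name(&self) -> &str {
        self.0
    }
}

/// Hypothetical, heavily simplified scan configuration.
pub struct ScanConfig {
    /// Store that would normally be found via the object store URL.
    default_store: Arc<dyn Backend>,
    /// Optional override; `None` keeps the existing behavior.
    virtual_store: Option<Arc<dyn Backend>>,
}

impl ScanConfig {
    pub fn new(default_store: Arc<dyn Backend>) -> Self {
        Self {
            default_store,
            virtual_store: None,
        }
    }

    /// Builder-style setter, named after the method mentioned in this PR.
    pub fn with_virtual_store(mut self, store: Arc<dyn Backend>) -> Self {
        self.virtual_store = Some(store);
        self
    }

    /// Use the virtual store when present, otherwise the default store.
    pub fn effective_store(&self) -> Arc<dyn Backend> {
        self.virtual_store
            .clone()
            .unwrap_or_else(|| Arc::clone(&self.default_store))
    }
}
```

Callers that never set the override keep resolving the default store, which is the sense in which existing code is unaffected.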
   
   ---
   
   ✅ No breaking changes
   ✅ Fully tested
   ✅ Improves composability and flexibility of DataFusion's object store 
handling
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

