crepererum opened a new issue, #385:
URL: https://github.com/apache/arrow-rs-object-store/issues/385

   # Abstract
   The `ObjectStore` trait, as currently designed, is a middle ground between somewhat competing design goals. I think we can do better than that.
   
   # Requirements
   The trait serves two groups of API users:
   
   ## Object Store Users
   Humans and machines that want to use an object store through a unified 
interface.
   
   They usually like to write only the relevant parts, e.g.:
   
   - "get object `foo/bar.txt`" ==> `store.get("foo/bar.txt").await?`
   - "get first 10 bytes of `foo/bar.txt`" ==> 
`store.get("foo/bar.txt").with_range(..10).await?`
   
   ## Object Store Implementations
   Parties that implement a new object store or wrap an existing one. They 
usually want to implement the methods that determine the behavior of the object 
store, ideally without surprises (like opinionated default implementations).
   
   # Status Quo
   Due to this dual nature, the trait has accumulated a growing number of methods, many of them with default implementations. Let's have a look at the "get" methods:
   
   
https://github.com/apache/arrow-rs-object-store/blob/0c3152c709d5101bc2346c49fff5c94e033b8e71/src/lib.rs#L633-L662
   
   All of them except `get_ranges` essentially map to `get_opts`, and there is no reason for a store to override any of the defaults. Even for `get_ranges` we could come up with a sensible mapping to `get_opts` if the range parameter supported multi-ranges, similar to HTTP.
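
To make the delegation pattern concrete, here is a minimal sketch of a trait whose convenience methods all forward to a single `get_opts`. It is deliberately synchronous and uses simplified, hypothetical stand-ins (`Store`, `InMemory`, `demo_store`, string paths, `String` errors, a `GetOptions` with only `range` and `head`); it is not the actual `object_store` API, only an illustration of why implementations have no reason to override the defaults.

```rust
use std::collections::HashMap;
use std::ops::Range;

/// Hypothetical, simplified stand-in for `GetOptions`.
#[derive(Debug, Default, Clone, PartialEq)]
struct GetOptions {
    /// Byte range to fetch; `None` means the whole object.
    range: Option<Range<usize>>,
    /// If true, return metadata only and skip the payload.
    head: bool,
}

/// Hypothetical, simplified stand-in for `GetResult`.
#[derive(Debug, PartialEq)]
struct GetResult {
    size: usize,
    bytes: Vec<u8>,
}

trait Store {
    /// The only "get" method an implementation has to provide.
    fn get_opts(&self, location: &str, options: GetOptions) -> Result<GetResult, String>;

    /// Default: a plain get is `get_opts` with default options.
    fn get(&self, location: &str) -> Result<GetResult, String> {
        self.get_opts(location, GetOptions::default())
    }

    /// Default: a ranged get only sets `range`.
    fn get_range(&self, location: &str, range: Range<usize>) -> Result<Vec<u8>, String> {
        let options = GetOptions { range: Some(range), ..Default::default() };
        self.get_opts(location, options).map(|r| r.bytes)
    }

    /// Default: a head request only sets `head`.
    fn head(&self, location: &str) -> Result<usize, String> {
        let options = GetOptions { head: true, ..Default::default() };
        self.get_opts(location, options).map(|r| r.size)
    }
}

/// Toy in-memory store that implements only `get_opts`.
struct InMemory {
    data: HashMap<String, Vec<u8>>,
}

impl Store for InMemory {
    fn get_opts(&self, location: &str, options: GetOptions) -> Result<GetResult, String> {
        let object = self
            .data
            .get(location)
            .ok_or_else(|| format!("not found: {location}"))?;
        let size = object.len();
        let bytes = if options.head {
            Vec::new()
        } else {
            match options.range {
                Some(range) => object
                    .get(range)
                    .ok_or_else(|| String::from("range out of bounds"))?
                    .to_vec(),
                None => object.clone(),
            }
        };
        Ok(GetResult { size, bytes })
    }
}

/// Builds a store holding one object, for demonstration purposes.
fn demo_store() -> InMemory {
    let mut data = HashMap::new();
    data.insert("foo/bar.txt".to_string(), b"hello world".to_vec());
    InMemory { data }
}
```

In this sketch `InMemory` gets `get`, `get_range`, and `head` for free, and none of the defaults carries behavior that a store would want to change.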
   
   Now let's look at "rename":
   
   
https://github.com/apache/arrow-rs-object-store/blob/0c3152c709d5101bc2346c49fff5c94e033b8e71/src/lib.rs#L773-L782
   
   
https://github.com/apache/arrow-rs-object-store/blob/0c3152c709d5101bc2346c49fff5c94e033b8e71/src/lib.rs#L793-L799
   
   I think it is beyond question that a store implementation should override these if there is any way to perform key renames without a full "get + put".
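
The difference can be illustrated with a small synchronous sketch. It assumes the default `rename` is built from a copy followed by a delete, and it uses hypothetical toy types (`Objects`, `DefaultRename`, `NativeRename`, `seeded`) with a byte counter to show how much data each variant moves. None of these names come from the crate.

```rust
use std::cell::{Cell, RefCell};
use std::collections::HashMap;

trait Store {
    fn copy(&self, from: &str, to: &str) -> Result<(), String>;
    fn delete(&self, location: &str) -> Result<(), String>;

    /// Opinionated default: correct, but it duplicates the data and is not atomic.
    fn rename(&self, from: &str, to: &str) -> Result<(), String> {
        self.copy(from, to)?;
        self.delete(from)
    }
}

/// Toy backing storage that counts how many bytes were copied.
#[derive(Default)]
struct Objects {
    map: RefCell<HashMap<String, Vec<u8>>>,
    bytes_copied: Cell<usize>,
}

impl Objects {
    fn copy(&self, from: &str, to: &str) -> Result<(), String> {
        let data = self
            .map
            .borrow()
            .get(from)
            .cloned()
            .ok_or_else(|| format!("not found: {from}"))?;
        self.bytes_copied.set(self.bytes_copied.get() + data.len());
        self.map.borrow_mut().insert(to.to_string(), data);
        Ok(())
    }

    fn delete(&self, location: &str) -> Result<(), String> {
        let removed = self.map.borrow_mut().remove(location);
        removed.map(|_| ()).ok_or_else(|| format!("not found: {location}"))
    }
}

/// Relies on the default `rename`.
struct DefaultRename(Objects);

impl Store for DefaultRename {
    fn copy(&self, from: &str, to: &str) -> Result<(), String> {
        self.0.copy(from, to)
    }
    fn delete(&self, location: &str) -> Result<(), String> {
        self.0.delete(location)
    }
}

/// Overrides `rename` with a key move that never copies the payload.
struct NativeRename(Objects);

impl Store for NativeRename {
    fn copy(&self, from: &str, to: &str) -> Result<(), String> {
        self.0.copy(from, to)
    }
    fn delete(&self, location: &str) -> Result<(), String> {
        self.0.delete(location)
    }
    fn rename(&self, from: &str, to: &str) -> Result<(), String> {
        let mut map = self.0.map.borrow_mut();
        let data = map.remove(from).ok_or_else(|| format!("not found: {from}"))?;
        map.insert(to.to_string(), data);
        Ok(())
    }
}

/// Storage pre-populated with a single 4-byte object under key "a".
fn seeded() -> Objects {
    let objects = Objects::default();
    objects.map.borrow_mut().insert("a".to_string(), vec![1, 2, 3, 4]);
    objects
}
```

Both variants end up with the object under the new key, but only `DefaultRename` copies the payload. A wrapper that forgets to forward `rename` would silently fall back to the slow path, which is the kind of surprise the default makes possible.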
   
   # Proposal
   I propose to remove all default implementations from the trait and only have 
a single method per operation, so "get" would look like this:
   
   ```rust
   async fn get(&self, location: &Path, options: GetOptions) -> Result<GetResult>;
   ```
   
   Note that the `location` is NOT part of `options` so that `GetOptions` can 
still implement `Default` and a user only needs to specify the options of 
interest:
   
   ```rust
   // get all bytes of the object
   store.get(
       &location,
       Default::default(),
   ).await?.bytes().await?;
   
   // get the first 10 bytes
   store.get(
       &location,
       GetOptions {
           // `range` is an `Option<GetRange>` in the current `GetOptions`
           range: Some((..10).into()),
           ..Default::default()
       },
   ).await?.bytes().await?;
   ```
   
   A similar mapping can be done for "rename".
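
As a rough illustration of what such a mapping might look like, the sketch below folds "rename" and "rename if not exists" into one method that takes an options struct. Everything here is hypothetical: `RenameOptions`, its `if_not_exists` flag, the synchronous `Store` trait, the `InMemory` store, and `demo_store` are made up for this example and are not part of the proposal text or the crate.

```rust
use std::cell::RefCell;
use std::collections::HashMap;

/// Hypothetical options struct; `Default` means "overwrite the target".
#[derive(Debug, Default, Clone, Copy, PartialEq)]
struct RenameOptions {
    /// Hypothetical flag: fail instead of overwriting an existing target.
    if_not_exists: bool,
}

trait Store {
    /// Single rename method without a default implementation.
    fn rename(&self, from: &str, to: &str, options: RenameOptions) -> Result<(), String>;
}

/// Toy in-memory store used only for this sketch.
struct InMemory {
    objects: RefCell<HashMap<String, Vec<u8>>>,
}

impl Store for InMemory {
    fn rename(&self, from: &str, to: &str, options: RenameOptions) -> Result<(), String> {
        let mut objects = self.objects.borrow_mut();
        // Check the precondition before touching the source key.
        if options.if_not_exists && objects.contains_key(to) {
            return Err(format!("already exists: {to}"));
        }
        let data = objects
            .remove(from)
            .ok_or_else(|| format!("not found: {from}"))?;
        objects.insert(to.to_string(), data);
        Ok(())
    }
}

/// Builds a store with two objects, "a" and "b".
fn demo_store() -> InMemory {
    let mut objects = HashMap::new();
    objects.insert("a".to_string(), b"from a".to_vec());
    objects.insert("b".to_string(), b"from b".to_vec());
    InMemory { objects: RefCell::new(objects) }
}
```

A caller who does not care about the target would pass `RenameOptions::default()`, mirroring the `GetOptions` usage above.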
   
   I think that strikes a good balance between boilerplate / verbosity of the 
user API and the clarity of the interface.

