Dandandan opened a new issue, #195:
URL: https://github.com/apache/arrow-rs-object-store/issues/195

   **Is your feature request related to a problem or challenge? Please describe 
what you are trying to do.**
   When using the object store (S3, indirectly via DataFusion/Ballista) with 
`get_range` requests, we cannot stream the results. Instead, we have to fetch 
entire column chunks before we can process any of the data.
   
   Not providing a streaming method in `ObjectStore` prevents us from:
   * Reducing memory usage and improving scalability by fetching data 
incrementally
   * Processing data incrementally instead of waiting for the entire range to 
be fetched
   * Pushing down limits to every part of the query (e.g. row filters) and 
aborting early
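   
   To make the early-abort point concrete, here is a minimal sketch using only 
the standard library. A plain iterator of chunks stands in for an async byte 
stream, and `read_until_limit` is a hypothetical helper, not part of 
`object_store`:

   ```rust
   /// Pull chunks until at least `limit` bytes have been seen, then stop.
   /// Returns (bytes consumed, chunks pulled).
   fn read_until_limit<I>(chunks: I, limit: usize) -> (usize, usize)
   where
       I: Iterator<Item = Vec<u8>>,
   {
       let mut bytes = 0;
       let mut pulled = 0;
       for chunk in chunks {
           pulled += 1;
           bytes += chunk.len();
           if bytes >= limit {
               // Stop here: the remaining chunks are never requested.
               break;
           }
       }
       (bytes, pulled)
   }
   ```

   With a buffered `get_range`, the whole range would already be in memory 
before the limit could take effect.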
   
   **Describe the solution you'd like**
   Change the `get_range` implementation to return `GetResult`, or introduce a 
new `get_range_stream` method that returns a stream, so that callers can 
process the data incrementally.
   
   I verified that this is possible for S3, and I expect it to be possible for 
other implementations as well:
   ```rust
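       async fn get_range_stream(&self, location: &Path, range: Range<usize>) -> Result<GetResult> {
           // Issue a ranged GET and expose the response body as a stream of
           // chunks instead of buffering the whole range.
           let stream = self
               .client
               .get_request(location, Some(range), false)
               .await?
               .bytes_stream()
               // Convert transport errors into the crate's error type.
               .map_err(|source| crate::Error::Generic {
                   store: "S3",
                   source: Box::new(source),
               })
               .boxed();
           Ok(GetResult::Stream(stream))
       }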
   ```
   
   We can provide a default implementation based on `get_range` in the meantime 
for other object stores.
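   
   A rough sketch of what such a default could look like, simplified to 
synchronous, std-only Rust so that it stands alone. `RangeStore`, `InMemory` 
and `collect_chunks` are hypothetical names for illustration. The real trait 
is async and would return a `GetResult` rather than a boxed iterator:

   ```rust
   use std::ops::Range;

   /// Chunked result: an iterator stands in for an async byte stream.
   type ChunkStream = Box<dyn Iterator<Item = Result<Vec<u8>, String>>>;

   /// Hypothetical, simplified stand-in for the `ObjectStore` trait.
   trait RangeStore {
       /// Fetch the whole range into memory (mirrors the existing `get_range`).
       fn get_range(&self, location: &str, range: Range<usize>) -> Result<Vec<u8>, String>;

       /// Default fallback: fetch the whole range, then yield it as one chunk.
       /// Stores that can stream natively would override this method.
       fn get_range_stream(&self, location: &str, range: Range<usize>) -> Result<ChunkStream, String> {
           let bytes = self.get_range(location, range)?;
           let chunk: Result<Vec<u8>, String> = Ok(bytes);
           Ok(Box::new(std::iter::once(chunk)))
       }
   }

   /// Toy in-memory store used only to exercise the default method.
   struct InMemory {
       data: Vec<u8>,
   }

   impl RangeStore for InMemory {
       fn get_range(&self, _location: &str, range: Range<usize>) -> Result<Vec<u8>, String> {
           self.data
               .get(range.clone())
               .map(|slice| slice.to_vec())
               .ok_or_else(|| format!("range {:?} out of bounds", range))
       }
   }

   /// Drain a chunk stream into one buffer, stopping at the first error.
   fn collect_chunks(stream: ChunkStream) -> Result<Vec<u8>, String> {
       let mut out = Vec::new();
       for chunk in stream {
           out.extend(chunk?);
       }
       Ok(out)
   }
   ```

   The fallback gives no memory benefit by itself, but it lets callers code 
against the streaming method while individual stores are migrated.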
   
   **Describe alternatives you've considered**
   
   **Additional context**
   


