vitoordaz opened a new issue, #675:
URL: https://github.com/apache/arrow-rs-object-store/issues/675

   **Which part is this question about**
   API
   
   **Describe your question**
   ArrowObjectStore provides a very strong abstraction over object storage and 
solves many common problems well.
   
   However, there are cases where leveraging features of the underlying storage 
API could lead to meaningful performance improvements. Currently, these 
capabilities are not exposed, which forces users into less efficient patterns.
   
   Two concrete examples:
   
   1. **Ordered list results**
   Today, users of the ArrowObjectStore list API must assume results are 
unordered, requiring them to fetch all items and sort them in memory.
   However, some backends (e.g., Amazon S3 for non-directory buckets) already 
return results in lexicographical order. If users could detect this at runtime, 
they could avoid unnecessary buffering and sorting, and only fall back when 
ordering is not guaranteed.
   
   2. **Negative byte ranges**
   Fetching ranges from the end of an object currently requires an additional 
HEAD request to determine object size before issuing a GET.
   If the underlying store supports negative ranges directly, this extra 
request could be avoided, reducing latency and request overhead.
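
To make the second example concrete, here is a minimal sketch of the two HTTP `Range` header forms involved. The helper names `absolute_tail_range` and `suffix_range` are hypothetical and not part of any existing API; only the header syntax (`bytes=start-end` and `bytes=-n`) comes from the HTTP range-request specification.

```rust
/// Builds an absolute range for the last `n` bytes of an object.
/// The caller must already know `object_size`, which is what costs
/// the extra HEAD request described above.
fn absolute_tail_range(object_size: u64, n: u64) -> Option<String> {
    if object_size == 0 || n == 0 {
        return None; // nothing to fetch
    }
    // Clamp to the start of the object when `n` exceeds its size.
    let start = object_size.saturating_sub(n);
    Some(format!("bytes={}-{}", start, object_size - 1))
}

/// Builds a suffix ("negative") range for the last `n` bytes.
/// No object size is needed, so a single GET is enough when the
/// backend honors this form.
fn suffix_range(n: u64) -> Option<String> {
    if n == 0 {
        return None; // a zero-length suffix is not a satisfiable range
    }
    Some(format!("bytes=-{}", n))
}
```

The first helper needs the object size as input, while the second does not; that difference is the round trip the issue is about.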
   
   **Proposal**
   Would the ArrowObjectStore maintainers be open to introducing a 
"capabilities" API that exposes which features are supported by the underlying 
storage implementation?
   
   For example, something along the lines of:
   
   * `supports_ordered_listing()`
   * `supports_negative_ranges()`
   
   This would allow users to write adaptive logic that takes advantage of 
backend-specific optimizations while preserving portability.
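
As a rough illustration of the adaptive logic this would enable, the sketch below uses a purely hypothetical `Capabilities` struct and `CapableStore` trait, with an in-memory `MockStore` standing in for a real backend. None of these names exist in the crate today; they are assumptions made only to show the shape of the idea.

```rust
/// Hypothetical capability flags; not part of any existing API.
#[allow(dead_code)]
#[derive(Debug, Clone, Copy, Default)]
struct Capabilities {
    ordered_listing: bool,
    negative_ranges: bool,
}

/// Hypothetical trait for a store that can describe its own features.
trait CapableStore {
    fn capabilities(&self) -> Capabilities;
    /// Yields keys lazily, in whatever order the backend returns them.
    fn list(&self) -> Box<dyn Iterator<Item = String> + '_>;
}

/// Returns the `n` lexicographically smallest keys.
/// With ordered listing, it stops after `n` items; otherwise it
/// falls back to buffering everything and sorting in memory.
fn first_keys(store: &dyn CapableStore, n: usize) -> Vec<String> {
    if store.capabilities().ordered_listing {
        store.list().take(n).collect()
    } else {
        let mut all: Vec<String> = store.list().collect();
        all.sort();
        all.truncate(n);
        all
    }
}

/// In-memory stand-in for a backend, used only for illustration.
struct MockStore {
    keys: Vec<String>,
    caps: Capabilities,
}

impl CapableStore for MockStore {
    fn capabilities(&self) -> Capabilities {
        self.caps
    }

    fn list(&self) -> Box<dyn Iterator<Item = String> + '_> {
        Box::new(self.keys.iter().cloned())
    }
}
```

Callers get the same result from either kind of store; only the cost differs, which is how portability is preserved while still taking the fast path where the backend allows it.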
   
   Curious to hear your thoughts on whether this aligns with the project's 
design goals.
   
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]
