tustvold commented on issue #386:
URL: https://github.com/apache/arrow-rs-object-store/issues/386#issuecomment-2919233629

   > are "active" shouldn't be timing out / being retried in the first place 
and we should consider some other mechanism
   
   I agree, IMO the intent of the retry machinery is to recover from transient 
errors, e.g. network interruption, not to allow operations to complete that 
would fail in the absence of retries. If a request is so large that it could 
never be completed in a single get operation, it needs to be broken up.
   
   What does this chunking is up for debate. I personally think we 
should add a TransferManager component that can automatically split each of 
potentially several downloads into multiple concurrent parts. This is in line 
with the design goal of ObjectStore mirroring the object store APIs and 
minimising the "magic" that makes reasoning about IO very difficult, whilst 
providing higher-level abstractions where applicable. 
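
   As a rough illustration of the splitting idea only, and not a proposed TransferManager API, here is a minimal std-only sketch. The names `split_range` and `fetch_concurrently` are hypothetical, the `fetch` closure stands in for a ranged get against a store, and scoped threads are used in place of the async machinery a real implementation would need.

```rust
use std::ops::Range;

/// Hypothetical helper: split `range` into consecutive parts of at most
/// `part_size` bytes. The final part may be shorter than the others.
fn split_range(range: Range<u64>, part_size: u64) -> Vec<Range<u64>> {
    assert!(part_size > 0, "part_size must be non-zero");
    let mut parts = Vec::new();
    let mut start = range.start;
    while start < range.end {
        let end = start.saturating_add(part_size).min(range.end);
        parts.push(start..end);
        start = end;
    }
    parts
}

/// Hypothetical helper: fetch every part on its own scoped thread and
/// concatenate the results in part order. `fetch` is an assumed stand-in
/// for a ranged read. Note that this spawns one thread per part, so it
/// does not bound concurrency or memory.
fn fetch_concurrently<F>(parts: &[Range<u64>], fetch: F) -> Vec<u8>
where
    F: Fn(Range<u64>) -> Vec<u8> + Sync,
{
    let fetch = &fetch;
    std::thread::scope(|s| {
        // Spawn all parts first so that they run concurrently.
        let handles: Vec<_> = parts
            .iter()
            .cloned()
            .map(|part| s.spawn(move || fetch(part)))
            .collect();
        // Join in order so the output bytes keep their original order.
        handles
            .into_iter()
            .flat_map(|h| h.join().expect("part fetch panicked"))
            .collect()
    })
}
```

   With this shape, a failed part could be retried on its own rather than restarting the whole download, although error handling is left out of the sketch.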
   
   It has also been suggested that we build this into ObjectStore::get 
directly, perhaps as a wrapper, but I'm sceptical that this can be done 
completely transparently. It'll be a complex tradeoff between memory, network 
throughput, and cost, with relatively limited context on which to make this 
judgement.
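
   To make the tradeoff concrete, a back-of-the-envelope sketch follows. The `PlanEstimate` type and `estimate_plan` function are hypothetical, and the sketch assumes each in-flight part is buffered in full and that the store bills per request.

```rust
/// Hypothetical summary of what a chunking plan implies.
#[derive(Debug, PartialEq)]
struct PlanEstimate {
    /// Number of ranged requests issued (relevant to per-request cost).
    requests: u64,
    /// Upper bound on bytes buffered at once, assuming each in-flight
    /// part is held in memory in full.
    peak_buffered_bytes: u64,
}

/// Estimate the request count and peak buffering for downloading
/// `object_size` bytes in parts of `part_size` bytes, with at most
/// `concurrency` parts in flight.
fn estimate_plan(object_size: u64, part_size: u64, concurrency: u64) -> PlanEstimate {
    assert!(part_size > 0 && concurrency > 0);
    let requests = object_size.div_ceil(part_size);
    let in_flight = requests.min(concurrency);
    PlanEstimate {
        requests,
        peak_buffered_bytes: in_flight.saturating_mul(part_size.min(object_size)),
    }
}
```

   For a 1 GiB object with 16 parts in flight, 8 MiB parts give 128 requests and at most 128 MiB buffered, whereas 64 MiB parts give 16 requests but up to 1 GiB buffered. Which of these is preferable depends on context that a generic get call does not have.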
   
   FYI @crepererum 

