tustvold commented on PR #2677:
URL: https://github.com/apache/arrow-datafusion/pull/2677#issuecomment-1170942273

   > I am not clear about "whereas master interleaves the IO and decoding". I
   > think master uses blocking IO, so decode must wait for IO. This patch uses
   > interleaving with async functions to reduce the blocked IO.
   
   Master interleaves IO at the page level, reading individual pages as
required and blocking the calling thread as it does so. This branch instead
performs async IO, fetching column chunks into memory without blocking
threads. This is significantly better for object stores, but will perform
"worse" for certain workloads accessing local files, where the approach on
master may be faster, albeit with the obvious drawback of blocking threads.
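
   To make the contrast concrete, here is a rough sketch of the two access
patterns. It is written in Python with asyncio purely for illustration and is
not the parquet or DataFusion API: every name in it (`read_page_blocking`,
`fetch_column_chunk`, `REQUEST_LATENCY`) is hypothetical, and the sleeps stand
in for IO latency.

```python
import asyncio
import time

REQUEST_LATENCY = 0.01  # hypothetical latency of a single IO request, in seconds


def read_page_blocking(column, page):
    """Page-level read: the calling thread is blocked until the page arrives."""
    time.sleep(REQUEST_LATENCY)  # stands in for a blocking read
    return f"{column}:{page}".encode()


def scan_blocking(columns, pages_per_column):
    """Pattern described for master: read each page only when it is needed."""
    pages = []
    for column in columns:
        for page in range(pages_per_column):
            pages.append(read_page_blocking(column, page))  # thread waits here
    return pages


async def fetch_column_chunk(column, pages_per_column):
    """Chunk-level fetch: one awaited request brings the whole chunk into memory."""
    await asyncio.sleep(REQUEST_LATENCY)  # yields to the event loop while waiting
    return [f"{column}:{page}".encode() for page in range(pages_per_column)]


async def scan_async(columns, pages_per_column):
    """Pattern described for this branch: fetch all column chunks, then decode."""
    chunks = await asyncio.gather(
        *(fetch_column_chunk(column, pages_per_column) for column in columns)
    )
    # gather() returns results in the order the awaitables were passed in.
    return [page for chunk in chunks for page in chunk]


if __name__ == "__main__":
    cols = ["a", "b"]
    assert scan_blocking(cols, 3) == asyncio.run(scan_async(cols, 3))
```

   Both functions return the same pages. The difference is that the blocking
variant issues one request per page and holds its thread for each of them,
whereas the async variant issues one request per column chunk and leaves the
thread free while the requests are in flight.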
   
   > if we first integrated the object store abstraction into the repository.
   
   I would be fine waiting until the donation to arrow-rs goes through 
(https://github.com/influxdata/object_store_rs/issues/41) but I had hoped that 
given this intent had been clearly broadcast, rather than waiting the 3 or so 
weeks it will take to go through this process, we could just get this in. What 
do you think?


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
