[GitHub] [arrow] westonpace commented on pull request #11486: ARROW-12683 [C++] Enable fine-grained I/O (coalescing) in IPC reader

GitBox Thu, 28 Oct 2021 19:17:25 -0700


westonpace commented on pull request #11486:
URL: https://github.com/apache/arrow/pull/11486#issuecomment-954355732



   > ArrayLoader involves quite a lot arrow structures, and I am not familiar 
with some of them, so I try to follow current organization to make it work so 
far.
   
   Ok.  That is fine.  Thank you for considering.
   
   > I think probably we can close ARROW-12683 and I will create a JIRA issue 
to track the async version of the reader enhancement as follow-up. What do you 
think?
   
   Sounds great.
   
   > In my test under Linux, I found Linux will do read ahead IO...
   
   I did some testing with `POSIX_FADV_WILLNEED` and never saw much benefit 
over Linux's built-in readahead.
   
   > I don't look into how S3FileSystem handles this
   
   It does not currently handle this.  The IPC reader performs poorly on S3 
because there is no readahead or batching and each request has high latency.  
Handling this at the filesystem level is an interesting thought.  The challenge 
is that the filesystem is parallel, so sometimes we want to allow multiple 
concurrent reads (instead of queuing and plugging/merging), but the filesystem 
doesn't know the access pattern.  Maybe we can still come up with a good 
strategy.  ARROW-14429 already tracks this, so there is no need to solve it 
right now.
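
To illustrate the "plugging/merging" idea, here is a rough sketch of merging nearby byte ranges into fewer, larger requests. It is an assumption-laden illustration, not Arrow's implementation: `ByteRange`, `CoalesceRanges`, and the `hole_size_limit` parameter are names made up for this example.

```cpp
#include <algorithm>
#include <cstdint>
#include <vector>

// Hypothetical range type for this sketch (not Arrow's own type).
struct ByteRange {
  int64_t offset;
  int64_t length;
};

// Merge ranges that overlap or whose gap is at most `hole_size_limit` bytes,
// so that several small reads become one larger request. Reading a small
// unused hole is usually cheaper than paying another request's latency.
std::vector<ByteRange> CoalesceRanges(std::vector<ByteRange> ranges,
                                      int64_t hole_size_limit) {
  std::sort(ranges.begin(), ranges.end(),
            [](const ByteRange& a, const ByteRange& b) {
              return a.offset < b.offset;
            });
  std::vector<ByteRange> merged;
  for (const ByteRange& r : ranges) {
    if (!merged.empty()) {
      ByteRange& last = merged.back();
      const int64_t last_end = last.offset + last.length;
      if (r.offset - last_end <= hole_size_limit) {
        // Close enough: extend the previous range to cover this one.
        const int64_t new_end = std::max(last_end, r.offset + r.length);
        last.length = new_end - last.offset;
        continue;
      }
    }
    merged.push_back(r);
  }
  return merged;
}
```

With a 64-byte hole limit, the ranges `{0, 100}`, `{110, 50}`, and `{10000, 10}` collapse into two requests. A filesystem-level version would also need to decide when not to merge so that independent reads can still run in parallel, which is the open question above.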


