niyue opened a new pull request #11486:
URL: https://github.com/apache/arrow/pull/11486


   This PR tries to fix https://issues.apache.org/jira/browse/ARROW-12683 
([C++] Enable fine-grained I/O (coalescing) in IPC reader)
   
   This is my first PR for Arrow, so please bear with me and point out any 
problems with code format, conventions, etc. 
   I probably picked the wrong issue for a first contribution: after 
investigating it for a while, I realized it is more difficult than I 
expected :(
   
   For now I chose an approach that reuses as much of the existing 
`ArrayLoader` code as possible. To do that, I use a no-op random access file 
to record the I/O, then replay only the necessary read operations later. I am 
not certain this is the best approach for solving this issue. If this kind of 
approach doesn't fit, feel free to reject this PR, and please let me know how 
it should be done so I can give it another try.
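
   To illustrate the record-and-replay idea described above, here is a 
minimal standalone sketch. It is not the code in this PR and it does not use 
the Arrow API: `ReadRange`, `RecordingFile`, `Coalesce`, and `Replay` are 
hypothetical names, and the "file" is just an in-memory string. The no-op 
file only remembers which byte ranges were requested; the recorded ranges are 
then merged and read for real in a second pass.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical type: one requested byte range.
struct ReadRange {
  int64_t offset;
  int64_t length;
};

// Hypothetical "no-op" file: ReadAt performs no I/O, it only records the
// range that the loader asked for. Zero-length requests are ignored.
class RecordingFile {
 public:
  void ReadAt(int64_t offset, int64_t length) {
    assert(offset >= 0 && length >= 0);
    if (length > 0) ranges_.push_back({offset, length});
  }
  const std::vector<ReadRange>& ranges() const { return ranges_; }

 private:
  std::vector<ReadRange> ranges_;
};

// Sort the recorded ranges and merge those that overlap or whose gap is at
// most `hole_limit` bytes, so that the replay issues fewer, larger reads.
std::vector<ReadRange> Coalesce(std::vector<ReadRange> ranges,
                                int64_t hole_limit) {
  std::sort(ranges.begin(), ranges.end(),
            [](const ReadRange& a, const ReadRange& b) {
              return a.offset < b.offset;
            });
  std::vector<ReadRange> merged;
  for (const ReadRange& r : ranges) {
    if (!merged.empty() &&
        r.offset <= merged.back().offset + merged.back().length + hole_limit) {
      const int64_t end = std::max(merged.back().offset + merged.back().length,
                                   r.offset + r.length);
      merged.back().length = end - merged.back().offset;
    } else {
      merged.push_back(r);
    }
  }
  return merged;
}

// Replay: perform the real reads. Here the "file" is an in-memory string.
std::vector<std::string> Replay(const std::string& file,
                                const std::vector<ReadRange>& ranges) {
  std::vector<std::string> buffers;
  for (const ReadRange& r : ranges) {
    assert(r.offset + r.length <= static_cast<int64_t>(file.size()));
    buffers.push_back(file.substr(static_cast<std::size_t>(r.offset),
                                  static_cast<std::size_t>(r.length)));
  }
  return buffers;
}
```

   In this sketch, the loader would first run against a `RecordingFile` for 
the selected fields only, and the real file would then be read once per range 
returned by `Coalesce(recorder.ranges(), hole_limit)`. The `hole_limit` 
parameter is an assumption of the sketch: a larger value trades some extra 
bytes read for fewer read calls.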
   
   Besides passing the unit tests, I manually verified the I/O behavior on 
Linux by watching which pages of the file were loaded into the page cache. It 
works largely as I expected, and the I/O savings vary depending on which 
field is accessed.
   
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

