tustvold commented on issue #6946:
URL: https://github.com/apache/arrow-rs/issues/6946#issuecomment-2574893357

   I think this boils down to what the characteristics of the backing store are:
   
   **Memory**
   
   Data is already buffered in memory, so it doesn't matter what approach we use.
   
   **Object Storage**
   
   Object stores do not support true vectorized I/O; that is, you can't fetch 
disjoint ranges in a single request. This means that fetching 5 pages from each 
column, as in the intermediate buffering approach, will actually perform one 
request per column. _Unless the column chunk is small, in which case it may just 
fetch the entire column chunk and throw the extra pages away, which is 
even worse._ These requests are made in parallel, but this still has 
potentially significant cost implications.
   
   I'd posit that the current approach is the right one for object stores.
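   The coalescing tradeoff above can be sketched with a toy range-merging function. This is illustrative only; the function name, gap threshold, and byte offsets are assumptions for the example, not the actual arrow-rs implementation. Pages within one column chunk sit close together and coalesce into a single fetch, while pages in different column chunks are far apart and each require their own request:

   ```rust
   use std::ops::Range;

   /// Coalesce sorted byte ranges whose gaps are below a threshold into
   /// single fetches, mirroring the kind of IO coalescing a reader might do.
   /// Hypothetical sketch: names and threshold are not from arrow-rs.
   fn coalesce_ranges(ranges: &[Range<u64>], max_gap: u64) -> Vec<Range<u64>> {
       let mut out: Vec<Range<u64>> = Vec::new();
       for r in ranges {
           match out.last_mut() {
               // Merge when the gap to the previous range is small enough.
               Some(last) if r.start.saturating_sub(last.end) <= max_gap => {
                   last.end = last.end.max(r.end);
               }
               _ => out.push(r.clone()),
           }
       }
       out
   }

   fn main() {
       // Five pages within one column chunk: small gaps, one request.
       let pages = [0..100, 100..200, 210..300, 300..400, 410..500];
       assert_eq!(coalesce_ranges(&pages, 1024).len(), 1);

       // Pages from two widely separated column chunks: two requests.
       let two_columns = [0..500, 1_000_000..1_000_500];
       assert_eq!(coalesce_ranges(&two_columns, 1024).len(), 2);
   }
   ```

   Buffering whole column chunks, as the current approach does, keeps ranges contiguous and lets this kind of coalescing collapse them into few requests.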
   
   **Local SSD**
   
   Local SSD is the option where this gets potentially interesting, as 
something closer to the minimize-buffering approach becomes viable. IMO this is 
where something like https://github.com/apache/arrow-rs/issues/5522 becomes 
more relevant, especially for those wanting to use something like io_uring.
   
   **Alternatives**
   
   A simpler option might just be to write files with smaller row groups; this 
effectively "materializes" the intermediate buffering approach into the file, 
with the added benefit that it doesn't break IO coalescing. This is 
effectively the observation made by file formats like Lance when they got rid 
of the column chunk.
   
   There is a slight real difference between one row group with 50 pages and 5 
row groups with 10 pages each concerning dictionary pages, but given what I 
guess is motivating this request, this may be the better option.
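   
   To make the dictionary-page difference concrete, here is a back-of-envelope sketch. The column count is a made-up number for illustration; the point is only that each column chunk carries its own dictionary page, so splitting one row group into five repeats every dictionary five times:

   ```rust
   fn main() {
       // Hypothetical file with 10 columns (assumed figure, not from the issue).
       let columns: u64 = 10;

       // One row group: one dictionary page per column chunk.
       let dict_pages_single = columns * 1;

       // Five row groups: the dictionary is re-written in every column chunk.
       let dict_pages_split = columns * 5;

       assert_eq!(dict_pages_single, 10);
       assert_eq!(dict_pages_split, 50);
   }
   ```

   So the smaller-row-group layout trades some extra dictionary overhead (and potentially worse dictionary encoding) for finer-grained IO.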
   
   
   
   

