alamb commented on PR #9697:
URL: https://github.com/apache/arrow-rs/pull/9697#issuecomment-4260500992

   To summarize your use case, @HippoBaro, in my own words:
   
     1. You want to control prefetching / coalescing of IO to optimize for your 
environment, especially object-store
     access.
     2. You want to know when previously prefetched data will no longer be 
needed so its resources can be freed.
     3. For very wide schemas, you want buffer management to scale efficiently, 
not quadratically with the number of
     columns.
   
     What seems hard about the current setup is:
   
     1. The push decoder knows which ranges it explicitly requested, so it can 
reason about those.
     2. If an external system pushes data at a different granularity, such as 
speculative prefetch or coalesced
     reads, the decoder does not really know what data may still be needed 
later versus what can safely be released.
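
   To make the granularity mismatch concrete, here is a minimal sketch using only the standard library. `PushedBuffer`, `mark_consumed`, and `unaccounted_bytes` are hypothetical names for illustration and are not part of arrow-rs. It assumes an external prefetcher pushed one coalesced buffer, while the decoder only ever asked for two small ranges inside it.

```rust
use std::ops::Range;

/// Hypothetical caller-side record of one buffer pushed to a decoder.
/// Not an arrow-rs type; it only illustrates the bookkeeping problem.
struct PushedBuffer {
    /// File byte range covered by the pushed (possibly coalesced) buffer.
    range: Range<u64>,
    /// Sub-ranges the decoder explicitly requested and has finished with.
    consumed: Vec<Range<u64>>,
}

impl PushedBuffer {
    fn new(range: Range<u64>) -> Self {
        Self { range, consumed: Vec::new() }
    }

    /// Record that the decoder is done with `r`, clamped to this buffer.
    fn mark_consumed(&mut self, r: Range<u64>) {
        let start = r.start.max(self.range.start);
        let end = r.end.min(self.range.end);
        if start < end {
            self.consumed.push(start..end);
        }
    }

    /// Bytes in the buffer that no consumed range covers. The decoder cannot
    /// tell whether these are still needed later or can be released.
    fn unaccounted_bytes(&self) -> u64 {
        let mut sorted = self.consumed.clone();
        sorted.sort_by_key(|r| r.start);
        let mut covered = 0u64;
        let mut cursor = self.range.start;
        for r in sorted {
            // Skip any part already counted by an earlier, overlapping range.
            let start = r.start.max(cursor);
            if r.end > start {
                covered += r.end - start;
                cursor = r.end;
            }
        }
        (self.range.end - self.range.start) - covered
    }
}

fn main() {
    // The prefetcher pushed bytes 0..1000 as a single coalesced read.
    let mut buf = PushedBuffer::new(0..1000);
    // The decoder only ever requested these two ranges.
    buf.mark_consumed(100..200);
    buf.mark_consumed(300..400);
    assert_eq!(buf.unaccounted_bytes(), 800);
}
```

   In this sketch, 800 of the 1000 pushed bytes have no owner from the decoder's point of view, which is the part a policy layer would have to reason about.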
   
     My main concern with this PR is that it adds a specific IO buffer 
management policy for one usage pattern into `ParquetPushDecoder`. That may 
work well for your object-store case, but it may not generalize as well to 
other environments or IO strategies (I am thinking of io_uring, for example).
   
   Thus my preference would be to keep buffer management policy above 
`ParquetPushDecoder`, and make sure arrow-rs exposes the primitives needed to 
support it.
   
     For example:
   
     ```rust
     let mut decoder = make_decoder();
   
     // Plan and prefetch what is likely needed for row group 1
     let row_group1_bytes = /* calculate likely bytes needed for row group 1 */;
     let prefetched_data = fancy_prefetcher.get(row_group1_bytes);
     decoder.push_data(prefetched_data);
   
     decoder.decode(); // ideally does not need more data until row group 2
   
     // Once row group 1 is fully consumed, release any staged buffered data
     decoder.release_all();
   
     // Move on to row group 2
     let row_group2_bytes = /* calculate likely bytes needed for row group 2 */;
     let prefetched_data = fancy_prefetcher.get(row_group2_bytes);
     decoder.push_data(prefetched_data);
   
     decoder.decode(); // ideally does not need more data until row group 3
   ```
   
     From that perspective, I think we are close to having the right pieces 
already. What still seems missing is:
   
     1. A reliable way to know when the push decoder has consumed everything it 
will need from previously pushed data.
     2. (Maybe) an easier API to calculate likely byte ranges for a given set 
of row groups / columns. I think it can be derived from the metadata APIs.
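
   As a rough sketch of item 2, the likely byte ranges for a projection could be derived from per-column-chunk offsets and lengths and then coalesced. The types below (`ColumnChunk`, `RowGroup`) and the function `likely_ranges` are hypothetical stand-ins so the example is self-contained; in the parquet crate I believe the corresponding information comes from `ColumnChunkMetaData::byte_range()`, but treat that mapping as an assumption. The `max_gap` parameter is also an assumption: a knob for how far apart two ranges may be and still be merged into one read.

```rust
use std::ops::Range;

/// Hypothetical stand-in for column chunk metadata: where the chunk's bytes
/// live in the file.
#[derive(Clone, Copy)]
struct ColumnChunk {
    offset: u64,
    length: u64,
}

/// Hypothetical stand-in for row group metadata.
struct RowGroup {
    columns: Vec<ColumnChunk>,
}

/// Byte ranges likely needed to decode the projected columns of `row_group`,
/// with ranges closer than `max_gap` bytes merged into one read.
fn likely_ranges(row_group: &RowGroup, projection: &[usize], max_gap: u64) -> Vec<Range<u64>> {
    // Collect one range per projected column, ignoring out-of-bounds indices.
    let mut ranges: Vec<Range<u64>> = projection
        .iter()
        .filter_map(|&i| row_group.columns.get(i))
        .map(|c| c.offset..c.offset + c.length)
        .collect();
    ranges.sort_by_key(|r| r.start);

    // Merge neighbours whose gap is at most `max_gap` bytes.
    let mut merged: Vec<Range<u64>> = Vec::new();
    for r in ranges {
        if let Some(last) = merged.last_mut() {
            if r.start <= last.end.saturating_add(max_gap) {
                last.end = last.end.max(r.end);
                continue;
            }
        }
        merged.push(r);
    }
    merged
}

fn main() {
    let row_group = RowGroup {
        columns: vec![
            ColumnChunk { offset: 4, length: 100 },
            ColumnChunk { offset: 104, length: 50 },
            ColumnChunk { offset: 1000, length: 200 },
            ColumnChunk { offset: 5000, length: 10 },
        ],
    };
    // Adjacent chunks merge even with no gap tolerance.
    assert_eq!(likely_ranges(&row_group, &[0, 1, 2], 0), vec![4..154, 1000..1200]);
    // A larger tolerance trades extra bytes for fewer requests.
    assert_eq!(likely_ranges(&row_group, &[0, 1, 2], 1000), vec![4..1200]);
}
```

   Choosing `max_gap` is exactly the kind of environment-specific policy (object store versus local disk versus io_uring) that would live above the decoder in this design.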


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

