friendlymatthew commented on issue #8668:
URL: https://github.com/apache/arrow-rs/issues/8668#issuecomment-3423840709

   Hi hi, I think the push-based decoder looks super cool. I had a couple of 
questions: 
   
   1. Peek behavior
   I'm trying to understand when `peek()` should return ranges. Should it only 
return ranges when the decoder is in `WaitingOnFilterData` or `WaitingOnData` 
states? For other states (like `Start`, `Filters`, or `StartData`), it would 
return an empty iterator, because determining the needed ranges would require 
running the state machine.
   
   This form of behavior would look something like: 
   
   ```rs
   loop {
       // Peek returns ranges only when decoder is blocked waiting for data
       let ranges = decoder.peek().take(32).collect::<Vec<_>>();
   
       if !ranges.is_empty() {
           fetch_and_push(ranges);
       }
   
       match decoder.try_decode() {
           DecodeResult::NeedsData(_) => { /* already handled above */ }
           DecodeResult::Data(batch) => process(batch),
           DecodeResult::Finished => break,
       }
   }
   ```
   
   Or should `peek()` try to look ahead and predict future ranges (which would 
require simulating state transitions)?
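
   For reference, the first option (ranges only while blocked) could be sketched 
roughly as below. This is only an illustration: `DecoderState` and its `needed` 
payloads are hypothetical stand-ins, and only the variant names are taken from 
the states mentioned above. The real decoder state may look quite different.

```rs
use std::ops::Range;

// Hypothetical stand-in for the decoder's internal state. Only the variant
// names come from the discussion; the `needed` payloads are assumptions.
enum DecoderState {
    Start,
    Filters,
    WaitingOnFilterData { needed: Vec<Range<u64>> },
    StartData,
    WaitingOnData { needed: Vec<Range<u64>> },
}

impl DecoderState {
    // Yields the outstanding ranges only in the two "waiting" states.
    // Every other state yields nothing, since answering there would mean
    // advancing (or simulating) the state machine.
    fn peek(&self) -> impl Iterator<Item = Range<u64>> + '_ {
        let needed: &[Range<u64>] = match self {
            DecoderState::WaitingOnFilterData { needed }
            | DecoderState::WaitingOnData { needed } => needed.as_slice(),
            DecoderState::Start
            | DecoderState::Filters
            | DecoderState::StartData => &[],
        };
        needed.iter().cloned()
    }
}
```

   With this shape, `peek()` stays a cheap `&self` read and never mutates the 
decoder, at the cost of not being able to prefetch across state transitions.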
   
   
   2. API simplification
   
   `PeekResult` has 2 variants (`Range(Range<u64>)` and `End`), which is 
basically reinventing `Option<Range<u64>>`. I wonder if we should just simplify 
`peek` to: 
   
   ```rs
   fn peek(&self) -> impl Iterator<Item = Range<u64>>
   ```
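
   To make the equivalence concrete, here is a small sketch. The `PeekResult` 
definition is reconstructed from the description above rather than copied from 
the PR, and `ranges` is a hypothetical helper that only exists to show the 
mapping onto an iterator.

```rs
use std::ops::Range;

// `PeekResult` as described above (assumed shape, not the actual definition).
#[derive(Debug, PartialEq)]
enum PeekResult {
    Range(Range<u64>),
    End,
}

// The two variants map one-to-one onto `Some` / `None`.
impl From<PeekResult> for Option<Range<u64>> {
    fn from(result: PeekResult) -> Self {
        match result {
            PeekResult::Range(range) => Some(range),
            PeekResult::End => None,
        }
    }
}

// Hypothetical adapter: wraps any "give me the next PeekResult" closure
// into a plain iterator that stops at the first `End`.
fn ranges(mut next: impl FnMut() -> PeekResult) -> impl Iterator<Item = Range<u64>> {
    std::iter::from_fn(move || {
        let item: Option<Range<u64>> = next().into();
        item
    })
}
```

   Since the conversion loses no information, callers get `take`, `collect`, 
and the rest of the iterator adapters for free.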

