zhuqi-lucas commented on code in PR #10158:
URL: https://github.com/apache/arrow-rs/pull/10158#discussion_r3457002178
##########
parquet/src/arrow/push_decoder/remaining.rs:
##########
@@ -93,6 +93,63 @@ impl RowGroupFrontier {
self.budget = budget;
}
+ /// Peek at the next row-group index `next_readable_row_group` would
+ /// hand out, without mutating any state. Returns `None` if every
+ /// remaining row group would be skipped under the current
+ /// selection/budget, or if the queue is empty.
+ ///
+ /// Mirrors the structure of `next_readable_row_group` but only walks
+ /// borrowed state — used by
+ /// [`super::ParquetPushDecoder::peek_next_row_group`]
+ /// to let adaptive callers (e.g. dynamic row-group pruners or per-RG
+ /// `RowFilter` toggles) keep their per-RG state in lock-step with
+ /// the reader the decoder is about to emit.
+ fn peek_next_row_group(&self) -> Result<Option<usize>, ParquetError> {
+ // Short-circuit: budget exhausted or selection drained ⇒ same
+ // outcome as `next_readable_row_group`'s early return.
+ if self.budget.is_exhausted()
+ || self
+ .selection
+ .as_ref()
+ .is_some_and(|selection| selection.row_count() == 0)
+ {
+ return Ok(None);
+ }
+
+ // We may have to walk past row groups whose split selection is
+ // empty. Cloning the selection lets us run the same `split_off`
+ // logic without disturbing the real one.
+ let mut selection = self.selection.clone();
Review Comment:
Good point!