alamb commented on code in PR #9118:
URL: https://github.com/apache/arrow-rs/pull/9118#discussion_r2729157033
##########
parquet/src/arrow/arrow_reader/selection.rs:
##########
@@ -800,10 +806,19 @@ impl MaskCursor {
let mut chunk_rows = 0;
let mut selected_rows = 0;
- // Advance until enough rows have been selected to satisfy the batch size,
- // or until the mask is exhausted. This mirrors the behaviour of the legacy
- // `RowSelector` queue-based iteration.
- while cursor < mask.len() && selected_rows < batch_size {
+ let max_chunk_rows = page_boundaries
Review Comment:
Since the boundaries are all sorted, we should be able to avoid this
sort/partition point...
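
A rough sketch of one way this could look, not the PR's actual code. It assumes `page_boundaries` is a sorted slice of row offsets and that the mask cursor only moves forward. `BoundaryCursor` and `rows_until_next_boundary` are hypothetical names made up for illustration. Instead of a binary search on every call, the cursor remembers the index of the next boundary and only ever advances it:

```rust
/// Hypothetical helper (not part of arrow-rs): walks a sorted list of
/// boundaries with an index that only moves forward.
struct BoundaryCursor<'a> {
    /// Assumed to be sorted in ascending order.
    boundaries: &'a [usize],
    /// Index of the first boundary that may still lie ahead of the position.
    next: usize,
}

impl<'a> BoundaryCursor<'a> {
    fn new(boundaries: &'a [usize]) -> Self {
        Self { boundaries, next: 0 }
    }

    /// Returns the number of rows between `position` and the first boundary
    /// strictly greater than `position`, or `None` if no boundary remains.
    ///
    /// `position` must be non-decreasing across calls; that is what lets the
    /// index advance linearly instead of searching from scratch each time.
    fn rows_until_next_boundary(&mut self, position: usize) -> Option<usize> {
        // Skip every boundary at or before the current position.
        while self.next < self.boundaries.len() && self.boundaries[self.next] <= position {
            self.next += 1;
        }
        self.boundaries.get(self.next).map(|b| b - position)
    }
}
```

Across a whole scan this does O(number of boundaries) work in total, and it lands on the same index as `boundaries.partition_point(|&b| b <= position)` would for each position.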