alamb commented on code in PR #5848:
URL: https://github.com/apache/arrow-rs/pull/5848#discussion_r1629275132
##########
parquet/src/arrow/arrow_reader/selection.rs:
##########
@@ -111,13 +118,13 @@ impl RowSelection {
SlicesIterator::new(filter).map(move |(start, end)| start +
offset..end + offset)
});
- Self::from_consecutive_ranges(iter, total_rows)
+ Self::from_consecutive_ranges(iter, Some(total_rows))
}
/// Creates a [`RowSelection`] from an iterator of consecutive ranges to
keep
- pub(crate) fn from_consecutive_ranges<I: Iterator<Item = Range<usize>>>(
+ pub fn from_consecutive_ranges<I: Iterator<Item = Range<usize>>>(
ranges: I,
- total_rows: usize,
+ total_rows_opt: Option<usize>,
Review Comment:
I personally think that making `total_rows` optional will make it very hard
to use this API correctly. Specifically, the `total_rows1 almost always needs
to be necessary, otherwise it will not be possible to represent skipping the
last rows in the selection
For example, a range of `100..200` for `400` total rows needs to look ike
```
Skip(100)
Scan(100)
Skip(100). <--- you can't figure this out from the ranges, you need the
total row count
```
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]