yjshen commented on a change in pull request #1389:
URL: https://github.com/apache/arrow-rs/pull/1389#discussion_r820164755
##########
File path: parquet/src/file/serialized_reader.rs
##########
@@ -127,6 +127,56 @@ pub struct SerializedFileReader<R: ChunkReader> {
     metadata: ParquetMetaData,
 }
+/// A builder for [`ReadOptions`].
+/// For the predicates that are added to the builder,
+/// they will be chained using 'AND' to filter the row groups.
+pub struct ReadOptionsBuilder {
+    predicates: Vec<Box<dyn FnMut(&RowGroupMetaData, usize) -> bool>>,
+}
+
+impl ReadOptionsBuilder {
+    /// New builder
+    pub fn new() -> Self {
+        ReadOptionsBuilder { predicates: vec![] }
+    }
+
+    /// Add a predicate on row group metadata to the reading option,
+    /// Filter only row groups that match the predicate criteria
+    pub fn with_predicate(
+        mut self,
+        predicate: Box<dyn FnMut(&RowGroupMetaData, usize) -> bool>,
+    ) -> Self {
+        self.predicates.push(predicate);
+        self
+    }
+
+    /// Add a range predicate on filtering row groups if their midpoints are within the range
Review comment:
Thanks, this is important. I've updated the doc.
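
To make the semantics under discussion concrete, here is a minimal, self-contained sketch of how AND-chained row group predicates and a midpoint range filter might behave. It does not use the parquet crate: `RowGroupMeta`, `midpoint`, `range_predicate`, and `filter_row_groups` are hypothetical stand-ins, and both the midpoint formula (start offset plus half the compressed size) and the half-open interval `[start, end)` are assumptions rather than what the PR necessarily implements.

```rust
// Hypothetical stand-in for parquet's RowGroupMetaData; the fields are assumed.
struct RowGroupMeta {
    start_offset: i64,
    compressed_size: i64,
}

// Same shape as the predicate type in the diff, but over the stand-in type.
type Predicate = Box<dyn FnMut(&RowGroupMeta, usize) -> bool>;

/// Midpoint byte offset of a row group (assumed definition).
fn midpoint(rg: &RowGroupMeta) -> i64 {
    rg.start_offset + rg.compressed_size / 2
}

/// Build a predicate that keeps row groups whose midpoint lies in [start, end).
fn range_predicate(start: i64, end: i64) -> Predicate {
    assert!(start < end);
    Box::new(move |rg: &RowGroupMeta, _idx: usize| {
        let mid = midpoint(rg);
        mid >= start && mid < end
    })
}

/// Return the indices of row groups accepted by every predicate (logical AND).
fn filter_row_groups(groups: &[RowGroupMeta], predicates: &mut [Predicate]) -> Vec<usize> {
    let mut kept = Vec::new();
    for (i, rg) in groups.iter().enumerate() {
        // `all` short-circuits on the first predicate that returns false.
        if predicates.iter_mut().all(|p| p(rg, i)) {
            kept.push(i);
        }
    }
    kept
}

fn main() {
    let groups = vec![
        RowGroupMeta { start_offset: 0, compressed_size: 100 },   // midpoint 50
        RowGroupMeta { start_offset: 100, compressed_size: 100 }, // midpoint 150
        RowGroupMeta { start_offset: 200, compressed_size: 100 }, // midpoint 250
    ];
    // A second, index-based predicate to show the AND chaining.
    let skip_first: Predicate = Box::new(|_rg: &RowGroupMeta, idx: usize| idx != 0);
    let mut predicates: Vec<Predicate> = vec![range_predicate(0, 200), skip_first];
    println!("{:?}", filter_row_groups(&groups, &mut predicates));
}
```

Under these assumptions, a row group that straddles a range boundary is assigned to exactly one range, the one containing its midpoint, so adjacent byte ranges never read the same row group twice.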