yordan-pavlov commented on a change in pull request #9064: URL: https://github.com/apache/arrow/pull/9064#discussion_r552147401
##########
File path: rust/parquet/src/file/serialized_reader.rs
##########
@@ -137,6 +137,22 @@ impl<R: 'static + ChunkReader> SerializedFileReader<R> {
             metadata,
         })
     }
+
+    pub fn filter_row_groups(

Review comment:
The second option, with the `ParquetReadOptions` parameter, sounds better than the `new_with_metadata` method: it is more extensible, as you have described. However, I think it falls outside the scope of this PR.

One issue I can think of, though: the code needs to read the statistics metadata from the parquet file in order to create the statistics record batch, execute the predicate expression on it, and then use the results to filter the parquet row groups. This could still work if the parquet metadata can be read before `SerializedFileReader` is created using the proposed constructor.
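To make the ordering concern concrete, here is a minimal sketch of the flow described above: read the statistics from metadata first, evaluate the predicate on them, then keep only the row groups that may match. All types and functions here (`RowGroupStats`, `FileMetadata`, `build_stats_batch`, `eval_gt_predicate`) are hypothetical stand-ins written for illustration, not the parquet crate's API, and the predicate is assumed to be a simple `column > threshold`.

```rust
// Hypothetical stand-ins for illustration only; none of these come from the parquet crate.

/// Min/max statistics of a single column within one row group.
#[derive(Debug, Clone, Copy)]
struct RowGroupStats {
    min: i64,
    max: i64,
}

/// Assumed shape of file metadata: one statistics entry per row group.
struct FileMetadata {
    row_groups: Vec<RowGroupStats>,
}

/// Step 1: collect the statistics into columnar form
/// (a stand-in for the "statistics record batch").
fn build_stats_batch(metadata: &FileMetadata) -> (Vec<i64>, Vec<i64>) {
    let mins = metadata.row_groups.iter().map(|rg| rg.min).collect();
    let maxs = metadata.row_groups.iter().map(|rg| rg.max).collect();
    (mins, maxs)
}

/// Step 2: evaluate the predicate `column > threshold` on the statistics.
/// A row group can contain a matching row only if its max exceeds the threshold.
fn eval_gt_predicate(maxs: &[i64], threshold: i64) -> Vec<bool> {
    maxs.iter().map(|&max| max > threshold).collect()
}

/// Step 3: use the predicate results to select the row group indices to read.
fn filter_row_groups(metadata: &FileMetadata, threshold: i64) -> Vec<usize> {
    let (_mins, maxs) = build_stats_batch(metadata);
    eval_gt_predicate(&maxs, threshold)
        .into_iter()
        .enumerate()
        .filter(|(_, keep)| *keep)
        .map(|(index, _)| index)
        .collect()
}

fn main() {
    // The metadata has to be available before any reader is constructed.
    let metadata = FileMetadata {
        row_groups: vec![
            RowGroupStats { min: 0, max: 10 },
            RowGroupStats { min: 11, max: 20 },
            RowGroupStats { min: 21, max: 30 },
        ],
    };
    let kept = filter_row_groups(&metadata, 15);
    println!("{:?}", kept); // prints [1, 2]
}
```

The point of the sketch is only the ordering: `filter_row_groups` needs the metadata as input, so whatever constructor accepts the filtered result has to run after the metadata has been read.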