xudong963 commented on PR #9694: URL: https://github.com/apache/arrow-rs/pull/9694#issuecomment-4323946280
> Yeah, I was thinking maybe there was some way to take some slice of decoder rows off. However, perhaps https://docs.rs/parquet/58.0.0/parquet/arrow/push_decoder/struct.ParquetPushDecoder.html#method.try_next_reader is all we need 🤔

@alamb Yeah, I think `try_next_reader` could work for that too.

The approach I ended up with on the DF side (https://github.com/apache/datafusion/pull/21637) is slightly different: instead of using a single decoder and `try_next_reader`, I split the row groups into consecutive runs that share the same filter requirement and create a separate `ParquetPushDecoder` per run. Fully matched runs get no `RowFilter`, filtered runs get one. The stream chains them in order.

Either way, no arrow-rs API changes are needed.
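
For illustration, the run-splitting step could look roughly like the sketch below. This is not the code from the DataFusion PR: `RowGroupRun`, `split_into_runs`, and the per-row-group `needs_filter` flag are hypothetical names, and building the actual `ParquetPushDecoder` (with or without a `RowFilter`) for each run is left out.

```rust
/// Hypothetical description of one run of consecutive row groups
/// that all share the same filter requirement.
#[derive(Debug, PartialEq)]
struct RowGroupRun {
    /// Row group indices in file order.
    row_groups: Vec<usize>,
    /// `false` means the run is fully matched and needs no row filter.
    needs_filter: bool,
}

/// Split row groups into consecutive runs with the same filter requirement.
/// `needs_filter[i]` is assumed to say whether row group `i` still needs
/// the predicate evaluated (an assumption made for this sketch).
fn split_into_runs(needs_filter: &[bool]) -> Vec<RowGroupRun> {
    let mut runs: Vec<RowGroupRun> = Vec::new();
    for (idx, &flag) in needs_filter.iter().enumerate() {
        // Extend the current run when the requirement is unchanged.
        if let Some(run) = runs.last_mut() {
            if run.needs_filter == flag {
                run.row_groups.push(idx);
                continue;
            }
        }
        // Otherwise start a new run at this row group.
        runs.push(RowGroupRun {
            row_groups: vec![idx],
            needs_filter: flag,
        });
    }
    runs
}
```

Each resulting run would then map to one decoder, and the stream would drain those decoders in the order the runs are returned.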
