alamb commented on issue #18860:
URL: https://github.com/apache/datafusion/issues/18860#issuecomment-3563342071

   > > FWIW if the limit is pushed into the parquet reader, it will internally skip reading future row groups once the limit is reached. Here is some of the relevant code
   > 
   > If there's a filter, I think we still need to do row group pruning, then for the matched row groups, do row filters and get the limit rows.
   
   If there is a filter applied in the scan (via `pushdown_filters`), the 
parquet reader will stop (and not fetch any more row groups) once the limit is 
hit. 
   
   
https://github.com/apache/arrow-rs/blob/ed9efe78e4cc958cc96707557818e754419debb0/parquet/src/arrow/push_decoder/reader_builder/mod.rs#L504-L518
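
To make the behavior concrete, here is a minimal toy model in Python. It is not the arrow-rs code linked above: `RowGroup` and `scan_with_limit` are hypothetical names, and row groups are plain lists of integers. It only illustrates the idea that, with both a filter and a limit applied in the scan, later row groups are never fetched once enough matching rows have been produced.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple


@dataclass
class RowGroup:
    """Hypothetical stand-in for a parquet row group: just a list of values."""
    rows: List[int]


def scan_with_limit(
    row_groups: List[RowGroup],
    predicate: Callable[[int], bool],
    limit: int,
) -> Tuple[List[int], int]:
    """Return (matching rows, number of row groups fetched)."""
    out: List[int] = []
    fetched = 0
    for rg in row_groups:
        if len(out) >= limit:
            break  # limit already reached: do not fetch this or later row groups
        fetched += 1  # "fetching" a row group is modeled as a counter
        for value in rg.rows:
            if predicate(value):  # row filter applied inside the scan
                out.append(value)
                if len(out) == limit:
                    break
    return out, fetched


groups = [RowGroup([1, 8, 9]), RowGroup([7, 2, 6]), RowGroup([10, 11, 12])]
rows, fetched = scan_with_limit(groups, lambda v: v > 5, limit=3)
print(rows, fetched)
```

In this toy run the third row group is never touched, because the limit of 3 matching rows is satisfied partway through the second one.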
   
   I am probably not understanding what you are proposing. I'll try and read 
the PR
   
   > 
   > > What does "fully matches" / "partially matched" mean in this case? Does that mean all the rows in the row groups would be filtered?
   
   So fully matches means there are no rows that are filtered out -- makes sense.
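
As a rough illustration of that distinction, the sketch below classifies row groups against a predicate of the form `value > threshold` using only min/max statistics. This is an assumption about how such a classification could work, not code from the PR: `RowGroupStats` and `classify` are hypothetical names, and nulls are ignored.

```python
from dataclasses import dataclass


@dataclass
class RowGroupStats:
    """Hypothetical min/max statistics for one column of a row group."""
    min_value: int
    max_value: int


def classify(stats: RowGroupStats, threshold: int) -> str:
    """Classify a row group against the predicate `value > threshold`."""
    if stats.max_value <= threshold:
        return "pruned"  # no row can pass the filter
    if stats.min_value > threshold:
        return "fully matched"  # every row passes, none are filtered out
    return "partially matched"  # some rows may pass, a row filter is still needed


labels = [
    classify(s, 5)
    for s in (RowGroupStats(0, 5), RowGroupStats(6, 9), RowGroupStats(3, 8))
]
print(labels)
```

Under these assumptions, a fully matched row group's row count can be counted toward the limit without evaluating the filter row by row.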


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

