adriangb commented on issue #20324:
URL: https://github.com/apache/datafusion/issues/20324#issuecomment-3905154744

   I do get your point. TPCH / TPCDS will essentially not use late 
materialization / `RowFilter` because, as you say, all files are opened at 
once.
   
   > Because a disabled filter now always returns "true" it scans the column 
while no longer contributing to making the selection smaller
   
   I assume by disabled filter you mean the cases where we completely discard 
a `DynamicFilterPhysicalExpr`? I don’t think those should evaluate to `true`; 
ideally they should be removed completely. 
https://github.com/apache/datafusion/pull/20160 does not do that (as you say, 
it returns `true`; side note: I wonder if we can optimize all-true / all-false 
masks). I am trying to address that in 
https://github.com/apache/datafusion/pull/20363, which essentially implements 
the suggestion above: “It might be possible to merge the efforts if we e.g. 
add `PhysicalExpr::is_discardable_filter() -> bool` or something, then the 
more general adaptive selectivity machinery can choose to discard the filter 
instead of just putting it last”
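
To make the quoted suggestion concrete, here is a minimal sketch of how a discardable-filter hook could let adaptive selectivity logic drop a filter instead of keeping it as an always-`true` predicate. Everything in it is hypothetical and for illustration only: `FilterExpr` is a stand-in trait rather than DataFusion's `PhysicalExpr`, and `Tracked`, `pass_fraction`, `prune_filters` and the threshold are assumed names, not code from either PR.

```rust
/// Hypothetical stand-in for a physical filter expression.
/// This is NOT DataFusion's `PhysicalExpr` trait.
trait FilterExpr {
    fn name(&self) -> &str;

    /// Hypothetical hook: returns true when dropping this filter cannot
    /// change query results (it is only an optimization). Defaults to false.
    fn is_discardable_filter(&self) -> bool {
        false
    }
}

/// An ordinary predicate from the query; required for correctness.
struct StaticPredicate {
    name: String,
}

/// A dynamic filter (e.g. pushed down from a join or TopK); optional.
struct DynamicFilter {
    name: String,
}

impl FilterExpr for StaticPredicate {
    fn name(&self) -> &str {
        &self.name
    }
}

impl FilterExpr for DynamicFilter {
    fn name(&self) -> &str {
        &self.name
    }

    fn is_discardable_filter(&self) -> bool {
        true
    }
}

/// A filter together with its observed selectivity (assumed bookkeeping).
struct Tracked {
    expr: Box<dyn FilterExpr>,
    /// Fraction of rows that passed the filter so far, in [0.0, 1.0].
    pass_fraction: f64,
}

/// Remove filters that are both discardable and nearly non-selective,
/// so their columns are no longer scanned just to produce an all-true mask.
/// Non-discardable filters are always kept, whatever their selectivity.
fn prune_filters(filters: Vec<Tracked>, threshold: f64) -> Vec<Tracked> {
    filters
        .into_iter()
        .filter(|f| !(f.expr.is_discardable_filter() && f.pass_fraction >= threshold))
        .collect()
}
```

With a threshold of 0.95, a dynamic filter that lets 99% of rows through would be dropped, while a static predicate with the same pass rate would be kept because removing it could change results.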


