notashes commented on issue #20324:
URL: https://github.com/apache/datafusion/issues/20324#issuecomment-3914037886

   > I am wondering if for those queries, a conservative heuristic would be 
also to always put dynamic filters after the static filters (regardless of 
column size), so the overhead of pushing down bad dynamic filters won't be as 
bad. It might regress some good TopK predicates though.
   
   I actually worked out a solution: instead of putting dynamic filters 
after static ones, we can simply defer the expensive static string predicate 
out of `RowFilter` entirely. That also avoids regressing good TopK predicates. 
The dynamic filter stays, converges on the cheap `EventTime` column, and gets 
to prune rows.
   
   It's a pretty narrow heuristic, in that it only defers `col != literal` on 
`string/binary` columns, but it shows an improvement over the baseline with 
pushdown both off and on. 
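   
   A rough sketch of the heuristic, to make the idea concrete. The types and 
function names here (`Predicate`, `should_defer`, `split_predicates`) and the 
column name `some_string_col` are hypothetical and simplified; this is not the 
code in #20413 and not DataFusion's actual expression API:

```rust
// Hypothetical, simplified model of pushed-down predicates. These types are
// illustrative only; they are NOT DataFusion's real expression types.
#[allow(dead_code)]
#[derive(Debug, Clone, PartialEq)]
enum ColType { Int64, Timestamp, Utf8, Binary }

#[allow(dead_code)]
#[derive(Debug, Clone, PartialEq)]
enum Op { Eq, NotEq, Lt, Gt }

#[allow(dead_code)]
#[derive(Debug, Clone, PartialEq)]
enum Predicate {
    // A static `column <op> literal` predicate known at planning time.
    Static { column: String, col_type: ColType, op: Op },
    // A dynamic filter (e.g. produced by TopK) that tightens during execution.
    Dynamic { column: String },
}

// The narrow rule: defer only `col != literal` on string/binary columns.
fn should_defer(p: &Predicate) -> bool {
    matches!(
        p,
        Predicate::Static {
            col_type: ColType::Utf8 | ColType::Binary,
            op: Op::NotEq,
            ..
        }
    )
}

// Split predicates into (kept in the row filter, deferred until after it).
// Relative order within each group is preserved.
fn split_predicates(preds: Vec<Predicate>) -> (Vec<Predicate>, Vec<Predicate>) {
    let (deferred, row_filter): (Vec<_>, Vec<_>) =
        preds.into_iter().partition(should_defer);
    (row_filter, deferred)
}
```

   With a `some_string_col != literal` predicate plus a dynamic filter on 
`EventTime`, only the dynamic filter would stay in the row filter under this 
sketch, and the string comparison would be evaluated afterwards on the rows 
that survive.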
   
   Q24: baseline off `~0.266s`, baseline on `~0.299s` (the regression), with my 
patch `~0.197s`, i.e. roughly 26% less time than pushdown off and 34% less 
than pushdown on. 
   
   @Dandandan, I would appreciate your thoughts on this: #20413 


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]
