adriangb commented on issue #11262:
URL: https://github.com/apache/datafusion/issues/11262#issuecomment-4481598087

   @neilconway I wonder what you were thinking of tackling?
   
   IMO the version of this that #22144 is trying to tackle needs to live 
outside the expression itself. It is much more tightly coupled to the Parquet 
scan, because we aren't just deciding the order in which filters are evaluated 
but also where the IO happens (eager vs. late materialization).
   
   It could still make sense to have `BinaryExpr` do reordering itself. For 
example, we are not going to split / track `id = 1 OR message ilike '%foo%'` as 
separate filters during scans. And there are still `FilterExec`s, join 
conditions, and other places where binary expressions are used.
   
   I was also wondering if a "flattened" binary expression that can do a better 
job of re-using buffers, etc. would make sense. Something like that seems 
necessary to do any sort of re-ordering sanely (otherwise you have to rebuild 
the expression tree, which would be hard once the query is executing).
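
   To make the "flattened" idea concrete, here is a minimal sketch, not 
DataFusion code. `Conjunct`, `FlatAnd`, `evaluate`, and `reorder` are all 
hypothetical names, and plain `i64` slices stand in for Arrow arrays. It only 
illustrates that a flat list of conjuncts can be re-sorted in place while 
sharing one mask buffer, with no tree rebuild.

```rust
/// One conjunct of a flattened `a AND b AND c`, plus running statistics.
/// Hypothetical type for illustration; not part of DataFusion.
struct Conjunct {
    name: &'static str,
    predicate: fn(i64) -> bool,
    rows_seen: u64,
    rows_passed: u64,
}

impl Conjunct {
    fn new(name: &'static str, predicate: fn(i64) -> bool) -> Self {
        Conjunct { name, predicate, rows_seen: 0, rows_passed: 0 }
    }

    /// Fraction of evaluated rows that passed; 1.0 until any row is seen.
    /// Note: this is measured only on rows that survived earlier conjuncts.
    fn selectivity(&self) -> f64 {
        if self.rows_seen == 0 {
            1.0
        } else {
            self.rows_passed as f64 / self.rows_seen as f64
        }
    }
}

/// A flat AND over conjuncts that reuses a single mask buffer across batches.
struct FlatAnd {
    conjuncts: Vec<Conjunct>,
    mask: Vec<bool>,
}

impl FlatAnd {
    fn new(conjuncts: Vec<Conjunct>) -> Self {
        FlatAnd { conjuncts, mask: Vec::new() }
    }

    /// Evaluate all conjuncts in their current order, skipping rows that
    /// an earlier conjunct already rejected.
    fn evaluate(&mut self, batch: &[i64]) -> &[bool] {
        self.mask.clear();
        self.mask.resize(batch.len(), true);
        for c in self.conjuncts.iter_mut() {
            for (keep, &value) in self.mask.iter_mut().zip(batch) {
                if *keep {
                    c.rows_seen += 1;
                    let pass = (c.predicate)(value);
                    if pass {
                        c.rows_passed += 1;
                    }
                    *keep = pass;
                }
            }
        }
        &self.mask
    }

    /// Put the most selective conjunct (lowest pass rate) first.
    /// This is an in-place sort of the flat list; no tree is rebuilt.
    fn reorder(&mut self) {
        self.conjuncts
            .sort_by(|a, b| a.selectivity().total_cmp(&b.selectivity()));
    }

    /// Current evaluation order, by conjunct name.
    fn order(&self) -> Vec<&'static str> {
        self.conjuncts.iter().map(|c| c.name).collect()
    }
}
```

   Calling `reorder` between batches is safe in this sketch because the 
result of an AND does not depend on conjunct order. A real version would also 
need to weigh per-conjunct cost, not just pass rate.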
   
   Another issue that comes up is the "what is an expression anyway" question. 
Various places remap children / rewrite the expression in various ways. It's 
not always clear when the expression is the same (should share selectivity 
tracking) or not.
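
   One way to picture the identity problem, as a sketch only: the `Expr` 
enum, `remap`, and `stats_key` below are hypothetical and much simpler than 
DataFusion's physical expressions. It assumes that two expressions should 
share statistics when they differ only in column indices, which is one 
possible definition of "the same" and not a settled one.

```rust
/// Toy expression tree; hypothetical, not DataFusion's `PhysicalExpr`.
#[derive(Debug, Clone)]
enum Expr {
    Column { name: String, index: usize },
    Literal(i64),
    Binary { op: &'static str, left: Box<Expr>, right: Box<Expr> },
}

/// Model a rewrite that shifts column indices, e.g. after a projection
/// changes the input schema. The expression is structurally unchanged.
fn remap(expr: &Expr, offset: usize) -> Expr {
    match expr {
        Expr::Column { name, index } => Expr::Column {
            name: name.clone(),
            index: index + offset,
        },
        Expr::Literal(v) => Expr::Literal(*v),
        Expr::Binary { op, left, right } => Expr::Binary {
            op: *op,
            left: Box::new(remap(left, offset)),
            right: Box::new(remap(right, offset)),
        },
    }
}

/// Key for selectivity tracking that ignores column indices, so a remapped
/// copy of an expression maps to the same statistics entry.
fn stats_key(expr: &Expr) -> String {
    match expr {
        Expr::Column { name, .. } => format!("col({})", name),
        Expr::Literal(v) => format!("lit({})", v),
        Expr::Binary { op, left, right } => {
            format!("({} {} {})", stats_key(left), op, stats_key(right))
        }
    }
}
```

   Keying by name has its own failure mode: two different columns that 
happen to share a name after aliasing would be merged, so this is only one 
point in the design space.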


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]