adriangb commented on issue #19092:
URL: https://github.com/apache/datafusion/issues/19092#issuecomment-3679165435

   Ok I've done a bit of digging.
   
   First, I apologize for filing a confusing issue. I ran into this while 
working on a bigger change and wanted to document it to tackle later, but 
evidently, even though I included a lot of detail, I didn't include the right 
detail.
   
   I'm still not sure what my original e2e reproduction for this was. Since 
I opened this issue on December 4th, it's possible it was fixed by #19130 
(const simplifier), which was merged a couple of days later, or by #19111, 
which was merged after that and changed the structure that, as 
@ShashidharM0118 points out in 
https://github.com/apache/datafusion/pull/19434#issuecomment-3678860994, would 
also have caused this issue.
   
   I did find something interesting: #19136 introduced a new opportunity for 
optimization via simplification. If we have a constant column, say `a = 2`, and 
the predicate `a is not null`, we rewrite it to `2 is not null`. That gets 
simplified later for the scan, but as far as I can tell it does *not* get 
simplified before being fed into `FilePruner`. `FilePruner` also does not 
simplify the output of any dynamic filters. So I think if we added a simplifier 
pass in `FilePruner` we'd get some extra pruning in the case of 
`constant_col is null`.
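
   To illustrate why the simplifier pass matters, here is a minimal sketch 
using a toy expression type. Everything in it (`Expr`, `substitute`, 
`simplify`, `can_prune`) is hypothetical and made up for illustration; it is 
not DataFusion's physical expression API or the actual `FilePruner` logic. It 
assumes the pruner can only skip a file when the predicate is literally 
`false`.

```rust
use std::collections::HashMap;

/// Toy expression type (hypothetical; not DataFusion's `PhysicalExpr`).
#[derive(Debug, Clone, PartialEq)]
enum Expr {
    Column(String),
    Literal(Option<i64>), // `None` models SQL NULL
    IsNull(Box<Expr>),
    IsNotNull(Box<Expr>),
    Bool(bool),
}

/// Replace columns known to be constant for a file with their literal value.
fn substitute(expr: &Expr, constants: &HashMap<String, Option<i64>>) -> Expr {
    match expr {
        Expr::Column(name) => match constants.get(name) {
            Some(value) => Expr::Literal(*value),
            None => expr.clone(),
        },
        Expr::IsNull(inner) => Expr::IsNull(Box::new(substitute(inner, constants))),
        Expr::IsNotNull(inner) => Expr::IsNotNull(Box::new(substitute(inner, constants))),
        other => other.clone(),
    }
}

/// Fold null checks on literals into boolean constants.
fn simplify(expr: &Expr) -> Expr {
    match expr {
        Expr::IsNull(inner) => match simplify(inner) {
            Expr::Literal(value) => Expr::Bool(value.is_none()),
            other => Expr::IsNull(Box::new(other)),
        },
        Expr::IsNotNull(inner) => match simplify(inner) {
            Expr::Literal(value) => Expr::Bool(value.is_some()),
            other => Expr::IsNotNull(Box::new(other)),
        },
        other => other.clone(),
    }
}

/// Assumption of this sketch: a file is skipped only if the predicate is
/// the constant `false`.
fn can_prune(expr: &Expr) -> bool {
    matches!(expr, Expr::Bool(false))
}
```

   In this toy model, with `a` constant at `2`, `a is null` becomes 
`2 is null` after substitution, which `can_prune` does not recognize. Only 
after `simplify` folds it to `false` can the file be skipped.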


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

