alamb commented on PR #13795:
URL: https://github.com/apache/datafusion/pull/13795#issuecomment-2548414526

   I thought more about this in the 🚿 and double-checked the logic:
   
   TLDR: I think using `NOT` (as in this PR) is ok. The rationale is that if 
evaluating `column_count = null_count` yields null, nothing is known about the 
null counts. However, `null AND ...` still resolves to `false` when the `...` 
is false (i.e. we can prove the predicate is not true by other means), so the 
requirements of the pruning predicate are still satisfied.
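
   The three-valued logic in the argument above can be sketched in plain Rust. 
This is only an illustration and not DataFusion's implementation: `Tri`, 
`and3`, `not3`, `all_null`, and `can_prune` are hypothetical names, and 
`Option<bool>` stands in for a nullable SQL boolean.

```rust
/// SQL three-valued boolean: `None` stands for NULL (unknown).
type Tri = Option<bool>;

/// Kleene AND: `false` dominates, otherwise NULL propagates.
fn and3(a: Tri, b: Tri) -> Tri {
    match (a, b) {
        (Some(false), _) | (_, Some(false)) => Some(false),
        (Some(true), Some(true)) => Some(true),
        _ => None,
    }
}

/// Kleene NOT: NULL stays NULL.
fn not3(a: Tri) -> Tri {
    a.map(|v| !v)
}

/// Hypothetical stand-in for `column_count = null_count`:
/// NULL when either statistic is missing.
fn all_null(column_count: Option<u64>, null_count: Option<u64>) -> Tri {
    match (column_count, null_count) {
        (Some(c), Some(n)) => Some(c == n),
        _ => None,
    }
}

/// Assumption for this sketch: a container is skipped only when the
/// pruning predicate is definitely false; NULL and true both keep it.
fn can_prune(predicate: Tri) -> bool {
    predicate == Some(false)
}

fn main() {
    // Statistics missing: the check is NULL, and so is its negation.
    let not_all_null = not3(all_null(None, Some(3)));
    assert_eq!(not_all_null, None);

    // Min/max part proves the predicate false: NULL AND false = false.
    assert!(can_prune(and3(not_all_null, Some(false))));

    // Min/max part says rows may match: NULL AND true = NULL, so keep it.
    assert!(!can_prune(and3(not_all_null, Some(true))));

    println!("ok");
}
```

   The point of the sketch is that an unknown null count never forces a 
container to be pruned on its own; pruning only happens when the other 
conjunct is definitely false.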
   
   
https://github.com/apache/datafusion/blob/e665115893e6282d592df71657e9f5b5855d1617/datafusion/physical-optimizer/src/pruning.rs#L723-L726
   
   So, upon more thought, I think the theory behind this PR is sound ✅ 
   
   I still need to review the code more carefully and ensure we have a test 
where the column count is unknown (unspecified) but the value can be proven 
true by other min/max ranges; otherwise it should be good to go.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

