jayzhan211 commented on PR #14223:
URL: https://github.com/apache/datafusion/pull/14223#issuecomment-2613874994

   For the scalar case, such as `SELECT a > -1`, consider any `SELECT a > b` 
where `b` is a constant. I think we can optimize it because the value is known. 
If the scalar is negative, the expression simplifies to `true`. Otherwise, we 
can rewrite it as `min(a, i64::max) > b OR (min(a, i64::max) = i64::max AND b = 
i64::max)`.
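   
   A minimal Rust sketch of the scalar case, assuming `a` is a u64 column 
value and `b` is an i64 scalar (the types are an assumption drawn from the 
u64+i64 discussion; the function names are hypothetical and not part of 
DataFusion):

```rust
/// Reference semantics for `a > b` with an unsigned value and a signed scalar.
fn gt_reference(a: u64, b: i64) -> bool {
    // A negative scalar is below every u64 value, so the predicate is always true.
    if b < 0 {
        return true;
    }
    // b is non-negative here, so converting it to u64 is lossless.
    a > b as u64
}

/// The clamped rewrite quoted above, evaluated entirely in i64.
fn gt_clamped(a: u64, b: i64) -> bool {
    // min(a, i64::max): clamp the unsigned value into the i64 range.
    let clamped = a.min(i64::MAX as u64) as i64;
    clamped > b || (clamped == i64::MAX && b == i64::MAX)
}
```

   The two functions agree wherever `a` is clamped or `b` is below i64::max. 
Read literally, the second disjunct also fires when `a` is exactly i64::max 
and `b` is i64::max, where `a > b` is false, so that boundary probably needs 
separate handling.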
   
   However, for the column case, like `SELECT a > b`, we can only rewrite it to 
the latter form.
   
   I believe similar optimization rules exist for other comparison operators, 
given that the type implies the range of the values.
   
   For COALESCE and UNION, in the scalar case we can rewrite the expression 
using the known value. But for the column case, I think Decimal128 is the only 
viable option.
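   
   As a rough illustration of why a wider type works for the column case: 
Arrow's Decimal128 stores its values as 128-bit signed integers, which can hold 
every u64 and every i64. The sketch below uses plain `i128` as a stand-in for 
a scale-0 Decimal128; `gt_widened` is a hypothetical helper, not a DataFusion 
function.

```rust
/// Compare a u64 value with an i64 value by widening both to i128.
/// Both conversions are lossless, so no clamping or special cases are needed.
fn gt_widened(a: u64, b: i64) -> bool {
    i128::from(a) > i128::from(b)
}
```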
   
   I quite agree that, in general, this should be handled in the optimizer, 
for example in a rule like `unwrap_cast_in_comparison` or in a physical 
optimizer rule based on column statistics.
   
   Another question: is the u64+i64 combination common in DataFusion, and can 
we avoid it altogether? I guess u64 values larger than i64::max are uncommon, 
so could we aggressively use i64 even when we know the value is always 
positive?
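   
   The cost of using i64 aggressively is the range above i64::max. A small 
sketch of the difference between a checked and an unchecked conversion 
(`to_i64_checked` is a hypothetical helper name):

```rust
/// Represent an unsigned value as i64, or None if it exceeds i64::MAX.
fn to_i64_checked(v: u64) -> Option<i64> {
    i64::try_from(v).ok()
}
```

   A plain `as` cast does not fail for out-of-range values; it wraps, so 
`u64::MAX as i64` evaluates to `-1`. Any scheme that stores such values as i64 
would need either a check like the one above or a guarantee that large values 
never occur.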


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

