jonahgao commented on PR #14223: URL: https://github.com/apache/datafusion/pull/14223#issuecomment-2611707619
> When we combine a `UInt64` with a literal value (say `5`), we don't accidentally consider the literal value a signed value and do this coercion to decimal, right? That would kill performance in many cases.

We always parse `5` as int64 (see [parse_sql_number](https://github.com/apache/datafusion/blob/a534e853e88365b67363b26a004f6037c2f3a5b0/datafusion/sql/src/expr/value.rs#L80)), so this coercion result will be decimal.

The same thing also happens in DuckDB.

```
v1.1.1-dev319 af39bd0dcf
D create table t(a ubigint);
D select * from t union select 5;
┌────────┐
│   a    │
│ int128 │
├────────┤
│   5    │
└────────┘
```

This is because, during type coercion, we only consider the data types of the binary operands and do not take their actual data into account. Some binary operations, such as union and comparisons, require that the operands have the same data type, so the coerced type must be a superset of both operand types and therefore can accommodate all possible operand values.

For this case, maybe we can use statistical information to examine the operands' data and perform some physical optimizations.
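To make the superset requirement concrete, here is a minimal Rust sketch. It is not DataFusion's coercion code: `IntType`, `Wide128`, `range`, `can_hold`, and `coerce` are hypothetical names for illustration, and `Wide128` only stands in for a wider type such as a decimal or DuckDB's `int128`. It shows that neither a signed nor an unsigned 64-bit type covers the other's range, so a type-only rule has to widen.

```rust
/// Hypothetical, simplified model of integer types (not DataFusion's `DataType`).
#[derive(Debug, Clone, Copy, PartialEq)]
enum IntType {
    Int64,
    UInt64,
    /// Stand-in for a wider type such as a decimal or DuckDB's int128.
    Wide128,
}

/// Inclusive value range of each type, expressed as i128 bounds.
fn range(t: IntType) -> (i128, i128) {
    match t {
        IntType::Int64 => (i64::MIN as i128, i64::MAX as i128),
        IntType::UInt64 => (0, u64::MAX as i128),
        IntType::Wide128 => (i128::MIN, i128::MAX),
    }
}

/// True if every value of `from` is representable in `to`.
fn can_hold(to: IntType, from: IntType) -> bool {
    let (to_lo, to_hi) = range(to);
    let (from_lo, from_hi) = range(from);
    to_lo <= from_lo && from_hi <= to_hi
}

/// Pick the narrowest candidate that can hold both operand types.
/// Only the types are inspected, never the actual operand values.
fn coerce(lhs: IntType, rhs: IntType) -> IntType {
    [IntType::Int64, IntType::UInt64, IntType::Wide128]
        .iter()
        .copied()
        .find(|&c| can_hold(c, lhs) && can_hold(c, rhs))
        .expect("Wide128 holds every type in this model")
}

fn main() {
    // A UInt64 column combined with a literal typed as Int64.
    println!("{:?}", coerce(IntType::UInt64, IntType::Int64));
}
```

In this model the literal `5` would fit in `UInt64`, but `coerce` never sees the value, only its `Int64` type, which is why the result widens.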