neilconway opened a new pull request, #20426:
URL: https://github.com/apache/datafusion/pull/20426

   ## Which issue does this PR close?
   
   - Closes #15161.
   
   ## Rationale for this change
   
   In a comparison between a numeric column and a string literal (e.g., `WHERE 
int_col < '10'`), we previously coerced the numeric column to a string type. 
The comparison was then performed lexicographically, which produces incorrect 
query results.
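
To make the failure mode concrete, here is a minimal standalone sketch in plain Rust. It does not use any DataFusion API; the helper names `lexicographic_lt` and `numeric_lt` are hypothetical and exist only for illustration. It shows why comparing the string forms of `9` and `10` gives a different answer than comparing the numbers.

```rust
/// Compare two values as strings, i.e. byte by byte from the left.
/// (Hypothetical helper, not part of DataFusion.)
fn lexicographic_lt(a: &str, b: &str) -> bool {
    a < b
}

/// Parse both values as integers first, then compare numerically.
/// Returns `None` if either value is not a valid integer.
/// (Hypothetical helper, not part of DataFusion.)
fn numeric_lt(a: &str, b: &str) -> Option<bool> {
    let a: i64 = a.parse().ok()?;
    let b: i64 = b.parse().ok()?;
    Some(a < b)
}
```

With these helpers, `lexicographic_lt("9", "10")` is `false` because the byte `'9'` sorts after `'1'`, while `numeric_lt("9", "10")` is `Some(true)`.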
   
   Instead, we split type coercion into two situations: type coercion for 
comparisons, where we want string->numeric coercion, and type coercion for 
places like UNION or IN lists, where DataFusion's traditional behavior has been 
to tolerate type mismatches by coercing values to strings.
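
The split described above can be pictured with a toy model. This is only a sketch under simplifying assumptions: `ToyType`, `toy_comparison_coercion`, and `toy_type_union_coercion` are hypothetical names, the model has just two types, and it is not the signature or logic of the real `comparison_coercion` and `type_union_coercion` functions in DataFusion.

```rust
/// A deliberately tiny stand-in for a data type enum (hypothetical).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum ToyType {
    Int64,
    Utf8,
}

/// Toy rule for comparisons: if either side is numeric, compare as numeric.
fn toy_comparison_coercion(lhs: ToyType, rhs: ToyType) -> ToyType {
    match (lhs, rhs) {
        (ToyType::Utf8, ToyType::Utf8) => ToyType::Utf8,
        // Any mix involving Int64 is compared numerically.
        _ => ToyType::Int64,
    }
}

/// Toy rule for UNION / IN list style contexts: a type mismatch is
/// tolerated by falling back to strings.
fn toy_type_union_coercion(lhs: ToyType, rhs: ToyType) -> ToyType {
    match (lhs, rhs) {
        (ToyType::Int64, ToyType::Int64) => ToyType::Int64,
        // Any mix involving Utf8 is unified as a string.
        _ => ToyType::Utf8,
    }
}
```

In this model the same mixed pair, `(Int64, Utf8)`, resolves to `Int64` for a comparison and to `Utf8` for a union, which is the distinction the two code paths are meant to capture.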
   
   ## What changes are included in this PR?
   
   * Update `comparison_coercion` to coerce strings to numerics
   * Remove previous `comparison_coercion_numeric` function
   * Add a new function, `type_union_coercion`, and use it when appropriate 
(UNION, CASE, NVL, IN lists, and struct fields)
   * Add unit and SLT tests for new coercion behavior
   * Update existing SLT tests for changes in coercion behavior
   
   ## Are these changes tested?
   
   Yes. New tests added, existing tests pass. (TODO: ClickBench queries need to 
be fixed.)
   
   ## Are there any user-facing changes?
   
   Yes:
   
   * Comparing a numeric column to a string literal will now compare the values 
numerically rather than lexicographically (the previous behavior was wrong)
   * Comparing a string column with a numeric literal will now attempt to cast 
the string column to the numeric type (instead of coercing the numeric literal 
to string and then doing lexicographic comparison); if the column contains any 
non-numeric values, the query will fail at runtime
   * Comparing a string column with a numeric column will now cast the string 
column to numeric, with the same runtime-failure caveat as the previous case
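
The runtime-failure caveat in the last two items can be illustrated with a small standalone sketch. It models a string column as a slice of `&str` and uses Rust's standard integer parsing as a stand-in for the cast; `cast_column_to_i64` and `filter_less_than` are hypothetical helpers, and DataFusion's actual cast semantics and error types are not shown here.

```rust
use std::num::ParseIntError;

/// Cast every value in a string "column" to i64.
/// The whole cast fails on the first value that is not a valid integer.
/// (Hypothetical helper, not part of DataFusion.)
fn cast_column_to_i64(column: &[&str]) -> Result<Vec<i64>, ParseIntError> {
    column.iter().map(|s| s.parse::<i64>()).collect()
}

/// Model `WHERE str_col < literal`: cast the column, then filter numerically.
/// An error from the cast propagates, so the "query" fails as a whole.
fn filter_less_than(column: &[&str], literal: i64) -> Result<Vec<i64>, ParseIntError> {
    let values = cast_column_to_i64(column)?;
    Ok(values.into_iter().filter(|v| *v < literal).collect())
}
```

Here `filter_less_than(&["1", "9", "25"], 10)` yields `Ok(vec![1, 9])`, whereas `filter_less_than(&["1", "abc"], 10)` returns an `Err`, mirroring a query that fails because one value in the column is not numeric.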


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

