neilconway commented on PR #20426: URL: https://github.com/apache/datafusion/pull/20426#issuecomment-3924335597
One example of a weird behavior that we get with the new coercion rules proposed here:

```
> create table t1 (a text, b int, c float);
0 row(s) fetched.
Elapsed 0.018 seconds.

> explain select * from t1 where a < 5;
+---------------+-------------------------------+
| plan_type     | plan                          |
+---------------+-------------------------------+
| physical_plan | ┌───────────────────────────┐ |
|               | │        FilterExec         │ |
|               | │   --------------------    │ |
|               | │        predicate:         │ |
|               | │   CAST(a AS Int64) < 5    │ |
|               | └─────────────┬─────────────┘ |
|               | ┌─────────────┴─────────────┐ |
|               | │      DataSourceExec       │ |
|               | │   --------------------    │ |
|               | │         bytes: 0          │ |
|               | │      format: memory       │ |
|               | │          rows: 0          │ |
|               | └───────────────────────────┘ |
|               |                               |
+---------------+-------------------------------+
1 row(s) fetched.
Elapsed 0.016 seconds.
```

This follows from the "coerce string to numeric" change in general, but it's a bit weird. I think we should probably reject queries like this entirely, on the basis of a type mismatch.

If we did that and nothing else, we'd break a ton of queries (e.g., anything comparing a numeric column with a string literal). But we could fix that up by treating comparisons with string literals specially (e.g., "if x OP y mismatch in types, check if either x or y is a string literal; if so, try to coerce it to a numeric type"). So that's an alternative approach worth considering.
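For illustration, the alternative rule quoted above could be sketched roughly as follows. This is only a sketch under stated assumptions: `Type`, `Expr`, `coerce_str_literal`, and `coerce_comparison` are hypothetical stand-ins invented for this example, not DataFusion's actual `Expr`/`DataType` API or its coercion code, and numeric-to-numeric widening is left out.

```rust
// Hypothetical, simplified stand-ins for a planner's types (not DataFusion's API).
#[derive(Debug, Clone, PartialEq)]
enum Type {
    Int64,
    Float64,
    Utf8,
}

#[derive(Debug, Clone, PartialEq)]
enum Expr {
    Column { name: String, ty: Type },
    IntLit(i64),
    FloatLit(f64),
    StrLit(String),
}

impl Expr {
    /// Static type of the expression.
    fn ty(&self) -> Type {
        match self {
            Expr::Column { ty, .. } => ty.clone(),
            Expr::IntLit(_) => Type::Int64,
            Expr::FloatLit(_) => Type::Float64,
            Expr::StrLit(_) => Type::Utf8,
        }
    }
}

fn is_numeric(t: &Type) -> bool {
    matches!(t, Type::Int64 | Type::Float64)
}

/// If `expr` is a string literal whose text parses as `target`, return the
/// equivalent numeric literal. Anything else (columns, other literals,
/// unparsable text) yields `None`.
fn coerce_str_literal(expr: &Expr, target: &Type) -> Option<Expr> {
    match (expr, target) {
        (Expr::StrLit(s), Type::Int64) => s.trim().parse::<i64>().ok().map(Expr::IntLit),
        (Expr::StrLit(s), Type::Float64) => s.trim().parse::<f64>().ok().map(Expr::FloatLit),
        _ => None,
    }
}

/// Coerce the operands of `lhs OP rhs`. On a type mismatch, only a string
/// *literal* may be converted to the other side's numeric type; a string
/// column compared with a number is rejected instead of being cast.
fn coerce_comparison(lhs: Expr, rhs: Expr) -> Result<(Expr, Expr), String> {
    let (lt, rt) = (lhs.ty(), rhs.ty());
    if lt == rt {
        return Ok((lhs, rhs));
    }
    // Assumption: numeric/numeric widening is handled elsewhere; pass through.
    if is_numeric(&lt) && is_numeric(&rt) {
        return Ok((lhs, rhs));
    }
    if let Some(r) = coerce_str_literal(&rhs, &lt) {
        return Ok((lhs, r));
    }
    if let Some(l) = coerce_str_literal(&lhs, &rt) {
        return Ok((l, rhs));
    }
    Err(format!("type mismatch: cannot compare {:?} with {:?}", lt, rt))
}
```

Under these assumptions, `a < 5` with `a` a text column returns an error rather than planning `CAST(a AS Int64) < 5`, while `b < '5'` with `b` an integer column still plans, with the literal rewritten to `5`.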
