jayzhan211 commented on issue #8506: URL: https://github.com/apache/arrow-datafusion/issues/8506#issuecomment-1879913865
I think there is a small issue with the current design. We rewrite the `||` operator to a function only after the logical plan is built. Before `OperatorToFunction` is applied, we have to build the logical plan, which calls `projection_schema`, which in turn calls `exprlist_to_fields` to build the schema. Inside `to_field`, we compute the return type of the string concat operator, which involves type coercion.

The problem is that the return type computed here is useless, because we rewrite the operator to a function afterward. The type coercion for string concat is done inside `string_coercion` and, by the way, it is incorrect: https://github.com/apache/arrow-datafusion/blob/dd4263f843e093c807d63edf73a571b1ba2669b5/datafusion/expr/src/type_coercion/binary.rs#L704C9-L708

For `select [1,2,3] || 4.1`, it returns `List(Int64)`, but it should be `List(Float64)` instead. However, since the expression is rewritten to a function, the final output is still correct.

If we want to avoid this computation (mainly the type coercion), we may need an if/else branch that skips schema building for operators that will be rewritten afterward. Perhaps the code could live inside `projection_schema` and return an empty schema if the expressions are going to be rewritten.

@alamb @viirya what do you think?
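To make the coercion mismatch concrete, here is a minimal, self-contained sketch. It is a toy model only: the `DataType` enum and the functions `coerce_numeric`, `naive_concat_type`, and `coerced_concat_type` are hypothetical names invented for illustration and are not DataFusion's actual types or APIs. `naive_concat_type` mimics the behavior described above (the list type is returned unchanged), while `coerced_concat_type` shows the result one would expect if the element type were coerced against the scalar.

```rust
// Toy stand-in for Arrow's data types (hypothetical, not the real enum).
#[derive(Debug, Clone, PartialEq)]
enum DataType {
    Int64,
    Float64,
    List(Box<DataType>),
}

// Toy numeric coercion rule (assumption): Float64 wins over Int64.
fn coerce_numeric(a: &DataType, b: &DataType) -> Option<DataType> {
    use DataType::*;
    match (a, b) {
        (Int64, Int64) => Some(Int64),
        (Int64, Float64) | (Float64, Int64) | (Float64, Float64) => Some(Float64),
        _ => None,
    }
}

// Models the behavior described in the comment: whichever side is a list,
// its type is returned as-is and the other operand's type is ignored.
fn naive_concat_type(lhs: &DataType, rhs: &DataType) -> Option<DataType> {
    match (lhs, rhs) {
        (DataType::List(_), _) => Some(lhs.clone()),
        (_, DataType::List(_)) => Some(rhs.clone()),
        _ => None,
    }
}

// Models the expected behavior: the list's element type is coerced
// against the other operand before the result type is built.
fn coerced_concat_type(lhs: &DataType, rhs: &DataType) -> Option<DataType> {
    use DataType::*;
    match (lhs, rhs) {
        (List(a), List(b)) => coerce_numeric(a, b).map(|t| List(Box::new(t))),
        (List(elem), scalar) | (scalar, List(elem)) => {
            coerce_numeric(elem, scalar).map(|t| List(Box::new(t)))
        }
        _ => None,
    }
}

fn main() {
    // Corresponds to `select [1,2,3] || 4.1`.
    let list_i64 = DataType::List(Box::new(DataType::Int64));
    let scalar_f64 = DataType::Float64;

    println!("naive:   {:?}", naive_concat_type(&list_i64, &scalar_f64));
    println!("coerced: {:?}", coerced_concat_type(&list_i64, &scalar_f64));
}
```

In this model the naive rule yields `List(Int64)` and the coerced rule yields `List(Float64)`, which mirrors the discrepancy reported above. Whether the real fix belongs in the operator's coercion or in skipping schema building altogether is the open question of this comment.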
