jayzhan211 commented on issue #8506:
URL: 
https://github.com/apache/arrow-datafusion/issues/8506#issuecomment-1879913865

   I think there is a small issue with the current design.
   We rewrite the `||` operator to a function after the logical plan is built. 
Before OperatorToFunction is applied, we need to build the logical plan, which 
calls `projection_schema`. That calls `exprlist_to_fields` to build the schema. 
Inside `to_field`, we calculate the return type of the string concat, which 
performs type coercion. The problem is that the return type calculated here is 
never used, because we rewrite the operator to a function afterward. 
   
   The type coercion for string concat is done inside `string_coercion`, and, 
by the way, it is incorrect.
   
https://github.com/apache/arrow-datafusion/blob/dd4263f843e093c807d63edf73a571b1ba2669b5/datafusion/expr/src/type_coercion/binary.rs#L704C9-L708
   
   For `select [1,2,3] || 4.1`, this coercion returns List(Int64), but it 
should be List(Float64) instead. However, since the expression is rewritten to 
a function afterward, the output is still correct in the end.
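
   To make the expected result concrete, here is a minimal toy sketch of the 
coercion rule described above. The `DataType` enum and both functions are 
hypothetical stand-ins written for illustration only; they are not the Arrow or 
DataFusion definitions, and the real coercion rules cover many more types.

```rust
// Toy model only: these types are illustrative assumptions,
// not Arrow's `DataType` or DataFusion's coercion code.
#[derive(Debug, Clone, PartialEq)]
enum DataType {
    Int64,
    Float64,
    List(Box<DataType>),
}

/// Widen two numeric types to a common type (Int64 with Float64 gives Float64).
fn coerce_numeric(lhs: &DataType, rhs: &DataType) -> Option<DataType> {
    use DataType::*;
    match (lhs, rhs) {
        (Int64, Int64) => Some(Int64),
        (Int64, Float64) | (Float64, Int64) | (Float64, Float64) => Some(Float64),
        _ => None,
    }
}

/// Result type of `list || element`, coercing the list's element type
/// together with the appended element's type.
fn array_append_type(list: &DataType, element: &DataType) -> Option<DataType> {
    match list {
        DataType::List(inner) => {
            coerce_numeric(inner, element).map(|t| DataType::List(Box::new(t)))
        }
        // Left-hand side is not a list, so this toy rule does not apply.
        _ => None,
    }
}
```

   Under these assumptions, calling `array_append_type` with List(Int64) and 
Float64 yields List(Float64), which is the type the comment argues for.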
   
   If we want to avoid this calculation (mainly the type coercion), we may need 
an if/else branch that skips schema building for operators that will be 
rewritten afterward. Perhaps the code inside `projection_schema` could return 
an empty schema if the exprs will be rewritten.
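
   A rough sketch of what such a branch might look like, using a simplified, 
hypothetical expression tree. `Expr`, `will_be_rewritten`, `field_name`, and 
`projection_schema_sketch` are all made up for this illustration and do not 
match DataFusion's actual types or signatures; a "schema" here is just a list 
of field names.

```rust
// Hypothetical, simplified expression tree; not DataFusion's `Expr`.
#[derive(Debug)]
enum Expr {
    Column(String),
    Literal(i64),
    Add(Box<Expr>, Box<Expr>),
    /// `lhs || rhs`, which a later pass rewrites to a function call.
    Concat(Box<Expr>, Box<Expr>),
}

/// True if the expression contains an operator that will be rewritten later.
fn will_be_rewritten(expr: &Expr) -> bool {
    match expr {
        Expr::Concat(_, _) => true,
        Expr::Add(l, r) => will_be_rewritten(l) || will_be_rewritten(r),
        Expr::Column(_) | Expr::Literal(_) => false,
    }
}

/// Display name used as the field name in this toy schema.
fn field_name(expr: &Expr) -> String {
    match expr {
        Expr::Column(name) => name.clone(),
        Expr::Literal(v) => v.to_string(),
        Expr::Add(l, r) => format!("{} + {}", field_name(l), field_name(r)),
        Expr::Concat(l, r) => format!("{} || {}", field_name(l), field_name(r)),
    }
}

/// Builds the toy schema, but returns an empty one when any expression
/// will be rewritten, so no type coercion work is wasted on it.
fn projection_schema_sketch(exprs: &[Expr]) -> Vec<String> {
    if exprs.iter().any(will_be_rewritten) {
        return Vec::new();
    }
    exprs.iter().map(field_name).collect()
}
```

   One open question with this approach is that later steps would have to 
rebuild the schema after the rewrite, since the empty schema is only a 
placeholder.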
   
   @alamb @viirya what do you think?
   
   
   
   
   
   
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.