YjyJeff commented on PR #6862:
URL: 
https://github.com/apache/arrow-datafusion/pull/6862#issuecomment-1623463554

   I have not created an issue yet. Should we open one to discuss this?
   
   I found this problem in our custom optimizer, which can change the return 
type of an expression. For example, suppose we have the query 
   ```
   select col_0 / 3 as a, count(1) from table group by a 
   ```
   and the type of `col_0` is `DataType::UInt32`. Without our custom optimizer, 
the expression `col_0 / 3` returns `DataType::Int64` (because the literal `3` 
is parsed as `Int64`). Our optimizer converts the literal `Int64(3)` to 
`UInt32(3)` so that we avoid the cost of casting `col_0` to an `Int64Array`. 
As a result, the return type of the group-by expression changes from 
`DataType::Int64` to `DataType::UInt32`. 
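
   The effect described above can be reproduced in a toy model. This is only a 
minimal sketch: `DataType`, `Expr`, `return_type`, and `narrow_literal` are 
hypothetical stand-ins written for illustration, not DataFusion's actual types 
or optimizer API, and the widening rule (mixed operands become `Int64`) is a 
simplifying assumption.

```rust
// Toy model only: hypothetical stand-ins, not DataFusion's real API.
use std::convert::TryFrom;

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum DataType {
    UInt32,
    Int64,
}

#[derive(Debug, Clone, PartialEq)]
enum Expr {
    Column(DataType),
    LitInt64(i64),
    LitUInt32(u32),
    Divide(Box<Expr>, Box<Expr>),
}

/// Result type of an expression. Assumption: mixed operand types widen to Int64.
fn return_type(expr: &Expr) -> DataType {
    match expr {
        Expr::Column(t) => *t,
        Expr::LitInt64(_) => DataType::Int64,
        Expr::LitUInt32(_) => DataType::UInt32,
        Expr::Divide(l, r) => {
            let (lt, rt) = (return_type(l), return_type(r));
            if lt == rt { lt } else { DataType::Int64 }
        }
    }
}

/// Hypothetical rewrite: when the left operand is UInt32 and the right operand
/// is an Int64 literal that fits in u32, narrow the literal to UInt32.
fn narrow_literal(expr: Expr) -> Expr {
    match expr {
        Expr::Divide(l, r) => {
            let l = narrow_literal(*l);
            let r = narrow_literal(*r);
            let narrowed = match (return_type(&l), &r) {
                (DataType::UInt32, Expr::LitInt64(v)) => u32::try_from(*v).ok(),
                _ => None,
            };
            let r = match narrowed {
                Some(n) => Expr::LitUInt32(n),
                None => r, // literal does not fit, keep it unchanged
            };
            Expr::Divide(Box::new(l), Box::new(r))
        }
        other => other,
    }
}

fn main() {
    // col_0 / 3, with col_0: UInt32 and the literal parsed as Int64.
    let original = Expr::Divide(
        Box::new(Expr::Column(DataType::UInt32)),
        Box::new(Expr::LitInt64(3)),
    );
    let before = return_type(&original);
    let rewritten = narrow_literal(original);
    let after = return_type(&rewritten);
    println!("before: {:?}, after: {:?}", before, after);
}
```

   In this model the rewrite leaves the expression's shape unchanged but changes 
its result type, so any schema computed before the rewrite no longer matches 
the rewritten expression.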
   
   
   > It's legacy code, exists with some reason. If there are no specific 
reasons, we should not change it.
   
   What are the reasons? In my view, `from_plan` may change the schema, so the 
legacy code is a bug here. My patch fixes the bug and passes all of the tests.
   

