viirya opened a new issue, #9458:
URL: https://github.com/apache/arrow-datafusion/issues/9458

   ### Describe the bug
   
   Currently the `coalesce` function's return type is determined by a fixed 
list, `SUPPORTED_COALESCE_TYPES`. If the data types of the input arguments are 
not in the list, DataFusion coerces the input arguments, which produces some 
surprising results.
   
   For example, if the input arguments are `[Date32, Date32]`, then because 
`SUPPORTED_COALESCE_TYPES` doesn't include `Date32`, the return type of the 
`coalesce` function will be `Utf8`.
   
   This doesn't look correct given the definition of `coalesce`, which should 
return the first non-null value from its arguments. Since the input arguments 
have the same type, there is no need to coerce them to another type (`Utf8`).
   
   The `coalesce` function simply checks the null bits of the input arrays and 
zips the input arrays accordingly. DataFusion doesn't manipulate the arrays 
itself but relies on arrow-rs kernels like `and`, `zip`, etc. It looks like we 
don't need to limit the input types of `coalesce` to `SUPPORTED_COALESCE_TYPES`.
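
To illustrate why the operation does not depend on the element type, here is a minimal, self-contained sketch of the null-check-and-zip logic. It is an assumption-laden model, not DataFusion or arrow-rs code: `Column<T>` and this `coalesce` helper are hypothetical names, and a `Vec<Option<T>>` stands in for an Arrow array with a null bitmap.

```rust
/// A nullable column, modeled as a vector of optional values.
/// (Hypothetical stand-in for an Arrow array; not the arrow-rs API.)
type Column<T> = Vec<Option<T>>;

/// Row-wise coalesce: for each row, take the first non-null value
/// among the input columns, or null if every input is null.
fn coalesce<T: Clone>(columns: &[Column<T>]) -> Column<T> {
    let Some(first) = columns.first() else {
        return Vec::new();
    };
    let mut result: Column<T> = first.clone();
    for column in &columns[1..] {
        assert_eq!(column.len(), result.len(), "columns must have equal length");
        // "zip" step: keep the current value where it is already non-null,
        // otherwise take the value from the next column.
        for (current, candidate) in result.iter_mut().zip(column) {
            if current.is_none() {
                *current = candidate.clone();
            }
        }
    }
    result
}
```

Nothing in the sketch inspects `T`, so the same code works for integers, strings, or dates, and the output column always has the same element type as the inputs.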
   
   The current approach is strange, especially because it determines the 
return type before coercing the input arguments. So we can construct a 
`coalesce` function with `[Date32, Date32]` inputs whose return type is `Utf8`. 
This is especially an issue for projects like Comet, which uses only DataFusion 
physical plans and so doesn't go through the type coercion phase of DataFusion's 
logical planning. In Comet, when we construct a `coalesce` function with 
`[Date32, Date32]` inputs, the function actually produces a `Date32` array but 
the schema says `Utf8`.
   
   I believe that when we construct a `coalesce` function, we should make sure 
its return type matches its input types.
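
A rough sketch of such a check, under stated assumptions: the `DataType` enum below is a cut-down stand-in rather than Arrow's real `DataType`, and `coalesce_return_type` is a hypothetical helper, not an existing DataFusion function. It only shows the rule "identical input types pass through unchanged".

```rust
/// Minimal stand-in for a data type enum; only the variants needed here.
/// (Hypothetical; not Arrow's `DataType`.)
#[derive(Debug, Clone, PartialEq)]
enum DataType {
    Int32,
    Date32,
    Utf8,
}

/// Hypothetical return-type rule: if all arguments share one type,
/// return that type unchanged; otherwise report that the arguments
/// must be coerced to a common type before the function is built.
fn coalesce_return_type(arg_types: &[DataType]) -> Result<DataType, String> {
    let (first, rest) = arg_types
        .split_first()
        .ok_or_else(|| "coalesce requires at least one argument".to_string())?;
    if rest.iter().all(|t| t == first) {
        Ok(first.clone())
    } else {
        Err(format!("arguments need coercion first: {:?}", arg_types))
    }
}
```

With this rule, `[Date32, Date32]` yields `Date32`, so the declared schema agrees with the array the function actually produces even when no logical-plan coercion has run.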
   
   ### To Reproduce
   
   _No response_
   
   ### Expected behavior
   
   _No response_
   
   ### Additional context
   
   _No response_


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]
