gruuya opened a new pull request, #11386:
URL: https://github.com/apache/datafusion/pull/11386
## Which issue does this PR close?
Closes #11385.
## Rationale for this change
Investigating the above issue, I found that several conditions need to
align for the bug to manifest:
- there's a (grouping) aggregation
- over a union of more than two inputs
- with some type coercion involved
Here's a minimal repro of the issue:
```sql
> select c1, sum(c2) as sum_c2
from (select 1 as c1, 1::int as c2
union
select 2 as c1, 2::int as c2
union
select 3 as c1, coalesce(3::int, 0) as c2)
group by c1;
External error: External error: External error: Arrow error: Invalid
argument error: RowConverter column schema mismatch, expected Int32 got Int64
```
What happens is that the nested union elimination unwraps the first two
child plans and coerces their schemas, but the remaining plan is not
coerced. During physical planning, the Union inherits the schema of its first
child plan.
Consequently, during execution the RowConverter gets instantiated with the
`Int32` type, whereas the last child produces `Int64` elements, since
`coalesce` enforces type coercion to align the left element with the right one
(represented as `Int64(0)`).
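To make the mechanism concrete, here is a toy model in Python. It is not DataFusion code: the `Leaf` and `Union` classes, the `cast` helper, and the two `flatten_*` functions are hypothetical names invented for illustration, and types are plain strings. It assumes, following the description above, that the outer union's schema is `Int32` while the third branch actually produces `Int64` after `coalesce` is coerced.

```python
from dataclasses import dataclass


@dataclass
class Leaf:
    """Hypothetical child plan producing one column of type `dtype`."""
    name: str
    dtype: str


@dataclass
class Union:
    """Hypothetical union node; `dtype` is the schema it advertises."""
    dtype: str
    inputs: list


def cast(plan, dtype):
    """Stand-in for wrapping a plan in a casting projection."""
    return plan if plan.dtype == dtype else Leaf(f"cast({plan.name})", dtype)


def flatten_buggy(outer):
    """Unwrap nested unions, coercing only children of the inner unions."""
    out = []
    for child in outer.inputs:
        if isinstance(child, Union):
            out.extend(cast(c, outer.dtype) for c in child.inputs)
        else:
            out.append(child)  # left as-is: may disagree with outer.dtype
    return Union(outer.dtype, out)


def flatten_fixed(outer):
    """Unwrap nested unions and coerce every child to the outer schema."""
    out = []
    for child in outer.inputs:
        grandchildren = child.inputs if isinstance(child, Union) else [child]
        out.extend(cast(c, outer.dtype) for c in grandchildren)
    return Union(outer.dtype, out)


# Shape of the repro: (select 1 UNION select 2) UNION select 3 with coalesce.
inner = Union("Int32", [Leaf("select 1", "Int32"), Leaf("select 2", "Int32")])
outer = Union("Int32", [inner, Leaf("select 3 (coalesce)", "Int64")])

buggy = flatten_buggy(outer)
fixed = flatten_fixed(outer)

print([c.dtype for c in buggy.inputs])  # ['Int32', 'Int32', 'Int64']
print([c.dtype for c in fixed.inputs])  # ['Int32', 'Int32', 'Int32']
```

In the buggy variant the first child advertises `Int32` while the last one yields `Int64`, which corresponds to the RowConverter mismatch in the error above. In the fixed variant every child agrees with the union's schema.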
## What changes are included in this PR?
Coerce all child plans of the outer union as per its schema, not only the
plans in the inner union.
## Are these changes tested?
There's a new SLT.
## Are there any user-facing changes?
TPC-DS Q75 no longer errors out.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]