bellwether-softworks opened a new issue, #6743:
URL: https://github.com/apache/arrow-datafusion/issues/6743
### Describe the bug
`ARRAY_AGG` usage in SQL results in an error when used in conjunction with
`DISTINCT` parameter alongside another aggregate field.
### To Reproduce
1. Establish example data:
```sql
CREATE TABLE example(id INT, parent_id INT, tag VARCHAR) AS VALUES
(1, 0, 'bob'),
(2, 0, 'cat'),
(3, 1, 'tom'),
(4, 1, 'cat'),
(5, 1, 'tom');
```
2. Execute query using `ARRAY_AGG` and `DISTINCT` parameter:
```sql
SELECT
parent_id,
COUNT(id) AS count_of,
ARRAY_AGG(DISTINCT tag) AS tags
FROM example
GROUP BY parent_id;
```
Executing the above results in the following message:
```
ArrowError(ExternalError(Internal("Inconsistent types in
ScalarValue::iter_to_array. Expected Utf8, got List([tom,cat])")))
```
### Expected behavior
Desired output should be similar to the following:
| parent_id | count_of | tags |
|-----------|----------|-----------------|
| 1 | 3 | [tom, cat] |
| 0 | 2 | [bob, cat] |
### Additional context
Omitting either the `COUNT` field, or the `DISTINCT` clause in `ARRAY_AGG`,
allows the query to complete successfully.
The above was initially discovered in v17.0.0 and verified to still be
presenting in v26.0.0.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]