bellwether-softworks opened a new issue, #6743:
URL: https://github.com/apache/arrow-datafusion/issues/6743

   ### Describe the bug
   
   `ARRAY_AGG` usage in SQL results in an error when used in conjunction with 
`DISTINCT` parameter alongside another aggregate field.
   
   ### To Reproduce
   
   1. Establish example data:
     ```sql
     CREATE TABLE example(id INT, parent_id INT, tag VARCHAR) AS VALUES
         (1, 0, 'bob'),
         (2, 0, 'cat'),
         (3, 1, 'tom'),
         (4, 1, 'cat'),
         (5, 1, 'tom');
     ```
   2. Execute query using `ARRAY_AGG` and `DISTINCT` parameter:
     ```sql
     SELECT
             parent_id,
             COUNT(id) AS count_of,
             ARRAY_AGG(DISTINCT tag) AS tags
         FROM example
         GROUP BY parent_id;
     ```
   
   Executing the above results in the following message:
   ```
   ArrowError(ExternalError(Internal("Inconsistent types in 
ScalarValue::iter_to_array. Expected Utf8, got List([tom,cat])")))
   ```
   
   ### Expected behavior
   
   Desired output should be similar to the following:
   
   | parent_id | count_of | tags            |
   |-----------|----------|-----------------|
   | 1         | 3        | [tom, cat] |
   | 0         | 2        | [bob, cat]      |
   
   
   ### Additional context
   
   Omitting either the `COUNT` field, or the `DISTINCT` clause in `ARRAY_AGG`, 
allows the query to complete successfully.
   
   The above was initially discovered in v17.0.0 and verified to still be 
presenting in v26.0.0.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

Reply via email to