Lauri Koobas created SPARK-34714:
------------------------------------

             Summary: collect_list(struct()) fails when used with GROUP BY
                 Key: SPARK-34714
                 URL: https://issues.apache.org/jira/browse/SPARK-34714
             Project: Spark
          Issue Type: Bug
          Components: SQL
    Affects Versions: 3.1.1
         Environment: Databricks Runtime 8.0
            Reporter: Lauri Koobas


The following query fails in DBR 8.0 / Spark 3.1.1, but works in earlier DBR 
and Spark versions:
{quote}with step_1 as (
    select 'E' as name, named_struct('subfield', 1) as field_1
)
select name, collect_list(struct(field_1.subfield))
from step_1
group by 1
{quote}
Fails with the following error message:
{quote}AnalysisException: cannot resolve 'struct(step_1.`field_1`.`subfield`)' 
due to data type mismatch: Only foldable string expressions are allowed to 
appear at odd position, got: NamePlaceholder
{quote}
If you modify the query in any of the following ways, it works:
 * if you remove the field "name" and the "group by 1" part of the query
 * if you remove the "struct()" from within the collect_list()
 * if you use "named_struct()" instead of "struct()" within the collect_list()
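
For reference, the last variant would look roughly like the query below. This 
is only a sketch of that workaround: the explicit field name 'subfield' is an 
assumption, chosen to match the name struct() would be expected to infer.
{quote}with step_1 as (
    select 'E' as name, named_struct('subfield', 1) as field_1
)
select name, collect_list(named_struct('subfield', field_1.subfield))
from step_1
group by 1
{quote}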

collect_set() is broken in the same way, and other related functions may be 
too, but I haven't done thorough testing.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

