[
https://issues.apache.org/jira/browse/SPARK-33068?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Hyukjin Kwon resolved SPARK-33068.
----------------------------------
Resolution: Not A Problem
> Spark 2.3 vs Spark 1.6 collect_list giving different schema
> -----------------------------------------------------------
>
> Key: SPARK-33068
> URL: https://issues.apache.org/jira/browse/SPARK-33068
> Project: Spark
> Issue Type: IT Help
> Components: Spark Submit
> Affects Versions: 2.3.4
> Reporter: Ayush Goyal
> Priority: Major
>
> Hi,
> I am migrating from Spark 1.6 to Spark 2.3. However, collect_list now
> gives me a different schema.
>
> {code:java}
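> import org.apache.spark.sql.functions.{array, collect_list, sum}
>
> // Sum d and e per (a, b, c), then collect the per-group arrays for each a.
> val df_date_agg = df
>   .groupBy($"a", $"b", $"c")
>   .agg(sum($"d").alias("data1"), sum($"e").alias("data2"))
>   .groupBy($"a")
>   .agg(
>     collect_list(array($"b", $"c", $"data1")).alias("final_data1"),
>     collect_list(array($"b", $"c", $"data2")).alias("final_data2"))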
> {code}
> When I run the code above in Spark 1.6, I get the following schema:
>
>
> {code:java}
> |-- final_data1: array (nullable = true)
> | |-- element: string (containsNull = true)
> |-- final_data2: array (nullable = true)
> | |-- element: string (containsNull = true)
> {code}
>
>
> but in Spark 2.3 the schema changes to:
>
> {code:java}
> |-- final_data1: array (nullable = true)
> | |-- element: array (containsNull = true)
> | | |-- element: string (containsNull = true)
> |-- final_data2: array (nullable = true)
> | |-- element: array (containsNull = true)
> | | |-- element: string (containsNull = true)
> {code}
>
>
> In Spark 1.6, array($"b",$"c",$"data1") is converted to a string like this:
> {code:java}
> '[2020-09-26, Ayush, 103.67]'
> {code}
> In Spark 2.3 it is converted to a WrappedArray:
> {code:java}
> WrappedArray(2020-09-26, Ayush, 103.67)
> {code}
> I want to keep my schema as it is; otherwise all the dependent code has to
> change.
>
> Thanks
>
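One possible way to keep an array-of-strings schema on Spark 2.3 is to build the bracketed string explicitly before calling collect_list, instead of collecting the arrays themselves. The sketch below is only an illustration under stated assumptions: the object name CollectListAsString and the helper bracketed are hypothetical, the columns a, b, c, d, e are taken from the report, and the exact number formatting produced by Spark 1.6 is not verified here. Note that concat_ws skips null values, so rows with nulls may render differently from the 1.6 output.

```scala
import org.apache.spark.sql.{Column, DataFrame}
import org.apache.spark.sql.functions.{col, collect_list, concat, concat_ws, lit, sum}

object CollectListAsString {
  // Hypothetical helper: renders the given columns as one "[x, y, z]" string.
  // Each column is cast to string first so concat_ws accepts numeric inputs.
  def bracketed(cols: Column*): Column =
    concat(lit("["), concat_ws(", ", cols.map(_.cast("string")): _*), lit("]"))

  // Same two-stage aggregation as in the report, but collecting strings
  // rather than arrays, so each final_data column is an array of strings.
  def aggregate(df: DataFrame): DataFrame =
    df.groupBy(col("a"), col("b"), col("c"))
      .agg(sum(col("d")).alias("data1"), sum(col("e")).alias("data2"))
      .groupBy(col("a"))
      .agg(
        collect_list(bracketed(col("b"), col("c"), col("data1"))).alias("final_data1"),
        collect_list(bracketed(col("b"), col("c"), col("data2"))).alias("final_data2"))
}
```

Calling CollectListAsString.aggregate(df) in place of the original expression should leave downstream code reading plain strings; whether the strings match the 1.6 output character for character needs to be checked against real data.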