Github user maryannxue commented on a diff in the pull request:
https://github.com/apache/spark/pull/22030#discussion_r208410422
--- Diff: sql/core/src/main/scala/org/apache/spark/sql/RelationalGroupedDataset.scala ---
@@ -384,6 +392,10 @@ class RelationalGroupedDataset protected[sql](
           .sort(pivotColumn)  // ensure that the output columns are in a consistent logical order
           .collect()
           .map(_.get(0))
+          .collect {
+            case row: GenericRow => struct(row.values.map(lit): _*)
--- End diff --
I suspect this will not work for nested struct types, for example multiple
pivot columns where one of them is itself a nested type. Could you please add
a test like:
```
test("pivoting column list") {
  val expected = ...
  val df = trainingSales
    .groupBy($"sales.year")
    .pivot(struct($"sales", $"training"))
    .agg(sum($"sales.earnings"))
  checkAnswer(df, expected)
}
```
Can we also check whether it works for other complex nested types, such as
Array(Struct(...))?
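
To make the concern concrete: `lit` handles primitives and strings, but a collected pivot value that is itself a nested `Row` (or a sequence of rows) is not something it can turn into a literal directly, so the conversion probably needs to recurse. Below is a minimal sketch of one way that could look. `PivotLiterals` and `toLiteralColumn` are hypothetical names for illustration and are not part of this PR; only `lit`, `struct`, `array`, and `Row.toSeq` are existing Spark APIs. Note that this sketch does not carry over the original struct field names, and it does not treat empty sequences specially.

```scala
import org.apache.spark.sql.{Column, Row}
import org.apache.spark.sql.functions.{array, lit, struct}

// Hypothetical helper, not part of the PR: converts a collected pivot value
// into a literal Column, recursing into nested rows and sequences.
object PivotLiterals {
  def toLiteralColumn(value: Any): Column = value match {
    // A (possibly nested) struct arrives as a Row: convert each field recursively.
    case row: Row => struct(row.toSeq.map(toLiteralColumn): _*)
    // An array column arrives as a Seq: covers cases like Array(Struct(...)).
    case seq: Seq[_] => array(seq.map(toLiteralColumn): _*)
    // Primitives, strings, and null are left to lit.
    case other => lit(other)
  }
}
```

If something along these lines were used, the `case row: GenericRow` branch in the diff could delegate to it instead of mapping `lit` over `row.values` directly, and the tests requested above would exercise both the nested struct and the array-of-struct paths.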
---