Hello,

I am new to Spark and just running some tests to get familiar with the APIs.

When calling the rollup function on my DataFrame, I get different results
when I alias the columns I am grouping on (see the example data set below).
I was expecting the alias function to affect only the column name. Why does
it also affect the rollup results?
(I know I can rename my columns after the rollup call using the
withColumnRenamed function; my question is just to get a better
understanding of the alias function.)

scala> df.show
+----+------+-----+
|Name|  Game|Score|
+----+------+-----+
| Bob|Game 1|   20|
| Bob|Game 2|   30|
| Lea|Game 1|   25|
| Lea|Game 2|   30|
| Ben|Game 1|    5|
| Ben|Game 3|   35|
| Bob|Game 3|   15|
+----+------+-----+

//rollup results as expected
scala> df.rollup(df("Name"), df("Game")).sum().orderBy("Name", "Game").show
+----+------+----------+
|Name|  Game|SUM(Score)|
+----+------+----------+
|null|  null|       160|
| Ben|  null|        40|
| Ben|Game 1|         5|
| Ben|Game 3|        35|
| Bob|  null|        65|
| Bob|Game 1|        20|
| Bob|Game 2|        30|
| Bob|Game 3|        15|
| Lea|  null|        55|
| Lea|Game 1|        25|
| Lea|Game 2|        30|
+----+------+----------+

//rollup with aliases returns strange results
scala> df.rollup(df("Name") as "Player", df("Game") as "Round").sum().orderBy("Player", "Round").show
+------+------+----------+
|Player| Round|SUM(Score)|
+------+------+----------+
|   Ben|Game 1|         5|
|   Ben|Game 1|         5|
|   Ben|Game 1|         5|
|   Ben|Game 3|        35|
|   Ben|Game 3|        35|
|   Ben|Game 3|        35|
|   Bob|Game 1|        20|
|   Bob|Game 1|        20|
|   Bob|Game 1|        20|
|   Bob|Game 2|        30|
|   Bob|Game 2|        30|
|   Bob|Game 2|        30|
|   Bob|Game 3|        15|
|   Bob|Game 3|        15|
|   Bob|Game 3|        15|
|   Lea|Game 1|        25|
|   Lea|Game 1|        25|
|   Lea|Game 1|        25|
|   Lea|Game 2|        30|
|   Lea|Game 2|        30|
+------+------+----------+
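
For reference, the withColumnRenamed workaround I mentioned above would look
roughly like the sketch below. It assumes it is run inside spark-shell (where
the implicits needed for toDF are already imported) and rebuilds the same
data set; the names `data` and `renamed` are only for illustration.

```scala
// Run inside spark-shell, where the implicits needed for toDF are pre-imported.
// Same rows as the df shown earlier; `data` and `renamed` are illustrative names.
val data = Seq(
  ("Bob", "Game 1", 20),
  ("Bob", "Game 2", 30),
  ("Lea", "Game 1", 25),
  ("Lea", "Game 2", 30),
  ("Ben", "Game 1", 5),
  ("Ben", "Game 3", 35),
  ("Bob", "Game 3", 15)
).toDF("Name", "Game", "Score")

// Roll up on the original column names first, then rename the grouping columns.
val renamed = data
  .rollup(data("Name"), data("Game"))
  .sum("Score")
  .withColumnRenamed("Name", "Player")
  .withColumnRenamed("Game", "Round")

renamed.orderBy("Player", "Round").show()
```

This keeps the aliasing out of the rollup call itself, so the subtotal rows
should be the same as in the first (un-aliased) result.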


Thanks in advance for your help,

Isabelle
