spark git commit: [SPARK-25505][SQL][FOLLOWUP] Fix for attributes cosmetically different in Pivot clause

lixiao Sun, 30 Sep 2018 22:08:30 -0700

Repository: spark
Updated Branches:
  refs/heads/branch-2.4 c886f050b -> 7b1094b54



[SPARK-25505][SQL][FOLLOWUP] Fix for attributes cosmetically different in Pivot clause

## What changes were proposed in this pull request?

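#22519 introduced a bug that occurs when the attributes in the pivot clause
are cosmetically different from the corresponding output attributes (e.g.
they differ in case). The problem is that the PR used a `Set[Attribute]`
instead of an `AttributeSet`: a plain `Set` relies on standard equality, which
also compares attribute names, whereas `AttributeSet` compares attributes by
expression ID.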
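
The difference can be illustrated with a minimal, self-contained sketch. It
does not use Spark: `ExprId`, `Attr`, `AttrSet` and `PivotRefsDemo` are
hypothetical stand-ins that only mimic the relevant behaviour, and it assumes
that a reference resolved from the pivot clause keeps the expression ID of the
output column but carries the name as written in the query.

```scala
// Simplified stand-ins for Catalyst classes; NOT the real Spark API.
final case class ExprId(id: Long)
final case class Attr(name: String, exprId: ExprId)

// Mimics the idea behind AttributeSet: membership is decided by exprId only.
final class AttrSet private (ids: Set[ExprId]) {
  def contains(a: Attr): Boolean = ids.contains(a.exprId)
}

object AttrSet {
  def apply(attrs: Iterable[Attr]): AttrSet =
    new AttrSet(attrs.map(_.exprId).toSet)
}

object PivotRefsDemo {
  // Output of the child plan, with the names used in the subquery.
  val childOutput: Seq[Attr] = Seq(
    Attr("course", ExprId(1)),
    Attr("earnings", ExprId(2)),
    Attr("a", ExprId(3))
  )

  // References as written in the pivot clause: same exprId, different case.
  val pivotRefs: Seq[Attr] = Seq(
    Attr("Course", ExprId(1)),
    Attr("Earnings", ExprId(2))
  )

  // A plain Set uses case-class equality, so the differing names never match
  // and nothing is filtered out.
  val plainSet: Set[Attr] = pivotRefs.toSet
  val viaPlainSet: Seq[Attr] = childOutput.filterNot(plainSet.contains)

  // The exprId-based set ignores the name and removes both pivot references.
  val attrSet: AttrSet = AttrSet(pivotRefs)
  val viaAttrSet: Seq[Attr] = childOutput.filterNot(a => attrSet.contains(a))

  def main(args: Array[String]): Unit = {
    println(viaPlainSet.map(_.name).mkString(", ")) // course, earnings, a
    println(viaAttrSet.map(_.name).mkString(", "))  // a
  }
}
```

With the plain set, `course` and `earnings` would wrongly remain among the
deduced group-by columns; with the ID-based set only `a` remains.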

## How was this patch tested?

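Added a unit test: an existing query in `pivot.sql` now references the pivot
and aggregate columns with different case.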

Closes #22582 from mgaido91/SPARK-25505_followup.

Authored-by: Marco Gaido <marcogaid...@gmail.com>
Signed-off-by: gatorsmile <gatorsm...@gmail.com>
(cherry picked from commit fb8f4c05657595e089b6812d97dbfee246fce06f)
Signed-off-by: gatorsmile <gatorsm...@gmail.com>


Project: http://git-wip-us.apache.org/repos/asf/spark/repo
Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/7b1094b5
Tree: http://git-wip-us.apache.org/repos/asf/spark/tree/7b1094b5
Diff: http://git-wip-us.apache.org/repos/asf/spark/diff/7b1094b5

Branch: refs/heads/branch-2.4
Commit: 7b1094b54c3810b4c0b02ba14d282f44be0813c3
Parents: c886f05
Author: Marco Gaido <marcogaid...@gmail.com>
Authored: Sun Sep 30 22:08:04 2018 -0700
Committer: gatorsmile <gatorsm...@gmail.com>
Committed: Sun Sep 30 22:08:19 2018 -0700

----------------------------------------------------------------------
 .../scala/org/apache/spark/sql/catalyst/analysis/Analyzer.scala | 3 +--
 sql/core/src/test/resources/sql-tests/inputs/pivot.sql          | 5 +++--
 sql/core/src/test/resources/sql-tests/results/pivot.sql.out     | 4 ++--
 3 files changed, 6 insertions(+), 6 deletions(-)
----------------------------------------------------------------------


http://git-wip-us.apache.org/repos/asf/spark/blob/7b1094b5/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/Analyzer.scala
----------------------------------------------------------------------
diff --git a/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/Analyzer.scala b/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/Analyzer.scala
index d303b43..fdb68dd 100644
--- a/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/Analyzer.scala
+++ b/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/Analyzer.scala
@@ -555,8 +555,7 @@ class Analyzer(
         }
         // Group-by expressions coming from SQL are implicit and need to be deduced.
         val groupByExprs = groupByExprsOpt.getOrElse {
-          val pivotColAndAggRefs =
-            (pivotColumn.references ++ aggregates.flatMap(_.references)).toSet
+          val pivotColAndAggRefs = pivotColumn.references ++ AttributeSet(aggregates)
           child.output.filterNot(pivotColAndAggRefs.contains)
         }
         val singleAgg = aggregates.size == 1

http://git-wip-us.apache.org/repos/asf/spark/blob/7b1094b5/sql/core/src/test/resources/sql-tests/inputs/pivot.sql
----------------------------------------------------------------------
diff --git a/sql/core/src/test/resources/sql-tests/inputs/pivot.sql b/sql/core/src/test/resources/sql-tests/inputs/pivot.sql
index 81547ab..c2ecd97 100644
--- a/sql/core/src/test/resources/sql-tests/inputs/pivot.sql
+++ b/sql/core/src/test/resources/sql-tests/inputs/pivot.sql
@@ -289,11 +289,12 @@ PIVOT (
 );
 
 -- grouping columns output in the same order as input
+-- correctly handle pivot columns with different cases
 SELECT * FROM (
   SELECT course, earnings, "a" as a, "z" as z, "b" as b, "y" as y, "c" as c, "x" as x, "d" as d, "w" as w
   FROM courseSales
 )
 PIVOT (
-  sum(earnings)
-  FOR course IN ('dotNET', 'Java')
+  sum(Earnings)
+  FOR Course IN ('dotNET', 'Java')
 );

http://git-wip-us.apache.org/repos/asf/spark/blob/7b1094b5/sql/core/src/test/resources/sql-tests/results/pivot.sql.out
----------------------------------------------------------------------
diff --git a/sql/core/src/test/resources/sql-tests/results/pivot.sql.out b/sql/core/src/test/resources/sql-tests/results/pivot.sql.out
index 487883a..595ce1f 100644
--- a/sql/core/src/test/resources/sql-tests/results/pivot.sql.out
+++ b/sql/core/src/test/resources/sql-tests/results/pivot.sql.out
@@ -484,8 +484,8 @@ SELECT * FROM (
   FROM courseSales
 )
 PIVOT (
-  sum(earnings)
-  FOR course IN ('dotNET', 'Java')
+  sum(Earnings)
+  FOR Course IN ('dotNET', 'Java')
 )
 -- !query 31 schema
 struct<a:string,z:string,b:string,y:string,c:string,x:string,d:string,w:string,dotNET:bigint,Java:bigint>


