GitHub user DonnyZone opened a pull request:
https://github.com/apache/spark/pull/19178
[SPARK-21966][SQL]ResolveMissingReference rule should not ignore Union
## What changes were proposed in this pull request?
https://issues.apache.org/jira/browse/SPARK-21966
The problem can be reproduced with the following example.
`val df1 = spark.createDataFrame(Seq((1, 1), (2, 1), (2, 2))).toDF("a", "b")
val df2 = spark.createDataFrame(Seq((1, 1), (1, 2), (2, 3))).toDF("a", "b")
val df3 = df1.cube("a").sum("b")
val df4 = df2.cube("a").sum("b")
df3.union(df4).filter("grouping_id()=0").show()`
An ``org.apache.spark.sql.AnalysisException: cannot resolve '`spark_grouping_id`' given input columns``
is thrown because the ResolveMissingReference rule ignores the Union operator.
This PR fixes the issue.
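For anyone who wants to try the example outside `spark-shell`, here is a minimal standalone sketch of the same query. The object name `Spark21966Repro`, the helper `unionedCubes`, and the `local[*]` master are assumptions made for illustration and are not part of the patch. Which branch it prints should depend on whether the Spark build in use includes this fix.

```scala
import org.apache.spark.sql.{AnalysisException, DataFrame, SparkSession}
import scala.util.{Failure, Success, Try}

// Hypothetical wrapper around the example from the PR description.
object Spark21966Repro {

  // Builds the union of the two cube aggregates (df3.union(df4) above).
  def unionedCubes(spark: SparkSession): DataFrame = {
    val df1 = spark.createDataFrame(Seq((1, 1), (2, 1), (2, 2))).toDF("a", "b")
    val df2 = spark.createDataFrame(Seq((1, 1), (1, 2), (2, 3))).toDF("a", "b")
    df1.cube("a").sum("b").union(df2.cube("a").sum("b"))
  }

  def main(args: Array[String]): Unit = {
    // Assumption: a local session is enough to exercise the analyzer.
    val spark = SparkSession.builder()
      .master("local[*]")
      .appName("SPARK-21966")
      .getOrCreate()
    try {
      // The plan is analyzed when the filter is applied, so the whole
      // expression is wrapped rather than only the collect() call.
      Try(unionedCubes(spark).filter("grouping_id()=0").collect()) match {
        case Success(rows) =>
          println(s"Query resolved, ${rows.length} row(s)")
        case Failure(e: AnalysisException) =>
          println(s"Analysis failed: ${e.getMessage}")
        case Failure(other) =>
          throw other
      }
    } finally {
      spark.stop()
    }
  }
}
```

Wrapping the query in `Try` keeps the sketch usable both before and after the change, since it reports the `AnalysisException` instead of aborting.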
## How was this patch tested?
unit tests
You can merge this pull request into a Git repository by running:
$ git pull https://github.com/DonnyZone/spark ResolveMissingReference
Alternatively you can review and apply these changes as the patch at:
https://github.com/apache/spark/pull/19178.patch
To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:
This closes #19178
----
commit 29ae28598c69bb4cb26ad66c86e6a73054e5fbef
Author: donnyzone <[email protected]>
Date: 2017-09-10T09:26:36Z
SPARK-21966
----