Repository: spark
Updated Branches:
  refs/heads/branch-2.0 5a4a188fe -> 0ab195886

[SPARK-14986][SQL] Return correct result for empty LATERAL VIEW OUTER

## What changes were proposed in this pull request?
A Generate with the `outer` flag enabled should always return one or more rows for every input row. The optimizer currently violates this by rewriting `outer` Generates that do not contain columns of the child plan into an unjoined generate, for example:
```sql
select e from a lateral view outer explode(a.b) as e
```
The result of this is that `outer` Generate does not produce output at all when the Generators' input expression is empty. This PR fixes this.

## How was this patch tested?
Added test case to `SQLQuerySuite`.

Author: Herman van Hovell <[email protected]>

Closes #12906 from hvanhovell/SPARK-14986.

(cherry picked from commit d28c67544b26c38d51a31d1f8dac3fc23860e1ef)
Signed-off-by: Yin Huai <[email protected]>

Project: http://git-wip-us.apache.org/repos/asf/spark/repo
Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/0ab19588
Tree: http://git-wip-us.apache.org/repos/asf/spark/tree/0ab19588
Diff: http://git-wip-us.apache.org/repos/asf/spark/diff/0ab19588

Branch: refs/heads/branch-2.0
Commit: 0ab1958868e265dc13c066f5ec26e573a94a2490
Parents: 5a4a188
Author: Herman van Hovell <[email protected]>
Authored: Tue May 10 12:47:31 2016 -0700
Committer: Yin Huai <[email protected]>
Committed: Tue May 10 12:47:45 2016 -0700

----------------------------------------------------------------------
 .../org/apache/spark/sql/catalyst/optimizer/Optimizer.scala | 3 ++-
 .../src/test/scala/org/apache/spark/sql/SQLQuerySuite.scala | 7 +++++++
 2 files changed, 9 insertions(+), 1 deletion(-)
----------------------------------------------------------------------

http://git-wip-us.apache.org/repos/asf/spark/blob/0ab19588/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/Optimizer.scala
----------------------------------------------------------------------
diff --git a/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/Optimizer.scala b/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/Optimizer.scala
index a3ab89d..350b601 100644
--- a/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/Optimizer.scala
+++ b/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/Optimizer.scala
@@ -362,7 +362,8 @@ object ColumnPruning extends Rule[LogicalPlan] {
       g.copy(child = prunedChild(g.child, g.references))
 
     // Turn off `join` for Generate if no column from it's child is used
-    case p @ Project(_, g: Generate) if g.join && p.references.subsetOf(g.generatedSet) =>
+    case p @ Project(_, g: Generate)
+        if g.join && !g.outer && p.references.subsetOf(g.generatedSet) =>
       p.copy(child = g.copy(join = false))
 
     // Eliminate unneeded attributes from right side of a Left Existence Join.

http://git-wip-us.apache.org/repos/asf/spark/blob/0ab19588/sql/core/src/test/scala/org/apache/spark/sql/SQLQuerySuite.scala
----------------------------------------------------------------------
diff --git a/sql/core/src/test/scala/org/apache/spark/sql/SQLQuerySuite.scala b/sql/core/src/test/scala/org/apache/spark/sql/SQLQuerySuite.scala
index e401abe..4ef4b48 100644
--- a/sql/core/src/test/scala/org/apache/spark/sql/SQLQuerySuite.scala
+++ b/sql/core/src/test/scala/org/apache/spark/sql/SQLQuerySuite.scala
@@ -2473,4 +2473,11 @@ class SQLQuerySuite extends QueryTest with SharedSQLContext {
         Row("r3c1x", "r3c2", "t1r3c3", "r3c2", "t1r3c3") :: Nil)
     }
   }
+
+  test("SPARK-14986: Outer lateral view with empty generate expression") {
+    checkAnswer(
+      sql("select nil from (select 1 as x ) x lateral view outer explode(array()) n as nil"),
+      Row(null) :: Nil
+    )
+  }
 }
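
For readers unfamiliar with the `outer` contract described in the commit message, the following is a minimal sketch in plain Scala collections. It is not Spark code and does not reflect how Catalyst or the physical Generate operator is implemented; `OuterGenerateSketch` and `lateralViewExplode` are hypothetical names introduced only for illustration. It models the expectation encoded in the new test: with `outer` enabled, an input row whose generator yields nothing still produces one row, with a null (here `None`) in the generated column.

```scala
// Toy model only: a row is a pair of (input value, optional generated element).
object OuterGenerateSketch {

  // Hypothetical stand-in for `lateral view [outer] explode(...)`.
  // `gen` plays the role of the generator applied to each input row.
  def lateralViewExplode[A, B](input: Seq[A], outer: Boolean)(
      gen: A => Seq[B]): Seq[(A, Option[B])] =
    input.flatMap { row =>
      val generated = gen(row)
      if (generated.isEmpty && outer) {
        // outer: keep the input row and pad the generated column with "null"
        Seq((row, Option.empty[B]))
      } else {
        // non-outer (or non-empty output): one row per generated element
        generated.map(e => (row, Some(e): Option[B]))
      }
    }

  def main(args: Array[String]): Unit = {
    val input = Seq(1)                                  // models `select 1 as x`
    val emptyArray: Int => Seq[String] = _ => Seq.empty // models `explode(array())`

    // Project only the generated column, as the regression test's query does.
    val outerRows = lateralViewExplode(input, outer = true)(emptyArray).map(_._2)
    val innerRows = lateralViewExplode(input, outer = false)(emptyArray).map(_._2)

    assert(outerRows == Seq(None)) // one null row, matching Row(null) :: Nil
    assert(innerRows.isEmpty)      // without `outer`, the input row is dropped
  }
}
```

In this model, the pre-patch behaviour corresponds to the `outer = false` result being returned for a query that asked for `outer`. The added `!g.outer` guard keeps the `join = false` rewrite from applying to outer Generates.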
