[
https://issues.apache.org/jira/browse/SPARK-17337?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15452419#comment-15452419
]
Nattavut Sutyanyong commented on SPARK-17337:
---------------------------------------------
@hvanhovell has observed that by disabling the rule
{{PushPredicateThroughJoin}} will solve this problem. The predicate is pushed
down because the code thinks all the columns in the LeftAnti predicate
reference to the T2, the right table of (T1 LOJ T2). The fact is some of the
C2#77 in the LeftAnti predicate are from the T2 in the NOT IN subquery.
However, {{PushPredicateThroughJoin}} is not the root cause of the problem. The
problem is the identifier {{[value#75 AS c2#77]}} is used in two places from
the two unrelated references of {{SubqueryAlias t2}}. This can be demonstrated
by replacing the T2 in the subquery by a different name, SQ in the example
below. We will not see the LeftAnti predicate pushed down below the LOJ.
{noformat}
Seq(1).toDF("cx").createOrReplaceTempView("sq")
scala> sql("select * from (select t2.c2+1 as c3 from t1 left join t2 on
t1.c1=t2.c2) t3 where c3 not in (select cx from sq)").explain(true)
...
== Optimized Logical Plan ==
Project [(c2#77 + 1) AS c3#137]
+- Join LeftAnti, (isnull(((c2#77 + 1) = cx#133)) || ((c2#77 + 1) = cx#133))
:- Join LeftOuter, (c1#3 = c2#77)
: :- LocalRelation [c1#3]
: +- LocalRelation [c2#77]
+- LocalRelation [cx#133]
...
{noformat}
> Incomplete algorithm for name resolution in Catalyst paser may lead to
> incorrect result
> ---------------------------------------------------------------------------------------
>
> Key: SPARK-17337
> URL: https://issues.apache.org/jira/browse/SPARK-17337
> Project: Spark
> Issue Type: Bug
> Components: SQL
> Affects Versions: 2.0.0
> Reporter: Nattavut Sutyanyong
>
> While investigating SPARK-16951, I found an incorrect results case from a NOT
> IN subquery. I thought originally it is an edge case. Further investigation
> found this is a more general problem.
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]