[
https://issues.apache.org/jira/browse/SPARK-21759?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16131493#comment-16131493
]
Liang-Chi Hsieh commented on SPARK-21759:
-----------------------------------------
Submitted PR at https://github.com/apache/spark/pull/18968
> In.checkInputDataTypes should not wrongly report unresolved plans for IN
> correlated subquery
> --------------------------------------------------------------------------------------------
>
> Key: SPARK-21759
> URL: https://issues.apache.org/jira/browse/SPARK-21759
> Project: Spark
> Issue Type: Bug
> Components: SQL
> Affects Versions: 2.2.0
> Reporter: Liang-Chi Hsieh
>
> With the check for structural integrity proposed in SPARK-21726, I found that
> an optimization rule {{PullupCorrelatedPredicates}} can produce unresolved
> plans.
> For a correlated IN query like:
> {code}
> Project [a#0]
> +- Filter a#0 IN (list#4 [b#1])
> : +- Project [c#2]
> : +- Filter (outer(b#1) < d#3)
> : +- LocalRelation <empty>, [c#2, d#3]
> +- LocalRelation <empty>, [a#0, b#1]
> {code}
> After {{PullupCorrelatedPredicates}}, it produces query plan like:
> {code}
> 'Project [a#0]
> +- 'Filter a#0 IN (list#4 [(b#1 < d#3)])
> : +- Project [c#2, d#3]
> : +- LocalRelation <empty>, [c#2, d#3]
> +- LocalRelation <empty>, [a#0, b#1]
> {code}
> Because the correlated predicate involves another attribute {{d#3}} in
> subquery, it has been pulled out and added into the {{Project}} on the top of
> the subquery.
> When {{list}} in {{In}} contains just one {{ListQuery}},
> {{In.checkInputDataTypes}} checks if the size of {{value}} expressions
> matches the output size of subquery. In the above example, there is only
> {{value}} expression and the subquery output has two attributes {{c#2, d#3}},
> so it fails the check and {{In.resolved}} returns {{false}}.
> We should not let {{In.checkInputDataTypes}} wrongly report unresolved plans
> to fail the structural integrity check.
--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]