Github user nsyca commented on the issue:
https://github.com/apache/spark/pull/14719
@sarutak, on the surface the problem appears to be in the optimizer code,
but the root cause is that the column/ExprId c2#77 from T2 is
indistinguishable between the two streams that reference the relation T2: one in
the right table of the LEFT JOIN and the other in the IN subquery. As a result,
the optimizer rule ```PushPredicateThroughJoin``` treats the predicate
c2#77 + 1 (from the projection of the LEFT JOIN) = c2#77 (from the IN
subquery converted to a semi-join) as a local predicate over the LEFT JOIN and
therefore pushes it down below the LEFT JOIN.
My comments in
[SPARK-17337](https://issues.apache.org/jira/browse/SPARK-17337) on 31/Aug/16
14:42 and 14:43 explain this in more detail.
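
To make the ExprId collision concrete, here is a toy model in Python. It is not Spark or Catalyst code: `is_local_to` and the ids 10 and 99 are hypothetical, and the only assumption carried over is that a predicate looks local to a plan when every attribute id it references appears in that plan's output.

```python
# Toy model, not Spark code: attributes are identified only by an integer id,
# loosely mirroring how attributes are told apart by ExprId.

def is_local_to(predicate_refs, plan_output):
    """A predicate looks local to a plan if the plan produces every id it references."""
    return set(predicate_refs) <= set(plan_output)

# Output ids of the LEFT JOIN; 77 stands for c2#77 from T2 on the right side.
# The id 10 for a column of the left table is made up for illustration.
left_join_output = {10, 77}

# Condition "c2#77 + 1 = c2#77": both operands carry the same id, although
# the right-hand c2 is meant to come from the IN subquery's copy of T2.
shared_id_refs = [77, 77]
looks_local_with_shared_id = is_local_to(shared_id_refs, left_join_output)

# If the subquery's copy of T2 had a fresh id (99 is hypothetical), the
# condition would reference an id that the LEFT JOIN does not produce.
fresh_id_refs = [77, 99]
looks_local_with_fresh_id = is_local_to(fresh_id_refs, left_join_output)

print(looks_local_with_shared_id)
print(looks_local_with_fresh_id)
```

With the shared id the check succeeds, which is the situation in which the predicate gets pushed below the LEFT JOIN; with distinct ids it does not.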