Github user chenghao-intel commented on the pull request:
https://github.com/apache/spark/pull/9055#issuecomment-149907407
Thank you @yhuai for reviewing this.
I've added some more docs to this PR; hopefully it makes more sense now.
First, I agree that we should build general logic to partially resolve the
correlated conditions within the subquery, but it's probably not that easy,
particularly because we need to give a more concise error message to the end
user. So my suggestion is to leave it as a future improvement; we will probably
have a better idea of how to simplify it once enough features are supported by
the follow-up PRs (see my TODO in the description). For now, the limited set of
patterns actually works for most cases.
Second, I totally agree with the join type comments (LeftSemiJoin <->
LeftSemi <-> LeftAnti). My motivation for making a parent class for
LeftSemi / LeftAnti is to reduce the code changes in `Optimizer` and
`SparkStrategies`; maybe I should rename the parent class to
`LeftSemiOrAntiJoin`. The same goes for the operators' names, since they no
longer handle only the `LeftSemiXXX` case but also support `LeftAntixxx`.
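To illustrate why a shared parent for the two join types is attractive, here is
a minimal sketch in plain Python (not Spark code). The names `SemiOrAntiKind`
and `left_semi_or_anti_join` are hypothetical and exist only for this example.
It assumes simple in-memory lists and ignores NULL handling. The point is that
the semi and anti cases differ only in whether matched or unmatched left rows
are kept, so one code path can serve both:

```python
from enum import Enum
from typing import Callable, Iterable, List, TypeVar

L = TypeVar("L")
R = TypeVar("R")


class SemiOrAntiKind(Enum):
    """Hypothetical tag standing in for the two join types under one parent."""
    LEFT_SEMI = "left_semi"
    LEFT_ANTI = "left_anti"


def left_semi_or_anti_join(
    left: Iterable[L],
    right: Iterable[R],
    condition: Callable[[L, R], bool],
    kind: SemiOrAntiKind,
) -> List[L]:
    """Return left rows only; the right side is never part of the output.

    LEFT_SEMI keeps left rows that have at least one match on the right.
    LEFT_ANTI keeps left rows that have no match on the right.
    """
    right_rows = list(right)  # materialize once so it can be scanned repeatedly
    keep_matched = kind is SemiOrAntiKind.LEFT_SEMI
    result: List[L] = []
    for left_row in left:
        matched = any(condition(left_row, right_row) for right_row in right_rows)
        if matched == keep_matched:
            result.append(left_row)
    return result


orders = [1, 2, 3, 4]
shipped = [2, 4, 5]
same_id = lambda o, s: o == s

semi = left_semi_or_anti_join(orders, shipped, same_id, SemiOrAntiKind.LEFT_SEMI)
anti = left_semi_or_anti_join(orders, shipped, same_id, SemiOrAntiKind.LEFT_ANTI)
```

Any rule or strategy that only cares about "left rows filtered by the right
side" could match on the shared kind instead of handling each join type
separately, which is the code-change reduction I had in mind.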
Still, I hope we can merge this PR for the 1.6 release, as almost a year has
passed since the previous PRs were created in #3249 & #4812. I will keep
updating the code once we have general agreement on the implementation.