Github user xuanyuanking commented on a diff in the pull request:
https://github.com/apache/spark/pull/22326#discussion_r219675105
--- Diff: sql/core/src/main/scala/org/apache/spark/sql/Dataset.scala ---
@@ -995,7 +995,8 @@ class Dataset[T] private[sql](
// After the cloning, left and right side will have distinct expression ids.
val plan = withPlan(
Join(logicalPlan, right.logicalPlan, JoinType(joinType),
Some(joinExprs.expr)))
- .queryExecution.analyzed.asInstanceOf[Join]
+ .queryExecution.analyzed
+ val joinPlan = plan.collectFirst { case j: Join => j }.get
--- End diff --
For reviewers: this change is needed because the rule
`HandlePythonUDFInJoinCondition` breaks the assumption that the analyzed join
plan is always a `Join` node. Once the rule for handling Python UDFs is in
place, a Filter or Project node is added on top of the Join, so the
`asInstanceOf[Join]` cast no longer holds and the Join is located with
`collectFirst` instead.
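
To illustrate the point, here is a minimal, self-contained sketch that does not depend on Spark. `Plan`, `Relation`, `Join`, `Filter`, and `JoinPlanSketch` are toy stand-ins invented for this example, not Catalyst's classes, and the `collectFirst` here only imitates the pre-order search of Catalyst's `TreeNode.collectFirst`. It shows why a cast on the root node stops working once another node wraps the Join, while a tree search still finds it:

```scala
object JoinPlanSketch {
  // Toy plan nodes (hypothetical; NOT Spark's LogicalPlan hierarchy).
  sealed trait Plan {
    def children: Seq[Plan]

    // Pre-order search: try this node first, then each child in order.
    def collectFirst[B](pf: PartialFunction[Plan, B]): Option[B] =
      pf.lift(this).orElse(
        children.foldLeft(Option.empty[B]) { (acc, c) =>
          acc.orElse(c.collectFirst(pf))
        })
  }

  final case class Relation(name: String) extends Plan {
    def children: Seq[Plan] = Nil
  }
  final case class Join(left: Plan, right: Plan) extends Plan {
    def children: Seq[Plan] = Seq(left, right)
  }
  final case class Filter(condition: String, child: Plan) extends Plan {
    def children: Seq[Plan] = Seq(child)
  }

  val join: Join = Join(Relation("left"), Relation("right"))

  // Assumed shape after analysis: a Filter now sits on top of the Join.
  val analyzed: Plan = Filter("pythonUdf(a, b)", join)

  def main(args: Array[String]): Unit = {
    // The root is a Filter, so analyzed.asInstanceOf[Join] would throw
    // a ClassCastException.
    println(analyzed.isInstanceOf[Join]) // false

    // Searching the tree still finds the Join below the Filter.
    val joinPlan = analyzed.collectFirst { case j: Join => j }.get
    println(joinPlan == join)
  }
}
```

Note that `.get` on the result still assumes a Join exists somewhere in the plan; if that could ever fail, handling the `Option` explicitly would be safer.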
---