Github user mgaido91 commented on a diff in the pull request:
https://github.com/apache/spark/pull/22326#discussion_r220568484
--- Diff:
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/joins.scala
---
@@ -152,3 +153,53 @@ object EliminateOuterJoin extends Rule[LogicalPlan]
with PredicateHelper {
if (j.joinType == newJoinType) f else Filter(condition,
j.copy(joinType = newJoinType))
}
}
+
+/**
+ * PythonUDF in join condition can not be evaluated, this rule will detect
the PythonUDF
+ * and pull them out from join condition. For python udf accessing
attributes from only one side,
+ * they would be pushed down by operation push down rules. If not(e.g.
user disables filter push
--- End diff --
nits:
- `they are`
- missing space before `(`
---
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]