[ https://issues.apache.org/jira/browse/SPARK-28654?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Wenchen Fan resolved SPARK-28654.
---------------------------------
    Fix Version/s: 3.0.0
       Resolution: Fixed

Issue resolved by pull request 25386
[https://github.com/apache/spark/pull/25386]

> Move "Extract Python UDFs" to the last in optimizer
> ---------------------------------------------------
>
>                 Key: SPARK-28654
>                 URL: https://issues.apache.org/jira/browse/SPARK-28654
>             Project: Spark
>          Issue Type: Improvement
>          Components: SQL
>    Affects Versions: 3.0.0
>            Reporter: Hyukjin Kwon
>            Assignee: Hyukjin Kwon
>            Priority: Major
>             Fix For: 3.0.0
>
>
> Plans after "Extract Python UDFs" are very fragile and easily broken by rules that run afterwards.
> For instance, if we add a rule such as {{PushDownPredicates}}, the optimization is rolled back as below:
> {code}
> === Applying Rule org.apache.spark.sql.catalyst.optimizer.PushDownPredicates ===
> !Filter (dummyUDF(a#7, c#18) = dummyUDF(d#19, c#18))   Join Cross, (dummyUDF(a#7, c#18) = dummyUDF(d#19, c#18))
> !+- Join Cross                                         :- Project [_1#2 AS a#7, _2#3 AS b#8]
> !   :- Project [_1#2 AS a#7, _2#3 AS b#8]              :  +- LocalRelation [_1#2, _2#3]
> !   :  +- LocalRelation [_1#2, _2#3]                   +- Project [_1#13 AS c#18, _2#14 AS d#19]
> !   +- Project [_1#13 AS c#18, _2#14 AS d#19]             +- LocalRelation [_1#13, _2#14]
> !      +- LocalRelation [_1#13, _2#14]
> {code}
> It seems we should handle the Python UDF cases last, even after the post hoc rules.