peter-toth commented on a change in pull request #30203:
URL: https://github.com/apache/spark/pull/30203#discussion_r516836683



##########
File path: sql/core/src/main/scala/org/apache/spark/sql/execution/python/ExtractPythonUDFs.scala
##########
@@ -218,13 +218,22 @@ object ExtractPythonUDFs extends Rule[LogicalPlan] with PredicateHelper {
     }
   }
 
+  private def canonicalizeDeterministic(u: PythonUDF) = {

Review comment:
I think @cloud-fan meant that if we changed the default to
non-deterministic, some of the optimization rules would no longer handle those
UDF expressions and would leave them untouched. E.g. `PushDownPredicates` would
not push them down, which could cause a performance regression.

IMHO, it is the user's responsibility to set the deterministic flag correctly,
regardless of what the default is. And if a UDF is flagged as deterministic, we
should do the optimizations.
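
To make the concern concrete, here is a rough sketch in plain Python. It is a toy model, not Spark's optimizer code: `Predicate`, `split_for_pushdown` and `plan` are hypothetical names, and the rule "push down only the leading deterministic conjuncts" is an assumption made for illustration.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class Predicate:
    """Hypothetical stand-in for a filter condition that calls a UDF."""
    name: str
    deterministic: bool


def split_for_pushdown(
    conjuncts: List[Predicate],
) -> Tuple[List[Predicate], List[Predicate]]:
    """Return (pushed, kept) for a list of AND-ed filter conditions.

    Assumption of this sketch: only the leading run of deterministic
    conjuncts may move below another operator; everything from the
    first non-deterministic conjunct onwards stays where it is.
    """
    pushed: List[Predicate] = []
    for p in conjuncts:
        if not p.deterministic:
            break
        pushed.append(p)
    kept = conjuncts[len(pushed):]
    return pushed, kept


def plan(default_deterministic: bool) -> Tuple[List[str], List[str]]:
    """Show what happens to UDFs whose flag is left at the default."""
    conjuncts = [
        Predicate("udf_a(x) > 0", default_deterministic),
        Predicate("udf_b(y) = 1", default_deterministic),
    ]
    pushed, kept = split_for_pushdown(conjuncts)
    return [p.name for p in pushed], [p.name for p in kept]
```

With `plan(True)` both conditions are pushed down; with `plan(False)` neither is, even though nothing about the UDFs themselves changed. That is the kind of silent regression a changed default could introduce for users who never set the flag.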



