srowen commented on a change in pull request #30203:
URL: https://github.com/apache/spark/pull/30203#discussion_r516730762
##########
File path:
sql/core/src/main/scala/org/apache/spark/sql/execution/python/ExtractPythonUDFs.scala
##########
@@ -218,13 +218,22 @@ object ExtractPythonUDFs extends Rule[LogicalPlan] with
PredicateHelper {
}
}
+ private def canonicalizeDeterministic(u: PythonUDF) = {
Review comment:
Indeed, the example in the PR should have c and d equal no matter what,
ideally, as it shouldn't be evaluated multiple times. But then, being
deterministic doesn't matter; how would changing the default be the right fix?
I don't doubt it should be 'fixed' so that even non-deterministic UDFs aren't
surprisingly reevaluated. Is the point that the current handling for
non-deterministic UDFs slower, even though it evaluates them just once by
design? that seems weird.
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
[email protected]
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]