This is an automated email from the ASF dual-hosted git repository.
gurwls223 pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git
The following commit(s) were added to refs/heads/master by this push:
new ce3b03d7bd1 [SPARK-42852][SQL] Revert NamedLambdaVariable related changes from EquivalentExpressions
ce3b03d7bd1 is described below
commit ce3b03d7bd1964cbd8dd6b87edc024b38feaaffb
Author: Peter Toth <[email protected]>
AuthorDate: Mon Mar 20 09:55:09 2023 +0900
[SPARK-42852][SQL] Revert NamedLambdaVariable related changes from EquivalentExpressions
### What changes were proposed in this pull request?
This PR reverts the follow-up PR of SPARK-41468:
https://github.com/apache/spark/pull/39046
### Why are the changes needed?
These changes are not needed and might actually cause a performance
regression, because they prevent subexpression elimination for higher-order
functions in `EquivalentExpressions`. See the related conversation here:
https://github.com/apache/spark/pull/40473#issuecomment-1474848224
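To illustrate the effect of the revert, here is a minimal, self-contained sketch. It does not use the real Catalyst classes: `Expr`, `Attr`, `Add`, `Transform`, and the two `supported*` helpers are hypothetical stand-ins, and the `PlanExpression` case of the real method is omitted. It only shows how excluding `NamedLambdaVariable` makes a whole higher-order function expression ineligible for subexpression elimination.

```scala
// Simplified stand-ins for Catalyst expression classes (assumptions, not the Spark API).
sealed trait Expr {
  def children: Seq[Expr]
  // True if this node or any descendant satisfies the predicate.
  def exists(p: Expr => Boolean): Boolean = p(this) || children.exists(_.exists(p))
}
case class Attr(name: String) extends Expr { val children: Seq[Expr] = Nil }
case class LambdaVariable(name: String) extends Expr { val children: Seq[Expr] = Nil }
case class NamedLambdaVariable(name: String) extends Expr { val children: Seq[Expr] = Nil }
case class Add(left: Expr, right: Expr) extends Expr {
  val children: Seq[Expr] = Seq(left, right)
}
// Models a higher-order function such as transform(input, x -> body).
case class Transform(input: Expr, body: Expr) extends Expr {
  val children: Seq[Expr] = Seq(input, body)
}

object SupportedExpressionSketch {
  // Shape of the check before this commit: both kinds of lambda variable are excluded.
  def supportedBefore(e: Expr): Boolean = !e.exists {
    case _: LambdaVariable => true
    case _: NamedLambdaVariable => true
    case _ => false
  }

  // Shape of the check after this commit: only LambdaVariable is excluded.
  def supportedAfter(e: Expr): Boolean = !e.exists {
    case _: LambdaVariable => true
    case _ => false
  }

  def main(args: Array[String]): Unit = {
    // transform(xs, x -> x + y)
    val hof = Transform(Attr("xs"), Add(NamedLambdaVariable("x"), Attr("y")))
    println(supportedBefore(hof)) // false: the whole expression is skipped
    println(supportedAfter(hof))  // true: the expression is a candidate again
  }
}
```

In this model, an expression that repeats the same higher-order function call can be recognized as a common subexpression only under `supportedAfter`.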
### Does this PR introduce _any_ user-facing change?
No.
### How was this patch tested?
Existing UTs.
Closes #40475 from peter-toth/SPARK-42852-revert-namedlambdavariable-changes.
Authored-by: Peter Toth <[email protected]>
Signed-off-by: Hyukjin Kwon <[email protected]>
---
.../spark/sql/catalyst/expressions/EquivalentExpressions.scala | 5 ++---
1 file changed, 2 insertions(+), 3 deletions(-)
diff --git a/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/EquivalentExpressions.scala b/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/EquivalentExpressions.scala
index 330d66a21be..3ffd9f9d887 100644
--- a/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/EquivalentExpressions.scala
+++ b/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/EquivalentExpressions.scala
@@ -144,10 +144,9 @@ class EquivalentExpressions {
   private def supportedExpression(e: Expression) = {
     !e.exists {
-      // `LambdaVariable` is usually used as a loop variable and `NamedLambdaVariable` is used in
-      // higher-order functions, which can't be evaluated ahead of the execution.
+      // `LambdaVariable` is usually used as a loop variable, which can't be evaluated ahead of the
+      // loop. So we can't evaluate sub-expressions containing `LambdaVariable` at the beginning.
       case _: LambdaVariable => true
-      case _: NamedLambdaVariable => true
       // `PlanExpression` wraps query plan. To compare query plans of `PlanExpression` on executor,
       // can cause error like NPE.