viirya commented on a change in pull request #25925:
[SPARK-29239][SPARK-29221][SQL] Subquery should not cause NPE when eliminating subexpression
URL: https://github.com/apache/spark/pull/25925#discussion_r328442649
 
 

 ##########
 File path: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/EquivalentExpressions.scala
 ##########
 @@ -72,7 +73,10 @@ class EquivalentExpressions {
     val skip = expr.isInstanceOf[LeafExpression] ||
       // `LambdaVariable` is usually used as a loop variable, which can't be evaluated ahead of the
       // loop. So we can't evaluate sub-expressions containing `LambdaVariable` at the beginning.
-      expr.find(_.isInstanceOf[LambdaVariable]).isDefined
+      expr.find(_.isInstanceOf[LambdaVariable]).isDefined ||
+      // `PlanExpression` wraps query plan. To compare query plans of `PlanExpression` on executor,
+      // can cause error like NPE.
+      (expr.isInstanceOf[PlanExpression[_]] && TaskContext.get != null)
 
 Review comment:
   I'm not sure I understand your question correctly. The PlanExpressions of a
SparkPlan are evaluated and updated with their values (e.g., via
ExecSubqueryExpression.updateResult) before the query begins to run. The values
are kept in the PlanExpression, so when eval is called on a PlanExpression on
the executor side, it simply returns the kept value. I don't think we really
evaluate a PlanExpression on the executor side.
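
   To illustrate the pattern described above, here is a minimal sketch. It
uses simplified, hypothetical stand-ins (`Expr`, `CachedSubquery`,
`CachedSubqueryDemo`), not Spark's actual PlanExpression or
ExecSubqueryExpression classes; it only assumes that the result is computed
once up front and that eval returns the kept value.

```scala
// Hypothetical, simplified stand-ins for illustration only; these are NOT
// Spark's real classes or signatures.
trait Expr {
  def eval(): Any
}

// Models a subquery-style expression that wraps a "plan" (here just a function).
final class CachedSubquery(plan: () => Any) extends Expr {
  private var result: Any = null
  private var updated: Boolean = false

  // Intended to be called once, before the query starts running.
  def updateResult(): Unit = {
    result = plan()
    updated = true
  }

  // Returns the kept value; the wrapped plan is never touched here.
  override def eval(): Any = {
    require(updated, "updateResult() must be called before eval()")
    result
  }
}

object CachedSubqueryDemo {
  def main(args: Array[String]): Unit = {
    var runs = 0
    val sub = new CachedSubquery(() => { runs += 1; 42 })

    sub.updateResult()   // the plan runs exactly once, up front
    val first = sub.eval()
    val second = sub.eval()

    // Both calls see the same kept value, and the plan ran only once.
    println(s"first=$first second=$second runs=$runs")
  }
}
```

   In this sketch, eval never needs the wrapped plan, which mirrors the point
that the executor side only reads the kept value.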
   
   
   
