Github user mgaido91 commented on the issue: https://github.com/apache/spark/pull/22518

@gengliangwang no, let me cite and explain the PR description. I am not sure how to improve it, but if you have suggestions I am happy to hear them.

The main point of the PR is to address an issue which arises:

> When a `ExecSubqueryExpression` is copied

Now the point is: can this condition happen? The answer is yes, and one situation in which it happens (as reported in the JIRA) is:

> when a filter containing a scalar subquery is pushed to a DataSource.

So in the plan we have two `ExecSubqueryExpression`s, each with a copy of the same `SubqueryExec`. The problem which arises in this condition is that:

> `ReuseSubquery` becomes useless, as replacing the `SubqueryExec` is ignored since the new plan is equal to the previous one.

So this results in the subquery being executed twice (the two `SubqueryExec` instances are distinct objects, even though they are equal).
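To make the "equal but distinct" point concrete, here is a minimal toy model in Python. It is not Spark code: `ToySubquery`, `replace_if_changed`, `replace_by_identity` and `count_executions` are hypothetical names invented for this sketch, and the assumption that a replacement is dropped whenever the new node compares equal to the old one is only a simplified reading of the behaviour described above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ToySubquery:
    """Hypothetical stand-in for a physical subquery node; equality is structural."""
    query: str


def replace_if_changed(old, new):
    """Toy transform step (assumption): keep the old node when the new one compares equal."""
    return old if old == new else new


def replace_by_identity(old, new):
    """Contrast case: keep the old node only when it is the very same object."""
    return old if old is new else new


def count_executions(subqueries):
    """Each distinct instance would run once, so count distinct object identities."""
    return len({id(s) for s in subqueries})


first = ToySubquery("SELECT max(x) FROM t")
copy = ToySubquery("SELECT max(x) FROM t")  # equal to `first`, but a separate object

# Reuse attempt: try to point the second expression at the first instance.
after_equality_check = [first, replace_if_changed(copy, first)]
after_identity_check = [first, replace_by_identity(copy, first)]

# With the equality check the copy survives, so two instances remain.
runs_with_equality_check = count_executions(after_equality_check)
# With the identity check the copy is swapped out, so one instance remains.
runs_with_identity_check = count_executions(after_identity_check)
```

In this sketch the equality-based step leaves two separate instances in the plan, which corresponds to the subquery running twice. The identity-based variant is included only for contrast and is not meant to represent the change made in the PR.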