GitHub user mgaido91 commented on the issue:

    https://github.com/apache/spark/pull/22518
  
    @gengliangwang no, let me quote and explain the PR description. I am not 
sure how to improve it, but if you have suggestions I am happy to apply them. 
The main point of the PR is to address an issue which arises when:
    
    > When a `ExecSubqueryExpression` is copied
    
    Now the point is, can this condition happen? The answer is yes, and one 
situation in which this happens (as reported in the JIRA) is
    
    > when a filter containing a scalar subquery is pushed to a DataSource.
    
    So in the plan we have two `ExecSubqueryExpression`s, each with its own 
copy of the same `SubqueryExec`. The problem which arises in this situation is 
that:
    
    > `ReuseSubquery` becomes useless, as replacing the `SubqueryExec` is 
ignored since the new plan is equal to the previous one.
    
    So this results in the subquery being executed twice (as the two 
`SubqueryExec` instances are distinct objects, even though they are equal).
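
    To make the mechanism concrete, here is a minimal standalone sketch. It 
does not use Spark: `ToySubqueryExec`, `ToySubqueryExpression` and 
`transform` are hypothetical stand-ins, and the "keep the old node if the 
result is equal" check is my simplified assumption of how the replacement 
ends up being ignored.

```scala
object ReuseSketch {
  // Hypothetical stand-in for a physical subquery plan. Case-class equality
  // is structural, so a copy with the same fields compares equal.
  final case class ToySubqueryExec(name: String)

  // Hypothetical stand-in for an expression that holds a subquery plan.
  final case class ToySubqueryExpression(plan: ToySubqueryExec) {
    def withNewPlan(newPlan: ToySubqueryExec): ToySubqueryExpression =
      copy(plan = newPlan)
  }

  // Simplified assumption: a transformation keeps the old node whenever the
  // rule's result is equal to it, so an "equal" replacement is dropped.
  def transform(e: ToySubqueryExpression)(
      rule: ToySubqueryExpression => ToySubqueryExpression): ToySubqueryExpression = {
    val after = rule(e)
    if (after == e) e else after
  }

  // Returns (plans are equal, plans are the same instance) after the reuse attempt.
  def demo(): (Boolean, Boolean) = {
    val original = ToySubqueryExec("scalar-subquery")
    val copied = original.copy() // a new instance, equal to the original
    val pushed = ToySubqueryExpression(copied)

    // The reuse step tries to point the copied expression at the original plan.
    val rewritten = transform(pushed)(_.withNewPlan(original))

    (rewritten.plan == original, rewritten.plan eq original)
  }

  def main(args: Array[String]): Unit = {
    val (equal, sameInstance) = demo()
    println(s"equal: $equal, same instance: $sameInstance")
  }
}
```

    In this toy model the rewritten expression still points at the copy: the 
two plans are equal but are not the same instance, so anything keyed on 
instance identity (such as running the subquery) would happen once per copy.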
    



---
