GitHub user cloud-fan opened a pull request:

    https://github.com/apache/spark/pull/21230

    [SPARK-24172][SQL] we should not apply operator pushdown to data source v2 many times

    ## What changes were proposed in this pull request?
    
    In `PushDownOperatorsToDataSource`, we use `transformUp` to match
    `PhysicalOperation` and apply pushdown. This is problematic if there are
    multiple `Filter` and `Project` operators above the data source v2 relation.
    
    For example, consider the query plan:
    ```
    Project
      Filter
        DataSourceV2Relation
    ```
    
    The pattern match is triggered twice, so operator pushdown is applied
    twice. This is unnecessary; we can use `transformDown` instead so that
    pushdown is applied only once.
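
The difference in traversal order can be illustrated with a small toy model. This is only a sketch and not Spark code: `Relation`, `Unary`, `transform_up`, `transform_down`, and `make_pushdown_rule` are hypothetical names invented for the illustration, the rule only matches when at least one operator sits above the relation, and it assumes every matched operator can be pushed into the relation.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class Relation:
    """Leaf node standing in for a data source v2 relation (toy model)."""
    pushed: Tuple[str, ...] = ()


@dataclass(frozen=True)
class Unary:
    """A Filter or Project with a single child (toy model)."""
    kind: str
    child: object


def transform_up(plan, rule):
    # Rewrite the child first, then offer the rebuilt node to the rule.
    if isinstance(plan, Unary):
        plan = Unary(plan.kind, transform_up(plan.child, rule))
    return rule(plan)


def transform_down(plan, rule):
    # Offer the node to the rule first, then descend into the result's child.
    plan = rule(plan)
    if isinstance(plan, Unary):
        plan = Unary(plan.kind, transform_down(plan.child, rule))
    return plan


def make_pushdown_rule():
    calls = []  # one entry per successful pattern match

    def rule(plan):
        # Collect the chain of Filter/Project operators above a relation.
        ops, node = [], plan
        while isinstance(node, Unary):
            ops.append(node.kind)
            node = node.child
        if not ops or not isinstance(node, Relation):
            return plan  # no match, leave the node unchanged
        calls.append(ops)
        # Toy pushdown: fold every operator into the relation, innermost first.
        return Relation(node.pushed + tuple(reversed(ops)))

    return rule, calls


query = Unary("Project", Unary("Filter", Relation()))

up_rule, up_calls = make_pushdown_rule()
up_result = transform_up(query, up_rule)

down_rule, down_calls = make_pushdown_rule()
down_result = transform_down(query, down_rule)

print(len(up_calls), len(down_calls))  # prints "2 1"
```

In this toy, the bottom-up traversal matches once at `Filter` and again at `Project`, while the top-down traversal matches the whole chain in a single step; both produce the same final relation.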
    
    ## How was this patch tested?
    
    Existing tests.

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/cloud-fan/spark step2

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/spark/pull/21230.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #21230
    
----
commit e224f8a798ed30319efab386720c997227e1b421
Author: Wenchen Fan <wenchen@...>
Date:   2018-05-03T15:14:01Z

    we should not apply operator pushdown to data source v2 many times

----


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org
