GitHub user eatoncys opened a pull request:

    https://github.com/apache/spark/pull/22561

    [SPARK-25548][SQL] In the PruneFileSourcePartitions optimizer, replace the nonPartitionOps field with true in And(partitionOps, nonPartitionOps) so that partitions can be pruned

    ## What changes were proposed in this pull request?
    In the PruneFileSourcePartitions optimizer, partition files are not pruned when a query combines partition filters with non-partition filters. For example:
    
        sql("CREATE TABLE IF NOT EXISTS src_par (key INT, value STRING) partitioned by(p_d int) stored as parquet ")
        sql("insert overwrite table src_par partition(p_d=2) select 2 as key, '4' as value")
        sql("insert overwrite table src_par partition(p_d=3) select 3 as key, '4' as value")
        sql("insert overwrite table src_par partition(p_d=4) select 4 as key, '4' as value")
    
    Before this PR, the query below scans all partition files, although the partition **p_d=4** should be pruned:
    
        sql("select * from src_par where (p_d=2 and key=2) or (p_d=3 and key=3)").show
    
    After this PR, the partition **p_d=4** is pruned.
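    
    The idea can be illustrated with a minimal, self-contained sketch. It is not the code in this PR and does not use Spark's Catalyst classes: the `Expr` tree, `relax`, and `eval` below are hypothetical names chosen for illustration. It assumes the predicate contains only equality tests joined by AND/OR (no negation), in which case replacing every non-partition filter with true yields a weaker predicate that is still safe to use for selecting partitions.

```scala
// Hypothetical mini expression tree for illustration; NOT Spark's Catalyst API.
sealed trait Expr
case class Eq(column: String, value: Int) extends Expr
case class And(left: Expr, right: Expr) extends Expr
case class Or(left: Expr, right: Expr) extends Expr
case object True extends Expr

object PartitionFilterSketch {
  // Replace every filter on a non-partition column with True, then simplify.
  // Assumption: no negation, so the result is implied by the original predicate.
  def relax(e: Expr, partitionCols: Set[String]): Expr = e match {
    case Eq(c, _) if !partitionCols.contains(c) => True
    case leaf: Eq => leaf
    case True => True
    case And(l, r) =>
      (relax(l, partitionCols), relax(r, partitionCols)) match {
        case (True, x) => x // And(true, x) == x
        case (x, True) => x // And(x, true) == x
        case (x, y)    => And(x, y)
      }
    case Or(l, r) =>
      (relax(l, partitionCols), relax(r, partitionCols)) match {
        case (True, _) | (_, True) => True // Or(true, _) == true
        case (x, y)                => Or(x, y)
      }
  }

  // Evaluate a predicate against the partition values of one partition.
  def eval(e: Expr, row: Map[String, Int]): Boolean = e match {
    case Eq(c, v)  => row.get(c).contains(v)
    case And(l, r) => eval(l, row) && eval(r, row)
    case Or(l, r)  => eval(l, row) || eval(r, row)
    case True      => true
  }

  def main(args: Array[String]): Unit = {
    // (p_d=2 and key=2) or (p_d=3 and key=3)
    val filter = Or(And(Eq("p_d", 2), Eq("key", 2)), And(Eq("p_d", 3), Eq("key", 3)))
    val partitionFilter = relax(filter, Set("p_d"))
    val kept = List(2, 3, 4).filter(p => eval(partitionFilter, Map("p_d" -> p)))
    println(kept) // prints List(2, 3)
  }
}
```

    For the query above, the relaxed predicate is `p_d=2 or p_d=3`, so only the partitions **p_d=2** and **p_d=3** need to be listed. The non-partition filters on `key` still have to be applied to the rows that are read.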
    
    ## How was this patch tested?
    Existing tests.


You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/eatoncys/spark partitionFilter

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/spark/pull/22561.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #22561
    
----
commit 6acb460381c96fe71f807f94bb617f3928f41694
Author: 10129659 <chen.yanshan@...>
Date:   2018-09-27T01:04:20Z

    In the PruneFileSourcePartitions optimizer, replace the nonPartitionOps 
field with true in the And(partitionOps, nonPartitionOps) to make the partition 
can be pruned

----


---

---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]
