AngersZhuuuu opened a new pull request #29406:
URL: https://github.com/apache/spark/pull/29406


   ### What changes were proposed in this pull request?
   Partition filters have been partially pushed down since SPARK-28169. This PR
   applies the same idea to data filters: when a predicate mixes partition
   filters and data filters, the part that references only data columns can
   also be partially pushed down. For example:
   ```
   spark.sql(
     s"""
        |CREATE TABLE t(i INT, p STRING)
        |USING parquet
        |PARTITIONED BY (p)""".stripMargin)
   
   spark.range(0, 1000).selectExpr("id as col").createOrReplaceTempView("temp")
   for (part <- Seq(1, 2, 3, 4)) {
     spark.sql(s"""
            |INSERT OVERWRITE TABLE t PARTITION (p='$part')
            |SELECT col FROM temp""".stripMargin)
   }
   
   spark.sql("SELECT * FROM t WHERE (p = '1' AND i = 1) OR (p = '2' AND i = 2)").explain()
   ```
   
   For the query above, the data-only predicate ```i = 1 OR i = 2``` can also be pushed down.
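
   The extraction can be illustrated with a minimal, self-contained sketch. It uses a toy predicate model (`Pred`, `Eq`, `And`, `Or`) and a hypothetical helper `PartialPushdown.extract`; none of these are Spark's Catalyst classes or the code in this PR. The idea is that a conjunction can drop the side that references other columns, while a disjunction can only be kept if both sides yield a usable predicate.

   ```scala
   // Toy predicate model for illustration only (hypothetical, not Catalyst).
   sealed trait Pred
   case class Eq(column: String, value: String) extends Pred
   case class And(left: Pred, right: Pred) extends Pred
   case class Or(left: Pred, right: Pred) extends Pred

   object PartialPushdown {
     // Returns a predicate implied by `p` that references only `allowed`
     // columns, or None when no such predicate can be derived.
     def extract(p: Pred, allowed: Set[String]): Option[Pred] = p match {
       case And(l, r) =>
         // Dropping one side of an AND only weakens the filter, so it is safe.
         (extract(l, allowed), extract(r, allowed)) match {
           case (Some(a), Some(b)) => Some(And(a, b))
           case (Some(a), None)    => Some(a)
           case (None, Some(b))    => Some(b)
           case (None, None)       => None
         }
       case Or(l, r) =>
         // Both sides of an OR must contribute, otherwise rows could be lost.
         for {
           a <- extract(l, allowed)
           b <- extract(r, allowed)
         } yield Or(a, b)
       case e @ Eq(c, _) =>
         if (allowed.contains(c)) Some(e) else None
     }

     // (p = '1' AND i = 1) OR (p = '2' AND i = 2)
     val condition: Pred =
       Or(And(Eq("p", "1"), Eq("i", "1")), And(Eq("p", "2"), Eq("i", "2")))

     // Keep only the part that references the data column `i`.
     val dataFilter: Option[Pred] = extract(condition, Set("i"))
   }
   ```

   In this sketch, `PartialPushdown.dataFilter` is `Some(Or(Eq("i", "1"), Eq("i", "2")))`, which corresponds to ```i = 1 OR i = 2```. The extracted predicate is weaker than the original condition, so the full condition still has to be evaluated after the scan.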
   
   ### Why are the changes needed?
   To extract more data filters and push them down to FileSourceScanExec.
   
   ### Does this PR introduce _any_ user-facing change?
   No.
   
   
   ### How was this patch tested?
   Added unit tests.

