[GitHub] [spark] huaxingao opened a new pull request #33680: [MINOR][SQL] Not push down partition filter to ORCScan for DSv2

GitBox Sun, 08 Aug 2021 16:01:52 -0700


huaxingao opened a new pull request #33680:
URL: https://github.com/apache/spark/pull/33680



   ### What changes were proposed in this pull request?
   Do not push down partition filters to `ORCScan` for DSv2.
   
   
   ### Why are the changes needed?
   As far as I can tell, partition filters are only used for partition pruning and shouldn't be pushed down to `ORCScan`. DSv1 does not push partition filters down to `ORCScan` either: in the plan below, `PushedFilters` contains only the filters on `value`.
   ```
   == Physical Plan ==
   *(1) Filter (isnotnull(value#19) AND NOT (value#19 = a))
   +- *(1) ColumnarToRow
      +- FileScan orc [value#19,p1#20,p2#21] Batched: true, DataFilters: [isnotnull(value#19), NOT (value#19 = a)], Format: ORC, Location: InMemoryFileIndex(1 paths)[file:/private/var/folders/pt/_5f4sxy56x70dv9zpz032f0m0000gn/T/spark-c1..., PartitionFilters: [isnotnull(p1#20), isnotnull(p2#21), (p1#20 = 1), (p2#21 = 2)], PushedFilters: [IsNotNull(value), Not(EqualTo(value,a))], ReadSchema: struct<value:string>
   ```
   Likewise, DSv2 does not push down partition filters for Parquet:
   https://github.com/apache/spark/pull/30652
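
To illustrate the split described above, here is a minimal Python sketch of separating partition filters from data filters. It is a toy model, not Spark's implementation: the `Predicate` class and `split_filters` function are hypothetical names introduced for this example, and the rule that a predicate is a partition filter only when all of its referenced columns are partition columns is an assumption. The predicates mirror the ones in the DSv1 plan, where `p1` and `p2` are the partition columns.

```python
from dataclasses import dataclass
from typing import FrozenSet, List, Tuple


@dataclass(frozen=True)
class Predicate:
    """Toy filter (hypothetical): display text plus the columns it references."""
    text: str
    columns: FrozenSet[str]


def split_filters(
    predicates: List[Predicate], partition_columns: FrozenSet[str]
) -> Tuple[List[Predicate], List[Predicate]]:
    """Return (partition_filters, data_filters).

    Assumption: a predicate is a partition filter only if every column it
    references is a partition column; everything else is a data filter.
    """
    partition_filters: List[Predicate] = []
    data_filters: List[Predicate] = []
    for p in predicates:
        if p.columns and p.columns <= partition_columns:
            partition_filters.append(p)
        else:
            data_filters.append(p)
    return partition_filters, data_filters


# Predicates mirroring the physical plan in the PR description.
predicates = [
    Predicate("isnotnull(value)", frozenset({"value"})),
    Predicate("NOT (value = a)", frozenset({"value"})),
    Predicate("isnotnull(p1)", frozenset({"p1"})),
    Predicate("isnotnull(p2)", frozenset({"p2"})),
    Predicate("p1 = 1", frozenset({"p1"})),
    Predicate("p2 = 2", frozenset({"p2"})),
]

partition_filters, data_filters = split_filters(
    predicates, frozenset({"p1", "p2"})
)

# Only the data filters are candidates for pushdown to the file reader;
# the partition filters are used to prune directories before any file is read.
pushed = [p.text for p in data_filters]
pruning = [p.text for p in partition_filters]
```

In this model only the two filters on `value` end up in `pushed`, which matches the `PushedFilters` entry in the plan, while the four filters on `p1` and `p2` are kept for pruning.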
   
   ### Does this PR introduce _any_ user-facing change?
   No
   
   
   ### How was this patch tested?
   Existing test suites
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


