Hi,
Info about Data : The data is auto partitioned tpch 0.01 data. The second
filter is a non-partitioned column, so in the first case the 'OR' predicate
results in a full-table scan, while in the second case, partition pruning
takes effect.
The first case results in a hash agg and the second case in a streaming
agg. Any idea why?
1. explain plan for select distinct l_modline, l_moddate from
`tpch_multiple_partitions/lineitem_twopart` where l_moddate=date
'1992-01-01' or l_shipdate=date'1992-01-01';
+------+------+
| text | json |
+------+------+
| 00-00 Screen
00-01 Project(l_modline=[$0], l_moddate=[$1])
00-02 Project(l_modline=[$0], l_moddate=[$1])
00-03 HashAgg(group=[{0, 1}])
00-04 Project(l_modline=[$2], l_moddate=[$0])
00-05 SelectionVectorRemover
00-06 Filter(condition=[OR(=($0, 1992-01-01), =($1,
1992-01-01))])
00-07 Project(l_moddate=[$2], l_shipdate=[$1],
l_modline=[$0])
00-08 Scan..........
2. explain plan for select distinct l_modline, l_moddate from
`tpch_multiple_partitions/lineitem_twopart` where l_moddate=date
'1992-01-01' and l_shipdate=date'1992-01-01';
+------+------+
| text | json |
+------+------+
| 00-00 Screen
00-01 Project(l_modline=[$0], l_moddate=[$1])
00-02 Project(l_modline=[$0], l_moddate=[$1])
00-03 StreamAgg(group=[{0, 1}])
00-04 Sort(sort0=[$0], sort1=[$1], dir0=[ASC], dir1=[ASC])
00-05 Project(l_modline=[$2], l_moddate=[$0])
00-06 SelectionVectorRemover
00-07 Filter(condition=[AND(=($0, 1992-01-01), =($1,
1992-01-01))])
00-08 Project(l_moddate=[$2], l_shipdate=[$1],
l_modline=[$0])
00-09 Scan.....................
- Rahul