Zoltan Haindrich created HIVE-22811:
---------------------------------------
Summary: Statistics are not exploited in nested cases
Key: HIVE-22811
URL: https://issues.apache.org/jira/browse/HIVE-22811
Project: Hive
Issue Type: Improvement
Components: Statistics
Reporter: Zoltan Haindrich
The statsOptimizer is able to answer simple aggregate queries directly from column statistics (min, max, etc.) without scanning the table, for example:
{code}
(select max(id) from t t0)
{code}
However, the same optimization is not applied when that aggregate appears as a nested subquery, as in:
{code}
explain select * from u where u.id>(select max(id) from t t0);
{code}
The explain output shows that t is still scanned and max(id) is computed at runtime:
{code}
| Plan optimized by CBO. |
| |
| Vertex dependency in root stage |
| Reducer 3 <- Map 1 (BROADCAST_EDGE), Map 2 (CUSTOM_SIMPLE_EDGE) |
| |
| Stage-0 |
| Fetch Operator |
| limit:-1 |
| Stage-1 |
| Reducer 3 vectorized |
| File Output Operator [FS_31] |
| Select Operator [SEL_30] (rows=1 width=8) |
| Output:["_col0","_col1"] |
| Filter Operator [FIL_29] (rows=1 width=12) |
| predicate:(_col0 > _col2) |
| Map Join Operator [MAPJOIN_28] (rows=3 width=12) |
| Conds:(Inner),Output:["_col0","_col1","_col2"] |
| <-Map 1 [BROADCAST_EDGE] vectorized |
| BROADCAST [RS_25] |
| Select Operator [SEL_24] (rows=3 width=8) |
| Output:["_col0","_col1"] |
| Filter Operator [FIL_23] (rows=3 width=8) |
| predicate:id is not null |
| TableScan [TS_0] (rows=3 width=8) |
| default@u,u,Tbl:COMPLETE,Col:COMPLETE,Output:["id","cnt"] |
| <-Filter Operator [FIL_27] (rows=1 width=4) |
| predicate:_col0 is not null |
| Group By Operator [GBY_26] (rows=1 width=4) |
| Output:["_col0"],aggregations:["max(VALUE._col0)"] |
| <-Map 2 [CUSTOM_SIMPLE_EDGE] vectorized |
| PARTITION_ONLY_SHUFFLE [RS_22] |
| Group By Operator [GBY_21] (rows=1 width=4) |
| Output:["_col0"],aggregations:["max(id)"] |
| Select Operator [SEL_20] (rows=4 width=4) |
| Output:["id"] |
| TableScan [TS_3] (rows=4 width=4) |
| default@t,t0,Tbl:COMPLETE,Col:COMPLETE,Output:["id"] |
{code}
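A setup along the following lines should make the difference easy to reproduce. This is only a sketch: the column types and the inserted values are assumptions inferred from the plan above (u has columns id and cnt with 3 rows, t has column id with 4 rows), and whether the first explain is actually answered from metadata may depend on the Hive version and table type.
{code}
-- assumed schema, inferred from the plan; not taken from the original report
create table u (id int, cnt int);
create table t (id int);

-- arbitrary sample rows matching the row counts in the plan (3 and 4)
insert into u values (1, 10), (2, 20), (3, 30);
insert into t values (1), (2), (3), (4);

-- make sure table and column statistics are complete
analyze table u compute statistics for columns;
analyze table t compute statistics for columns;

-- allow simple aggregates to be answered from statistics
set hive.compute.query.using.stats=true;

-- simple case: expected to be served from statistics
explain select max(id) from t t0;

-- nested case: the subquery still produces a TableScan and Group By on t
explain select * from u where u.id>(select max(id) from t t0);
{code}
Ideally the nested case would behave as if the subquery had been replaced by the max(id) value already stored in the column statistics of t, leaving only the filter on u.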