Gopal V created HIVE-22163:
------------------------------

             Summary: CBO: Enabling CBO turns on stats estimation, even when the estimation is disabled
                 Key: HIVE-22163
                 URL: https://issues.apache.org/jira/browse/HIVE-22163
             Project: Hive
          Issue Type: Bug
            Reporter: Gopal V


{code}
-- Table with hand-set basic stats and no column stats.
create table claims(claim_rec_id bigint, claim_invoice_num string, typ_c int);
alter table claims update statistics set ('numRows'='1154941534','rawDataSize'='1135307527922');

-- Disable stats estimation.
set hive.stats.estimate=false;

explain extended select count(1) from claims where typ_c=3;

-- Change the NDV estimation percentage; this should have no effect while estimation is disabled.
set hive.stats.ndv.estimate.percent=5e-7;

explain extended select count(1) from claims where typ_c=3;
{code}

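Expected: with estimation disabled, the single equality filter should get the standard /2 estimate (half of the input rows). Instead, the first explain estimates the Filter Operator at 5 rows.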

{code}
'            Map Operator Tree:'
'                TableScan'
'                  alias: claims'
'                  filterExpr: (typ_c = 3) (type: boolean)'
'                  Statistics: Num rows: 1154941534 Data size: 4388777832 Basic stats: COMPLETE Column stats: NONE'
'                  GatherStats: false'
'                  Filter Operator'
'                    isSamplingPred: false'
'                    predicate: (typ_c = 3) (type: boolean)'
'                    Statistics: Num rows: 5 Data size: 19 Basic stats: COMPLETE Column stats: NONE'
{code}

Estimation is clearly still in effect, because changing hive.stats.ndv.estimate.percent changes the Filter Operator row count in the second explain:

{code}
'                  filterExpr: (typ_c = 3) (type: boolean)'
'                  Statistics: Num rows: 1154941534 Data size: 4388777832 Basic stats: COMPLETE Column stats: NONE'
'                  GatherStats: false'
'                  Filter Operator'
'                    isSamplingPred: false'
'                    predicate: (typ_c = 3) (type: boolean)'
'                    Statistics: Num rows: 230988307 Data size: 877755567 Basic stats: COMPLETE Column stats: NONE'
{code}
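
For reference, both row counts above are consistent with an NDV-based estimate of numRows / NDV, where NDV is taken as a percentage of numRows. The sketch below is only a back-of-the-envelope model that reproduces the two numbers; it is not Hive's actual code. The function name, the truncate-then-round behaviour, and the default of 20 for hive.stats.ndv.estimate.percent are assumptions.

```python
NUM_ROWS = 1154941534  # numRows set on the claims table


def estimated_filter_rows(num_rows, ndv_percent):
    """Hypothetical model: rows passing `col = const` ~= num_rows / NDV."""
    # Assumed: NDV is ndv_percent percent of the row count, truncated, at least 1.
    ndv = max(1, int(num_rows * ndv_percent / 100.0))
    # Assumed: the resulting row estimate is rounded to the nearest integer.
    return round(num_rows / ndv)


# Assumed default of 20 percent: matches the first plan.
print(estimated_filter_rows(NUM_ROWS, 20))    # 5
# After set hive.stats.ndv.estimate.percent=5e-7: matches the second plan.
print(estimated_filter_rows(NUM_ROWS, 5e-7))  # 230988307
# The expected "/2" heuristic with estimation disabled.
print(NUM_ROWS // 2)                          # 577470767
```

If this model is right, neither plan uses the /2 heuristic, which would have produced 577470767 rows.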


