Gopal V created HIVE-22163:
------------------------------

             Summary: CBO: Enabling CBO turns on stats estimation, even when the estimation is disabled
                 Key: HIVE-22163
                 URL: https://issues.apache.org/jira/browse/HIVE-22163
             Project: Hive
          Issue Type: Bug
            Reporter: Gopal V

{code}
create table claims(claim_rec_id bigint, claim_invoice_num string, typ_c int);
alter table claims update statistics set ('numRows'='1154941534','rawDataSize'='1135307527922');

set hive.stats.estimate=false;

explain extended select count(1) from claims where typ_c=3;

set hive.stats.ndv.estimate.percent=5e-7;

explain extended select count(1) from claims where typ_c=3;
{code}

With hive.stats.estimate=false, the single equality filter should get the standard /2 estimate (half of the TableScan row count), but the first explain reports 5 rows instead.

{code}
' Map Operator Tree:'
' TableScan'
' alias: claims'
' filterExpr: (typ_c = 3) (type: boolean)'
' Statistics: Num rows: 1154941534 Data size: 4388777832 Basic stats: COMPLETE Column stats: NONE'
' GatherStats: false'
' Filter Operator'
' isSamplingPred: false'
' predicate: (typ_c = 3) (type: boolean)'
' Statistics: Num rows: 5 Data size: 19 Basic stats: COMPLETE Column stats: NONE'
{code}

Estimation is still in effect: changing hive.stats.ndv.estimate.percent changes the Filter Operator row count in the second explain.

{code}
' filterExpr: (typ_c = 3) (type: boolean)'
' Statistics: Num rows: 1154941534 Data size: 4388777832 Basic stats: COMPLETE Column stats: NONE'
' GatherStats: false'
' Filter Operator'
' isSamplingPred: false'
' predicate: (typ_c = 3) (type: boolean)'
' Statistics: Num rows: 230988307 Data size: 877755567 Basic stats: COMPLETE Column stats: NONE'
{code}
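
The two row counts in the plans are consistent with an NDV-based estimate for the equality predicate. The sketch below is not Hive's code. It reproduces the reported numbers under these assumptions: the estimated NDV is numRows * percent / 100 truncated to an integer (minimum 1), the filter estimate is numRows / NDV rounded to the nearest integer, and hive.stats.ndv.estimate.percent defaults to 20. The function names are hypothetical.

```python
NUM_ROWS = 1154941534  # numRows set by the ALTER TABLE in the report


def estimated_ndv(num_rows: int, ndv_percent: float) -> int:
    """Assumed rule: NDV is a percentage of the row count, truncated, at least 1."""
    return max(1, int(num_rows * min(ndv_percent, 100.0) / 100.0))


def equality_filter_rows(num_rows: int, ndv: int) -> int:
    """Assumed rule: an equality predicate keeps numRows / NDV rows (rounded)."""
    return round(num_rows / ndv)


def fallback_filter_rows(num_rows: int) -> int:
    """The /2 estimate expected when no column statistics are used."""
    return num_rows // 2


# Assumed default of 20 percent: matches "Num rows: 5" in the first plan.
rows_default = equality_filter_rows(NUM_ROWS, estimated_ndv(NUM_ROWS, 20.0))

# With 5e-7 percent: matches "Num rows: 230988307" in the second plan.
rows_tiny = equality_filter_rows(NUM_ROWS, estimated_ndv(NUM_ROWS, 5e-7))

# What the report expects with hive.stats.estimate=false.
rows_expected = fallback_filter_rows(NUM_ROWS)

print(rows_default, rows_tiny, rows_expected)
```

Neither observed value equals the /2 result of 577470767, which fits the report's claim that estimation runs despite hive.stats.estimate=false.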