[
https://issues.apache.org/jira/browse/HIVE-8316?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Mostafa Mokhtar updated HIVE-8316:
----------------------------------
Description:
CBO underestimates selectivity from filter which consequently results in under
estimation throughout the plan.
{code}
8 rows, 0.0 cpu, 0.0 io}, id = 7808
HiveJoinRel(condition=[=($0, $12)],
joinType=[inner]): rowcount = 11459.928208333333, cumulative cost =
{5.50076555E8 rows, 0.0 cpu, 0.0 io}, id = 7426
HiveProjectRel(ss_item_sk=[$1],
ss_customer_sk=[$2], ss_cdemo_sk=[$3], ss_hdemo_sk=[$4], ss_addr_sk=[$5],
ss_store_sk=[$6], ss_promo_sk=[$7], ss_ticket_number=[$8],
ss_wholesale_cost=[$10], ss_list_price=[$11], ss_coupon_amt=[$18],
ss_sold_date_sk=[$22]): rowcount = 5.50076554E8, cumulative cost = {0.0 rows,
0.0 cpu, 0.0 io}, id = 893
HiveTableScanRel(table=[[tpcds_bin_partitioned_orc_200_orig.store_sales]]):
rowcount = 5.50076554E8, cumulative cost = {0}, id = 55
HiveProjectRel(i_item_sk=[$0],
i_current_price=[$5], i_color=[$17], i_product_name=[$21]): rowcount = 1.0,
cumulative cost = {0.0 rows, 0.0 cpu, 0.0 io}, id = 1163
HiveFilterRel(condition=[AND(in($17, 'maroon', 'burnished', 'dim', 'steel',
'navajo', 'chocolate'), between(false, $5, 35, +(35, 10)), between(false, $5,
+(35, 1), +(35, 15)))]): rowcount = 1.0, cumulative cost = {0.0 rows, 0.0 cpu,
0.0 io}, id = 1161
HiveTableScanRel(table=[[tpcds_bin_partitioned_orc_200_orig.item]]): rowcount =
48000.0, cumulative cost = {0}, id = 68
{code}
{code}
select count(*) from item where i_color in
('maroon','burnished','dim','steel','navajo','chocolate') and
i_current_price between 35 and 35 + 10 and
i_current_price between 35 + 1 and 35 + 15;
{code}
was:
CBO underestimates selectivity from filter which consequently results in under
estimation throughout the plan.
{code}
{code}
> CBO : cardinality estimation for filters is much lower than actual row count
> ----------------------------------------------------------------------------
>
> Key: HIVE-8316
> URL: https://issues.apache.org/jira/browse/HIVE-8316
> Project: Hive
> Issue Type: Bug
> Components: CBO
> Affects Versions: 0.14.0
> Reporter: Mostafa Mokhtar
> Assignee: Harish Butani
> Fix For: 0.14.0
>
>
> CBO underestimates selectivity from filter which consequently results in
> under estimation throughout the plan.
> {code}
> 8 rows, 0.0 cpu, 0.0 io}, id = 7808
> HiveJoinRel(condition=[=($0,
> $12)], joinType=[inner]): rowcount = 11459.928208333333, cumulative cost =
> {5.50076555E8 rows, 0.0 cpu, 0.0 io}, id = 7426
> HiveProjectRel(ss_item_sk=[$1],
> ss_customer_sk=[$2], ss_cdemo_sk=[$3], ss_hdemo_sk=[$4], ss_addr_sk=[$5],
> ss_store_sk=[$6], ss_promo_sk=[$7], ss_ticket_number=[$8],
> ss_wholesale_cost=[$10], ss_list_price=[$11], ss_coupon_amt=[$18],
> ss_sold_date_sk=[$22]): rowcount = 5.50076554E8, cumulative cost = {0.0 rows,
> 0.0 cpu, 0.0 io}, id = 893
>
> HiveTableScanRel(table=[[tpcds_bin_partitioned_orc_200_orig.store_sales]]):
> rowcount = 5.50076554E8, cumulative cost = {0}, id = 55
> HiveProjectRel(i_item_sk=[$0],
> i_current_price=[$5], i_color=[$17], i_product_name=[$21]): rowcount = 1.0,
> cumulative cost = {0.0 rows, 0.0 cpu, 0.0 io}, id = 1163
>
> HiveFilterRel(condition=[AND(in($17, 'maroon', 'burnished', 'dim', 'steel',
> 'navajo', 'chocolate'), between(false, $5, 35, +(35, 10)), between(false, $5,
> +(35, 1), +(35, 15)))]): rowcount = 1.0, cumulative cost = {0.0 rows, 0.0
> cpu, 0.0 io}, id = 1161
>
> HiveTableScanRel(table=[[tpcds_bin_partitioned_orc_200_orig.item]]): rowcount
> = 48000.0, cumulative cost = {0}, id = 68
> {code}
> {code}
> select count(*) from item where i_color in
> ('maroon','burnished','dim','steel','navajo','chocolate') and
> i_current_price between 35 and 35 + 10 and
> i_current_price between 35 + 1 and 35 + 15;
> {code}
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)