[ https://issues.apache.org/jira/browse/HIVE-20366?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16577725#comment-16577725 ]
Vineet Garg commented on HIVE-20366:
------------------------------------
bq. There are few commented out asserts. Is that required?
Yes. The assumption those asserts checked no longer holds, so they had to be removed.
> TPC-DS query78 stats estimates are off for is null filter
> ---------------------------------------------------------
>
> Key: HIVE-20366
> URL: https://issues.apache.org/jira/browse/HIVE-20366
> Project: Hive
> Issue Type: Bug
> Components: Query Planning
> Reporter: Vineet Garg
> Assignee: Vineet Garg
> Priority: Major
> Attachments: HIVE-20366.1.patch, HIVE-20366.2.patch,
> HIVE-20366.3.patch
>
>
> In Query 78, there are left outer joins (LOJ) between pairs of fact tables:
> store_sales LOJ store_returns, catalog_sales LOJ catalog_returns, and
> web_sales LOJ web_returns. The is null filter on top of each of these joins
> is estimated at only a single row, so the result is BROADCAST, which causes
> hash table memory errors.
> {code}
> Reducer 12 |
> | Execution mode: vectorized, llap |
> | Reduce Operator Tree: |
> +----------------------------------------------------+
> | Explain |
> +----------------------------------------------------+
> | Map Join Operator |
> | condition map: |
> | Left Outer Join 0 to 1 |
> | keys: |
> | 0 KEY.reducesinkkey0 (type: bigint), KEY.reducesinkkey1 (type: bigint) |
> | 1 KEY.reducesinkkey0 (type: bigint), KEY.reducesinkkey1 (type: bigint) |
> | outputColumnNames: _col0, _col1, _col3, _col4, _col5, _col6, _col8 |
> | input vertices: |
> | 1 Map 14 |
> | Statistics: Num rows: 10282477384 Data size: 534184867432 Basic stats: COMPLETE Column stats: COMPLETE |
> | Filter Operator |
> | predicate: _col8 is null (type: boolean) |
> | * Statistics: Num rows: 1* Data size: 52 Basic stats: COMPLETE Column stats: COMPLETE |
> {code}
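One plausible way an estimate like the one above can collapse to a single row is if the is null selectivity is taken only from the null count in the right-side column statistics, ignoring the nulls that the left outer join itself produces for unmatched rows. The sketch below illustrates that arithmetic only. It is not Hive's implementation: the class, method names, and the unmatched-row input are hypothetical, and the small numbers in the second call are made up for illustration.

```java
public final class IsNullEstimateSketch {

    // Hypothetical naive estimate: scale the join output by the null
    // fraction recorded in the source column statistics. If the source
    // column has no nulls, the estimate bottoms out at 1 row.
    static long naiveIsNullRows(long joinRows, long numNulls, long sourceRows) {
        double nullFraction = sourceRows == 0 ? 0.0 : (double) numNulls / sourceRows;
        return Math.max(1L, Math.round(joinRows * nullFraction));
    }

    // Hypothetical outer-join-aware estimate: every unmatched left row is
    // padded with nulls on the right side, so it passes "col is null".
    // Matched rows still contribute according to the source null fraction.
    static long outerJoinAwareIsNullRows(long joinRows, long unmatchedLeftRows,
                                         long numNulls, long sourceRows) {
        double nullFraction = sourceRows == 0 ? 0.0 : (double) numNulls / sourceRows;
        long matchedRows = Math.max(0L, joinRows - unmatchedLeftRows);
        long fromStats = Math.round(matchedRows * nullFraction);
        long estimate = unmatchedLeftRows + fromStats;
        return Math.max(1L, Math.min(joinRows, estimate));
    }

    public static void main(String[] args) {
        // Join row count taken from the plan above; a null count of 0 on
        // the right-side column is an assumption made for illustration.
        long joinRows = 10_282_477_384L;
        System.out.println(naiveIsNullRows(joinRows, 0L, 1_000L));

        // Made-up small example: 10,000 join rows, 1,000 of them unmatched.
        System.out.println(outerJoinAwareIsNullRows(10_000L, 1_000L, 0L, 500L));
    }
}
```

With a row estimate of 1, a planner has every reason to pick a broadcast (map) join, which matches the symptom described in the issue. Any estimate that accounts for unmatched rows would stay well above that threshold.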