[ https://issues.apache.org/jira/browse/HIVE-10233?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14597126#comment-14597126 ]
Mostafa Mokhtar commented on HIVE-10233:
----------------------------------------
[~hagleitn] [~vikram.dixit] [~wzheng]
It would make sense to annotate the explain plan with the memory assigned to
each hash table, as in:
{code}
DagName: jenkins_20150622122318_f770d9ab-0ddd-43cf-b950-32f38e2f17e1:1
Vertices:
  Map 1
      Map Operator Tree:
          TableScan
            alias: store_sales
            filterExpr: (ss_item_sk is not null and ss_sold_date_sk BETWEEN 2450816 AND 2451500) (type: boolean)
            Statistics: Num rows: 28878719387 Data size: 2405805439460 Basic stats: COMPLETE Column stats: COMPLETE
            Filter Operator
              predicate: ss_item_sk is not null (type: boolean)
              Statistics: Num rows: 28878719387 Data size: 231029755096 Basic stats: COMPLETE Column stats: COMPLETE
              Map Join Operator
                condition map:
                     Inner Join 0 to 1
                keys:
                  0 ss_item_sk (type: int)
                  1 i_item_sk (type: int)
                outputColumnNames: _col1, _col22, _col26
                input vertices:
                  1 Map 3
                Statistics: Num rows: 28878719387 Data size: 346544632644 Basic stats: COMPLETE Column stats: COMPLETE
                HybridGraceHashJoin: true Hash table memory : 1848000 Bytes
                Filter Operator
                  predicate: ((_col26 = _col1) and _col22 BETWEEN 2450816 AND 2451500) (type: boolean)
                  Statistics: Num rows: 7219679846 Data size: 86636158152 Basic stats: COMPLETE Column stats: COMPLETE
                  Select Operator
                    Statistics: Num rows: 7219679846 Data size: 86636158152 Basic stats: COMPLETE Column stats: COMPLETE
                    Group By Operator
                      aggregations: count()
                      mode: hash
                      outputColumnNames: _col0
                      Statistics: Num rows: 1 Data size: 8 Basic stats: COMPLETE Column stats: COMPLETE
                      Reduce Output Operator
                        sort order:
                        Statistics: Num rows: 1 Data size: 8 Basic stats: COMPLETE Column stats: COMPLETE
                        value expressions: _col0 (type: bigint)
      Execution mode: vectorized
  Map 3
      Map Operator Tree:
          TableScan
            alias: item
            filterExpr: i_item_sk is not null (type: boolean)
            Statistics: Num rows: 462000 Data size: 663560457 Basic stats: COMPLETE Column stats: COMPLETE
            Filter Operator
              predicate: i_item_sk is not null (type: boolean)
              Statistics: Num rows: 462000 Data size: 1848000 Basic stats: COMPLETE Column stats: COMPLETE
              Reduce Output Operator
                key expressions: i_item_sk (type: int)
                sort order: +
                Map-reduce partition columns: i_item_sk (type: int)
                Statistics: Num rows: 462000 Data size: 1848000 Basic stats: COMPLETE Column stats: COMPLETE
      Execution mode: vectorized
{code}
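As a rough illustration of why the annotation would be useful, tooling could sum the per-hash-table memory straight out of the plan text. This is only a sketch: it assumes the annotation is printed exactly as in the example above ("Hash table memory : <n> Bytes"), which is a proposed format and not something Hive emits today, and the class name HashTableMemoryScanner is made up for the example.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of Hive: scans explain-plan text for the
// proposed "Hash table memory : <n> Bytes" annotation and totals the values.
public class HashTableMemoryScanner {

    // Assumed annotation format, taken from the example plan above.
    // \s* also tolerates a line wrap between the number and "Bytes".
    private static final Pattern ANNOTATION =
        Pattern.compile("Hash table memory\\s*:\\s*(\\d+)\\s*Bytes");

    /** Returns the sum of all hash table memory annotations in the plan. */
    public static long totalBytes(String explainPlan) {
        long total = 0;
        Matcher m = ANNOTATION.matcher(explainPlan);
        while (m.find()) {
            total += Long.parseLong(m.group(1));
        }
        return total;
    }

    public static void main(String[] args) {
        String plan = String.join("\n",
            "Map Join Operator",
            "  HybridGraceHashJoin: true Hash table memory : 1848000 Bytes");
        System.out.println(totalBytes(plan));
    }
}
```

In the example plan the annotated value, 1848000 bytes, equals the estimated data size of the small side (Map 3: 462000 rows with Data size 1848000), so a total like this could be compared directly against the statistics in the same plan.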
> Hive on tez: memory manager for grace hash join
> -----------------------------------------------
>
> Key: HIVE-10233
> URL: https://issues.apache.org/jira/browse/HIVE-10233
> Project: Hive
> Issue Type: Bug
> Components: Tez
> Affects Versions: llap, 2.0.0
> Reporter: Vikram Dixit K
> Assignee: Gunther Hagleitner
> Attachments: HIVE-10233-WIP-2.patch, HIVE-10233-WIP-3.patch,
> HIVE-10233-WIP-4.patch, HIVE-10233-WIP-5.patch, HIVE-10233-WIP-6.patch,
> HIVE-10233-WIP-7.patch, HIVE-10233-WIP-8.patch, HIVE-10233.08.patch,
> HIVE-10233.09.patch, HIVE-10233.10.patch, HIVE-10233.11.patch
>
>
> We need a memory manager in llap/tez to manage the usage of memory across
> threads.