[ 
https://issues.apache.org/jira/browse/HIVE-28396?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kokila N resolved HIVE-28396.
-----------------------------
    Fix Version/s: NA
       Resolution: Duplicate

Fixed in HIVE-26473

> Increase Tez container & AM memory size to address OOM issues
> -------------------------------------------------------------
>
>                 Key: HIVE-28396
>                 URL: https://issues.apache.org/jira/browse/HIVE-28396
>             Project: Hive
>          Issue Type: Sub-task
>            Reporter: tanishqchugh
>            Assignee: tanishqchugh
>            Priority: Major
>             Fix For: NA
>
>         Attachments: groupBy_3_map_multi_distinct_proof_128_vs_256.png, 
> mm_all_2.png, mm_all_3.png
>
>
> Increase the Tez container & AM memory sizes to 256mb to address the OOM 
> issues that keep occurring.
> The increase in sizes causes 3 qtests to fail: mm_all.q, mm_dp.q and 
> groupby3_map_multi_distinct.q. Analyze the failures and fix them.
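>
> As a minimal sketch of the change (assuming the two properties below, 
> {{hive.tez.container.size}} and {{tez.am.resource.memory.mb}}, are the ones 
> raised; the patch may instead set them in the qtest configuration files):
> {code:java}
> import org.apache.hadoop.hive.conf.HiveConf;
> 
> // Sketch only: raise the Tez task container and AM memory to 256 MB.
> public class ContainerMemorySketch {
>     public static void main(String[] args) {
>         HiveConf conf = new HiveConf();
>         conf.setInt("hive.tez.container.size", 256);   // per-task Tez container size (MB)
>         conf.setInt("tez.am.resource.memory.mb", 256); // Tez ApplicationMaster memory (MB)
>         System.out.println(conf.get("hive.tez.container.size"));
>     }
> }
> {code}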
> Analysis:
> *groupby3_map_multi_distinct Analysis:*
> We increased the Tez container size from 128mb to 256mb to address OOM 
> errors. This qtest sets the property {{set hive.map.aggr=true;}}. When the 
> property is true, a check named {{checkMapSideAggregation()}} runs first to 
> verify that there is enough memory available to hold the hash table required 
> for the aggregation. The space allotted for this aggregation is half of the 
> container size: half of 128mb was not enough to hold the generated table, 
> but half of 256mb is, so map-side aggregation now happens. With this 
> aggregation, hashes are generated and stored only for the 307 distinct rows 
> out of 500, and duplicate rows are mapped to these hashes. Hence the change 
> in statistics, which is expected.
> !groupBy_3_map_multi_distinct_proof_128_vs_256.png|width=847,height=320!
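>
> As a rough sketch of the check described above (the class and constant below 
> are illustrative only, not the actual Hive internals; the 0.5 fraction 
> mirrors the "half of container size" budget, in the spirit of 
> {{hive.map.aggr.hash.percentmemory}}):
> {code:java}
> // Illustrative sketch, not the real Hive code path: decide whether the
> // map-side aggregation hash table fits into its share of the container.
> public class MapAggrMemoryCheckSketch {
> 
>     // Assumed fraction of container memory reserved for the hash table.
>     private static final double HASH_MEMORY_FRACTION = 0.5;
> 
>     static boolean canAggregateMapSide(long containerSizeMb, long estimatedHashTableMb) {
>         long budgetMb = (long) (containerSizeMb * HASH_MEMORY_FRACTION);
>         return estimatedHashTableMb <= budgetMb;
>     }
> 
>     public static void main(String[] args) {
>         long hashTableMb = 100; // hypothetical hash table size for the 307 distinct keys
>         System.out.println("128mb container: " + canAggregateMapSide(128, hashTableMb)); // false -> no map-side aggr
>         System.out.println("256mb container: " + canAggregateMapSide(256, hashTableMb)); // true  -> map-side aggr
>     }
> }
> {code}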
> *mm_all Analysis:*
> We increased the Tez container size from 128mb to 256mb to address OOM 
> errors. The total memory allocated to the LLAP daemon is 4096mb, so with 
> each container increased to 256mb, the available slots = 4096/256 = 16.
> With the larger container size, the split size increases and each task has 
> more resources. As a result, each task processes a larger number of rows, 
> and each task corresponds to one Hive side file. The total amount of data 
> processed remains the same; only the amount processed by each task 
> increases. Thus only 16 Hive files are generated.
>   !mm_all_2.png|width=841,height=255!
> !mm_all_3.png|width=845,height=507!
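>
> The slot arithmetic above, written out as a worked example (this is only the 
> division from the test configuration, not LLAP's actual scheduling code):
> {code:java}
> // Worked example: with a fixed LLAP daemon memory pool, doubling the
> // per-container size halves the number of concurrent task slots, so fewer
> // (larger) tasks run and each writes one Hive side file.
> public class LlapSlotMathSketch {
>     public static void main(String[] args) {
>         int daemonMemoryMb = 4096;  // LLAP daemon memory in the test config
>         int containerSizeMb = 256;  // increased Tez container size
>         int slots = daemonMemoryMb / containerSizeMb;
>         System.out.println("slots = " + daemonMemoryMb + "/" + containerSizeMb + " = " + slots); // 16
>     }
> }
> {code}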
> *mm_dp Analysis:*
> The diff in this test case arises only from a difference in the random 
> numbers generated. The random number generation depends not only on the seed 
> value passed but also on the task resources. As above, the task resources 
> have increased and each task now processes a larger number of rows, drawing 
> more random numbers, so the random numbers generated differ between 
> container sizes of 128mb and 256mb.
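>
> A small illustration of why the outputs diverge (the per-task row counts 
> below are made up; only the mechanism matters): with the same per-task 
> seeds, changing how many rows each task processes changes the stream of 
> random values each task emits.
> {code:java}
> import java.util.Random;
> 
> // Illustration only: same seeding scheme, different rows-per-task split,
> // therefore a different set of generated random values per task.
> public class SplitDependentRandomSketch {
> 
>     static void simulate(String label, int[] rowsPerTask) {
>         for (int task = 0; task < rowsPerTask.length; task++) {
>             Random rng = new Random(42 + task); // seed derived the same way in both runs
>             StringBuilder values = new StringBuilder();
>             for (int row = 0; row < rowsPerTask[task]; row++) {
>                 values.append(rng.nextInt(100)).append(' ');
>             }
>             System.out.println(label + " task " + task + ": " + values);
>         }
>     }
> 
>     public static void main(String[] args) {
>         simulate("128mb run", new int[] {3, 3, 3, 3}); // more, smaller tasks (hypothetical)
>         simulate("256mb run", new int[] {6, 6});       // fewer, larger tasks (hypothetical)
>     }
> }
> {code}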



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
