[ 
https://issues.apache.org/jira/browse/TEZ-3202?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15238669#comment-15238669
 ] 

Jonathan Eagles commented on TEZ-3202:
--------------------------------------

Created a dedicated object to track the key value buffers.

[~rajesh.balamohan], can you have a look at the new patch? It drastically 
reduces the memory overhead per segment, which allows a much greater reduction 
in parallelism without the risk of OOMs.
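
For readers following along, here is a minimal sketch of the idea, not the 
code from the patch. The class name KeyBufferRef and its methods are 
hypothetical. It assumes the segment only needs a reference to the reader's 
byte array plus a position and a length, so it avoids the scratch arrays that 
DataInputStream allocates. The overheadBytes helper only multiplies out the 
384 byte figure from the description.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical illustration only; this is NOT the class added by TEZ-3202.
// It tracks where the current key lives inside a buffer owned by the reader,
// without allocating any scratch arrays of its own.
final class KeyBufferRef {
    private byte[] data;   // backing buffer, shared with the reader (not copied)
    private int position;  // offset of the current key within data
    private int length;    // length of the current key in bytes

    // Point this reference at a new key inside the reader's buffer.
    void reset(byte[] data, int position, int length) {
        if (data == null) {
            throw new NullPointerException("data");
        }
        if (position < 0 || length < 0 || position > data.length - length) {
            throw new IndexOutOfBoundsException(
                "position=" + position + ", length=" + length
                    + ", buffer=" + data.length);
        }
        this.data = data;
        this.position = position;
        this.length = length;
    }

    byte[] getData() { return data; }
    int getPosition() { return position; }
    int getLength() { return length; }

    // Total bytes spent on per-segment overhead for a given segment count.
    static long overheadBytes(long segments, long bytesPerSegment) {
        return Math.multiplyExact(segments, bytesPerSegment);
    }

    public static void main(String[] args) {
        byte[] readerBuffer = "key1value1".getBytes(StandardCharsets.US_ASCII);

        KeyBufferRef key = new KeyBufferRef();
        key.reset(readerBuffer, 0, 4);
        // prints "key1"
        System.out.println(new String(key.getData(), key.getPosition(),
            key.getLength(), StandardCharsets.US_ASCII));

        // 384 bytes per segment across one million segments: prints 384000000
        System.out.println(overheadBytes(1_000_000L, 384L));
    }
}
```

At one million segments the 384 byte overhead alone comes to 384,000,000 
bytes (about 366 MiB), which is why the per-segment cost matters here.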

> Reduce the memory need for jobs with high number of segments
> ------------------------------------------------------------
>
>                 Key: TEZ-3202
>                 URL: https://issues.apache.org/jira/browse/TEZ-3202
>             Project: Apache Tez
>          Issue Type: Bug
>            Reporter: Jonathan Eagles
>            Assignee: Jonathan Eagles
>         Attachments: TEZ-3202.1.patch, TEZ-3202.2.patch, TEZ-3202.3.patch, 
> TEZ-3202.4-branch-0.7.patch, TEZ-3202.4.patch
>
>
> Segment has a 'key' member that holds accounting information for the reader's 
> current key buffer, position, and length. There is a 384 byte overhead per 
> segment since the accounting is done with the DataInputBuffer class, which 
> derives from DataInputStream, which has underlying byte[80] and char[80] among 
> its significant pieces. This jira aims to reduce the overhead per segment.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
