[jira] [Commented] (TEZ-3732) Reduce Object size of InputAttemptIdentifier and MapOutput for large jobs

Jonathan Eagles (JIRA) Tue, 16 May 2017 22:57:41 -0700

    [ 
https://issues.apache.org/jira/browse/TEZ-3732?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16013576#comment-16013576
 ]


Jonathan Eagles commented on TEZ-3732:
--------------------------------------

[~gopalv], it will be interesting to print the JOL internals before and after 
for this case. I'll try to post real numbers tomorrow. If the results are 
positive, I'll extend this jira to FetchedInput to cover the unordered case as 
well.

> Reduce Object size of InputAttemptIdentifier and MapOutput for large jobs
> -------------------------------------------------------------------------
>
>                 Key: TEZ-3732
>                 URL: https://issues.apache.org/jira/browse/TEZ-3732
>             Project: Apache Tez
>          Issue Type: Bug
>            Reporter: Jonathan Eagles
>            Assignee: Jonathan Eagles
>         Attachments: TEZ-3732.1.patch
>
>
> Objects in 64-bit Java are 12 bytes + member size, aligned up to 8 bytes.
> InputAttemptIdentifier -> 33 bytes, which gets aligned up to 40 bytes.
> This class is just one byte over the 32-byte alignment boundary. Reducing the 
> object size by one byte can save 8 bytes per object.
> This is ~8 MB of savings for 1,000,000 inputs and ~80 MB of savings for tasks 
> with 10,000,000 inputs to fetch (yes, this is a real job).
> MapOutput -> 45 bytes, which gets aligned up to 48 bytes.
> This class can be split into sub-classes so that no sub-class pays the object 
> size cost of fields used only by the other sub-classes.
> Wait, InMemory and DiskDirect -> 32 bytes
> Disk -> 40 bytes
> Total savings are harder to account for, but greater than in the above case.
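The alignment arithmetic in the description can be checked with a minimal sketch. The class and method names below (ObjectSizeSketch, alignUp, savedBytes) are hypothetical and not part of Tez, and the 8-byte alignment is taken from the description rather than measured; actual layouts depend on the JVM and its flags, so a tool such as JOL is still needed for real numbers.

```java
public class ObjectSizeSketch {
    // Assumption from the issue description: objects are padded to 8 bytes.
    static final long ALIGNMENT = 8;

    // Round an unaligned object size up to the next multiple of ALIGNMENT.
    static long alignUp(long unalignedBytes) {
        return (unalignedBytes + ALIGNMENT - 1) / ALIGNMENT * ALIGNMENT;
    }

    // Heap bytes saved across `count` objects when the unaligned size
    // shrinks from `before` to `after`.
    static long savedBytes(long before, long after, long count) {
        return (alignUp(before) - alignUp(after)) * count;
    }

    public static void main(String[] args) {
        System.out.println(alignUp(33)); // InputAttemptIdentifier today: 40
        System.out.println(alignUp(32)); // one byte smaller: 32
        System.out.println(alignUp(45)); // MapOutput today: 48
        System.out.println(savedBytes(33, 32, 1_000_000L));  // 8000000
        System.out.println(savedBytes(33, 32, 10_000_000L)); // 80000000
    }
}
```

Under these assumptions, trimming a single byte from a 33-byte object frees a whole 8-byte slot, which is where the ~8 MB and ~80 MB figures come from.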



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
