[ 
https://issues.apache.org/jira/browse/MAPREDUCE-3511?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13182592#comment-13182592
 ] 

Robert Joseph Evans commented on MAPREDUCE-3511:
------------------------------------------------

I have a few comments about the patch.  Overall it looks very good.
 
If the old counter API is going to be the long term fix then perhaps we should 
not mark is as deprecated any more.  If not then can we file a new JIRA to 
update the new Counters to be as space efficient as the old ones. 

This is a minor performance improvement, but in many of the HistoryEvents there 
is a datum, that has had most of its fields replicated inside the event itself. 
 I guess this is because they are in slightly different formats now.  But we 
still create an instance of datum in the constructor, and always populate its 
fields inside getDatum.  These event objects are imutable, outside of the 
setDatum method that is only to be used when de-serializing the event.  
getDatum, however, tends to be called repeatedly to pull out individual fields 
from the datum.  I would prefer to see the datum start out as null, and only 
have its fields set if it is not null, inside getDatum.

Why was TokenCache.java modified at all? It does not seem to be related to this 
JIRA.

You added a TODO in CompletedTask.java and CompletedJob.java {code}// TODO: 
Make sure.{code} Did you make sure yet? if so please delete the TODO.

Also good catch on TestHsWebServicesTasks.java, TestAMWebServicesAttempts.java, 
TestHsWebServicesTasks.java and the others.  I also don't think we need any 
more tests because, all we are doing is reducing memory usage, which is very 
hard to write a unit test for. 

Inside JobHistoryEventHandler.java you added in {code}// TODO: Only 
job-counters is enough? How about the myriad clones in this code-path.{code} is 
this TODO still needed?

Like I said before, overall the patch looks really good and I only have some 
minor comments, thanks for doing this.
                
> Counters occupy a good part of AM heap
> --------------------------------------
>
>                 Key: MAPREDUCE-3511
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-3511
>             Project: Hadoop Map/Reduce
>          Issue Type: Sub-task
>          Components: mr-am, mrv2
>    Affects Versions: 0.23.0
>            Reporter: Siddharth Seth
>            Assignee: Vinod Kumar Vavilapalli
>            Priority: Blocker
>             Fix For: 0.23.1
>
>         Attachments: MAPREDUCE-3511-20120107.1.txt
>
>
> Per task counters seem to be occupying a good part of an AMs heap. Looks like 
> more than 50% of what's used by a TaskAttemptImpl object.
> This could be optimized by interning strings or possibly using mrv1 counters 
> which are optimized. Currently counters are converted from mrv1 to mrv2 
> format for in memory storage. The conversion could be delayed till it's 
> actually required for RPC transfers.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

Reply via email to