[ 
https://issues.apache.org/jira/browse/TEZ-2119?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14326802#comment-14326802
 ] 

Rohini Palaniswamy commented on TEZ-2119:
-----------------------------------------

[~sseth],
     Getting older :(. You were right: 
https://issues.apache.org/jira/browse/TEZ-987 is the one. Perhaps use this 
issue for the counters and the other one to implement the APIs? I was recently 
running a Pig script on a very small queue that can run only 76 containers at 
a time. I was hoping the same 76 containers would be reused over and over for 
the 33K tasks, but new containers were launched often. I wonder whether that 
was because of data locality. I have not read the AM logs yet, since they are 
~350 MB and I was feeling too lazy to dig in. Is there something else that 
could be added for this? Swimlanes may give some idea of container reuse, but 
I am thinking more of being able to mine this later from job stats populated 
in Hive tables.

> Counter for launched containers
> -------------------------------
>
>                 Key: TEZ-2119
>                 URL: https://issues.apache.org/jira/browse/TEZ-2119
>             Project: Apache Tez
>          Issue Type: Improvement
>            Reporter: Rohini Palaniswamy
>
> org.apache.tez.common.counters.DAGCounter
>                 NUM_SUCCEEDED_TASKS=32976
>                 TOTAL_LAUNCHED_TASKS=32976
>                 OTHER_LOCAL_TASKS=2
>                 DATA_LOCAL_TASKS=9147
>                 RACK_LOCAL_TASKS=23761
> It would be very nice to have a TOTAL_LAUNCHED_CONTAINERS counter added to 
> this. The difference between TOTAL_LAUNCHED_CONTAINERS and 
> TOTAL_LAUNCHED_TASKS should make it easy to see how much container reuse is 
> happening. Right now that is very hard to find out.
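
To make the proposed comparison concrete, here is a minimal sketch of how the
two counters could be turned into a reuse figure. TOTAL_LAUNCHED_CONTAINERS is
the counter requested in this issue, not one that appears in the output above,
and the function name and the container count of 76 are illustrative
assumptions (76 is the queue capacity from the comment, i.e. the best case,
not a measured value). The sketch also assumes every launched container runs
at least one task.

```python
def container_reuse(launched_tasks: int, launched_containers: int) -> dict:
    """Summarize container reuse from two DAG-level counter values.

    Hypothetical helper: assumes each launched container ran at least one
    task, so every task beyond the first in a container counts as a reuse.
    """
    if launched_tasks < 0 or launched_containers <= 0:
        raise ValueError("need launched_tasks >= 0 and launched_containers > 0")
    # Tasks that did not need a fresh container launch.
    reused = max(launched_tasks - launched_containers, 0)
    return {
        "tasks_reusing_a_container": reused,
        "tasks_per_container": launched_tasks / launched_containers,
    }


# TOTAL_LAUNCHED_TASKS as reported in the counters quoted in this issue.
LAUNCHED_TASKS = 32976
# Assumed best case: only the 76 containers the queue can hold were launched.
IDEAL_CONTAINERS = 76

best_case = container_reuse(LAUNCHED_TASKS, IDEAL_CONTAINERS)
print(best_case)
```

With a real TOTAL_LAUNCHED_CONTAINERS value in place of the assumed 76, a
tasks-per-container figure far below the best case would point to frequent new
container launches, which is the situation described in the comment.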



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
