[ 
https://issues.apache.org/jira/browse/TEZ-1610?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14240414#comment-14240414
 ] 

Rajesh Balamohan commented on TEZ-1610:
---------------------------------------

[~hitesh] - "SHUFFLE_TIME_AS_PERCENTAGE" would give the "real" time spent by 
fetcher threads pulling data over the wire.  However, once it is rolled up at 
the vertex or DAG level, the counter becomes misleading because the 
percentages get aggregated across tasks (a value of 1000% or 2000% would be 
confusing).  That was the reason for having FETCHER_TIME_MILLIS and 
FETCHER_TIME_MILLIS_IN_SHUFFLE (these need a better naming convention).  With 
this pair, it is possible to express the total time available to all the 
fetcher threads and the cumulative fraction of that time the threads spent 
doing real data transfer.  By comparing the task-level counters 
(FETCHER_TIME_MILLIS_IN_SHUFFLE and FETCHER_TIME_MILLIS), it is possible to 
determine whether shuffle is slow due to the network or not.
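
To illustrate the rollup problem, here is a minimal sketch (not the patch 
itself).  It assumes each task reports the two proposed counters as plain 
millisecond values; the class and method names are hypothetical and do not 
exist in Tez.  It compares summing per-task percentages with summing the two 
millis counters and dividing afterwards.

```java
final class FetcherTimeRollup {
    // Each row is one task: {FETCHER_TIME_MILLIS, FETCHER_TIME_MILLIS_IN_SHUFFLE}.
    // The layout and all names in this class are assumptions for illustration.

    /** Fraction of available fetcher time spent on real data transfer. */
    static double transferFraction(long fetcherTimeMillis, long inShuffleMillis) {
        if (fetcherTimeMillis <= 0) {
            return 0.0; // no fetcher time recorded, nothing to report
        }
        return (double) inShuffleMillis / fetcherTimeMillis;
    }

    /** What a per-task percentage counter would show if summed during rollup. */
    static long summedPercentages(long[][] tasks) {
        long sum = 0;
        for (long[] t : tasks) {
            sum += Math.round(100.0 * transferFraction(t[0], t[1]));
        }
        return sum;
    }

    /** Rollup that stays meaningful: sum both millis counters, then divide. */
    static double rolledUpFraction(long[][] tasks) {
        long total = 0;
        long inShuffle = 0;
        for (long[] t : tasks) {
            total += t[0];
            inShuffle += t[1];
        }
        return transferFraction(total, inShuffle);
    }

    public static void main(String[] args) {
        long[][] tasks = { {10_000, 8_000}, {10_000, 9_000}, {10_000, 7_000} };
        System.out.println("summed percentages: " + summedPercentages(tasks) + "%");
        System.out.println("rolled-up fraction: " + rolledUpFraction(tasks));
    }
}
```

With the three sample tasks above, the summed percentage comes out as 240%, 
while the ratio of the summed millis counters stays within 0..1.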

For further analysis, the user might need to make use of 
LAST_SHUFFLE_EVENT_TIMESTAMP:
- Time taken for shuffle = (FETCHER_TIME_MILLIS / numFetchers)
- Absolute timestamp for shuffle finish = (task start timestamp in taskCounter 
+ time taken for shuffle)
- If the shuffle phase is slow and the last shuffle event is close to the 
shuffle finish timestamp, the user can conclude that it was slow due to the 
source.
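
The three steps above can be sketched as follows.  This is only an 
illustration under stated assumptions: the counter values are passed in as 
plain longs, the class and method names are hypothetical, and toleranceMillis 
is an invented threshold for "close to", which the comment does not define.

```java
final class ShuffleSlownessSketch {
    /** Time taken for shuffle = FETCHER_TIME_MILLIS / numFetchers. */
    static long shuffleDurationMillis(long fetcherTimeMillis, int numFetchers) {
        if (numFetchers <= 0) {
            throw new IllegalArgumentException("numFetchers must be positive");
        }
        return fetcherTimeMillis / numFetchers;
    }

    /** Absolute shuffle finish = task start timestamp + time taken for shuffle. */
    static long shuffleFinishTimestamp(long taskStartMillis,
                                       long fetcherTimeMillis,
                                       int numFetchers) {
        return taskStartMillis + shuffleDurationMillis(fetcherTimeMillis, numFetchers);
    }

    /**
     * True when the last shuffle event arrived close to the shuffle finish,
     * which points at a slow source. toleranceMillis is a hypothetical
     * threshold chosen by the caller.
     */
    static boolean slowDueToSource(long lastShuffleEventTimestamp,
                                   long shuffleFinishTimestamp,
                                   long toleranceMillis) {
        return Math.abs(shuffleFinishTimestamp - lastShuffleEventTimestamp)
                <= toleranceMillis;
    }

    public static void main(String[] args) {
        long finish = shuffleFinishTimestamp(1_000_000L, 40_000L, 4);
        System.out.println("shuffle finish timestamp: " + finish);
        System.out.println("slow due to source: "
                + slowDueToSource(1_009_500L, finish, 1_000L));
    }
}
```

This check only makes sense per task, since it relies on that task's own 
start timestamp and fetcher count.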

Aggregating this at the vertex level can still lead to issues, so we need to 
document that the values may be inaccurate when viewed at the vertex/DAG level 
(ideally we would have a set of counters that are visible only at the task 
level; I am not sure whether we can do that as of today).

An alternative is to add more information to the existing logs, but then users 
would have to parse the logs to answer questions like:
1. Is the shuffle phase slow due to the network?
2. Is it slow because the last event arrived late (due to the source being 
slow)?
3. Is it slow due to merge issues or memory pressure?

> additional task counters for fetchers
> -------------------------------------
>
>                 Key: TEZ-1610
>                 URL: https://issues.apache.org/jira/browse/TEZ-1610
>             Project: Apache Tez
>          Issue Type: Bug
>            Reporter: Rajesh Balamohan
>            Assignee: Rajesh Balamohan
>         Attachments: TEZ-1610.1.patch, TEZ-1610.2.patch
>
>
> - ShuffleFinishTime (per source)
> - Merge time (depending on broadcast/scatter-gather shuffle)
> This would be helpful in determining when shuffle started/ended for different 
> sources in a task.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
