[ https://issues.apache.org/jira/browse/TEZ-2888?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14955935#comment-14955935 ]
Bikas Saha commented on TEZ-2888:
---------------------------------
1) Fixes bad results when the AM crashes and the DAG end time is not recorded.
We now rebase to the last attempt finish time when drawing the SVG.
2) Changes straggler identification to look only at the time spent after the
last data event was received. This should prevent attempts from looking like
stragglers just because they were launched too early and spent most of their
time waiting.
3) Calculates the number of running tasks over time along the job timeline.
Currently it is used to estimate the actual observed max concurrency in the
job. In the absence of any definitive values from YARN, this helps provide
some insight into what actually happened in terms of concurrency. The max
concurrency value is used to update some wave-based heuristics for allocation
overhead diagnostics. The concurrency time series could also be used for other
things, e.g. drawing it along with the critical path to aid visual debugging.
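
As a rough illustration of points 2 and 3 (not the code in the patch), the
sketch below derives a running-task count over time from attempt start and
finish times with a simple sweep, and computes the time an attempt spent after
its last data event. The function names, the (start, finish) tuple shape and
the millisecond units are assumptions made for this example only.

```python
from typing import Iterable, List, Optional, Tuple

# Hypothetical shape: (start_ms, finish_ms) of one task attempt.
Interval = Tuple[int, int]


def concurrency_series(attempts: Iterable[Interval]) -> List[Tuple[int, int]]:
    """Return (time, running_count) points, one per distinct event time."""
    events = []
    for start, finish in attempts:
        events.append((start, 1))    # attempt starts running
        events.append((finish, -1))  # attempt stops running
    # Tuples sort by time first; at equal times -1 sorts before +1, so a
    # finish is applied before a start that happens at the same instant.
    events.sort()
    series: List[Tuple[int, int]] = []
    running = 0
    for time, delta in events:
        running += delta
        if series and series[-1][0] == time:
            series[-1] = (time, running)  # keep only the final count per time
        else:
            series.append((time, running))
    return series


def max_concurrency(attempts: Iterable[Interval]) -> int:
    """Largest number of attempts observed running at the same time."""
    return max((count for _, count in concurrency_series(attempts)), default=0)


def time_after_last_data_event(start: int, finish: int,
                               last_data_event: Optional[int] = None) -> int:
    """Duration to compare when looking for stragglers.

    Falls back to the full run time when no data event was recorded
    (an assumption of this sketch).
    """
    begin = start if last_data_event is None else max(start, last_data_event)
    return max(0, finish - begin)


attempts = [(0, 10), (5, 15), (10, 20)]
print(concurrency_series(attempts))
print(max_concurrency(attempts))
print(time_after_last_data_event(0, 100, last_data_event=90))
```

With this measure, an attempt that ran from 0 to 100 but received its last
input at 90 is compared on 10 units of time rather than 100, so an early
launch alone does not make it look like a straggler.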
[~rajesh.balamohan] please review. Thanks!
> Make critical path calculation resilient to AM crash
> ----------------------------------------------------
>
> Key: TEZ-2888
> URL: https://issues.apache.org/jira/browse/TEZ-2888
> Project: Apache Tez
> Issue Type: Sub-task
> Reporter: Bikas Saha
> Assignee: Bikas Saha
> Attachments: TEZ-2888.1.patch
>
>