[
https://issues.apache.org/jira/browse/TEZ-2692?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14662256#comment-14662256
]
Bikas Saha commented on TEZ-2692:
---------------------------------
There is Guava usage, but there is no Guava dependency in the pom file?
We need a test case that parses both the SimpleHistory log and the ATS log and
verifies that the two resulting DAGInfo objects are the same. Otherwise we
cannot be sure that things won't break going forward.
Should we fix SimpleHistory logging to make sure that the reader does not have
workarounds like these? Or is the workaround for an organizational difference
between ATS format and SimpleHistory format that cannot be fixed in the product?
{code}
+ long totalTime = vertexInfo.getLastTaskFinishTimeInterval() - vertexInfo
+     .getFirstTaskStartTimeInterval();
{code}
Why are we subtracting time intervals? If the first and last tasks run for
equal durations, then totalTime == 0. Is that the intention? If not, should we
be doing lastTaskFinishTime() - firstTaskStartTime() instead?
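To make the concern concrete, here is a minimal sketch. It assumes that an "interval" means how long a task ran, which may not be what the patch's getters actually return. The class, method names and numbers are hypothetical and are not part of the Tez API.

```java
public class TotalTimeSketch {

    // Assumption: both arguments are durations in milliseconds.
    // Subtracting them compares run lengths, not wall-clock span.
    static long durationDifference(long firstTaskDurationMs, long lastTaskDurationMs) {
        return lastTaskDurationMs - firstTaskDurationMs;
    }

    // Both arguments are absolute timestamps in milliseconds.
    // This is the wall-clock span covered by the vertex's tasks.
    static long span(long firstTaskStartTimeMs, long lastTaskFinishTimeMs) {
        return lastTaskFinishTimeMs - firstTaskStartTimeMs;
    }
}
```

With a first task that starts at 1000 and runs for 4000 ms, and a last task that starts at 20000 and also runs for 4000 ms, durationDifference(4000, 4000) is 0 while span(1000, 24000) is 23000.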
TaskConcurrency could be computed by collecting the start and stop events of
all attempts that actually ran, sorting them by time, and then walking the
sorted list: increment a counter for every start event and decrement it for
every stop event. This would create 2X points in the timeline (where X is the
number of attempts that actually ran), instead of relying on the current
artificial 5 second boundary, which may be too small or too large depending on
the job. Thoughts?
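A rough sketch of that event walk follows, as an illustration rather than a proposed patch. The Attempt and Point classes are hypothetical stand-ins and not the parser's real types. Processing stop events before start events at the same timestamp is also an assumption, made so that back-to-back attempts are not counted as overlapping.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

public class TaskConcurrencySketch {

    /** Hypothetical stand-in for an attempt that actually ran (times in millis). */
    static final class Attempt {
        final long startTime;
        final long finishTime;
        Attempt(long startTime, long finishTime) {
            this.startTime = startTime;
            this.finishTime = finishTime;
        }
    }

    /** One point on the concurrency timeline. */
    static final class Point {
        final long time;
        final int running;
        Point(long time, int running) {
            this.time = time;
            this.running = running;
        }
    }

    /** Returns 2X timeline points for X attempts. */
    static List<Point> timeline(List<Attempt> attempts) {
        // Each attempt contributes two events: {time, +1} at start, {time, -1} at stop.
        List<long[]> events = new ArrayList<>();
        for (Attempt a : attempts) {
            events.add(new long[] {a.startTime, 1});
            events.add(new long[] {a.finishTime, -1});
        }
        // Sort by time; on ties, stops (-1) sort before starts (+1).
        events.sort(Comparator.<long[]>comparingLong(e -> e[0])
                .thenComparingLong(e -> e[1]));

        List<Point> points = new ArrayList<>();
        int running = 0;
        for (long[] e : events) {
            running += (int) e[1];
            points.add(new Point(e[0], running));
        }
        return points;
    }

    /** Highest number of attempts running at the same time. */
    static int peak(List<Point> points) {
        int max = 0;
        for (Point p : points) {
            max = Math.max(max, p.running);
        }
        return max;
    }
}
```

For attempts running over (0, 10), (5, 15) and (10, 20), timeline returns six points with running counts 1, 2, 1, 2, 1, 0, so peak is 2. The sort dominates the cost at O(X log X), and the resolution follows the events themselves rather than a fixed bucket size.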
> bugfixes & enhancements related to job parser and analyzer
> ----------------------------------------------------------
>
> Key: TEZ-2692
> URL: https://issues.apache.org/jira/browse/TEZ-2692
> Project: Apache Tez
> Issue Type: Bug
> Reporter: Rajesh Balamohan
> Assignee: Rajesh Balamohan
> Attachments: TEZ-2692.1.patch
>
>
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)