[
https://issues.apache.org/jira/browse/MAPREDUCE-6892?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16123608#comment-16123608
]
Haibo Chen commented on MAPREDUCE-6892:
---------------------------------------
Thanks [~pbacsko] for updating the patch! A few more comments:
1) In JobFinishedEvent, can we add code in getDatum() and setDatum() to handle
the newly added fields?
2) In UnparsedJob, let's return -1 to be consistent with getCompletedMaps() and
getCompletedReduce(). Similarly for PartialJob, let's also return -1 to
indicate the info is not available.
3) JobImpl.getCompletedMaps() returns successMap + killedMap + FailedMap,
whereas CompletedJob.getCompletedMaps() returns only successMap.
Let's do the same in CompletedJob.getCompletedMaps() as well as in
CompletedJob.getCompletedReduce().
4) In Job20LineHistoryEventEmitter, how much work is it to also parse the
failed/killed map/reducer counters (I am not familiar with this code)? I am
OK with leaving it out if it is too much.
5) Not an issue with this patch, but let's also set killed/failed counters in
JobHistoryParser.handleJobFinishedEvent().
6) CompletedJob.getKillReduces() should return (int) jobInfo.getKilledReduces();
7) Rename JobSummary.getNumFinishedMaps() to getNumSucceededMaps(). Also, let's
add summary.setNumKilled[Map/Reduce] in TestJobSummary.before() as well.
Can you look into the test failure and fix it if possible?
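
To illustrate points 2) and 3), here is a minimal Java sketch. The types in it (CompletedCountSketch, TaskCounts, JobView, CompletedJobView, UnparsedJobView) and the UNAVAILABLE constant are hypothetical stand-ins, not the actual Hadoop classes, and the field names are assumptions. It only shows the intended behaviour: the "completed" count covers succeeded, failed and killed tasks, and a job whose history has not been parsed reports -1.

```java
// Illustrative only: these types are hypothetical stand-ins, not the Hadoop classes.
class CompletedCountSketch {

    /** Sentinel meaning "this information is not available". */
    static final int UNAVAILABLE = -1;

    /** Per-outcome map task counts, e.g. as read from a parsed jhist file. */
    static final class TaskCounts {
        final int succeeded;
        final int failed;
        final int killed;

        TaskCounts(int succeeded, int failed, int killed) {
            this.succeeded = succeeded;
            this.failed = failed;
            this.killed = killed;
        }
    }

    /** Small subset of a job interface, reduced to the map-side getters. */
    interface JobView {
        int getCompletedMaps();
        int getFailedMaps();
        int getKilledMaps();
    }

    /** A job whose history was parsed: all counts are known. */
    static final class CompletedJobView implements JobView {
        private final TaskCounts maps;

        CompletedJobView(TaskCounts maps) {
            this.maps = maps;
        }

        @Override
        public int getCompletedMaps() {
            // Completed = every task that reached a terminal state.
            return maps.succeeded + maps.failed + maps.killed;
        }

        @Override
        public int getFailedMaps() {
            return maps.failed;
        }

        @Override
        public int getKilledMaps() {
            return maps.killed;
        }
    }

    /** A job whose history was not parsed: every count is reported as -1. */
    static final class UnparsedJobView implements JobView {
        @Override
        public int getCompletedMaps() {
            return UNAVAILABLE;
        }

        @Override
        public int getFailedMaps() {
            return UNAVAILABLE;
        }

        @Override
        public int getKilledMaps() {
            return UNAVAILABLE;
        }
    }

    public static void main(String[] args) {
        JobView done = new CompletedJobView(new TaskCounts(8, 1, 1));
        JobView unparsed = new UnparsedJobView();
        System.out.println(done.getCompletedMaps());     // prints 10
        System.out.println(unparsed.getCompletedMaps()); // prints -1
    }
}
```

The reduce-side getters would presumably follow the same pattern.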
> Issues with the count of failed/killed tasks in the jhist file
> --------------------------------------------------------------
>
> Key: MAPREDUCE-6892
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-6892
> Project: Hadoop Map/Reduce
> Issue Type: Bug
> Components: client, jobhistoryserver
> Reporter: Peter Bacsko
> Assignee: Peter Bacsko
> Attachments: MAPREDUCE-6892-001.patch, MAPREDUCE-6892-002.PATCH,
> MAPREDUCE-6892-003.patch
>
>
> Recently we encountered some issues with the value of failed tasks. After
> parsing the jhist file, {{JobInfo.getFailedMaps()}} returned 0, but actually
> there were failures.
> Another minor thing is that you cannot get the number of killed tasks
> (although this can be calculated).
> The root cause is that {{JobUnsuccessfulCompletionEvent}} contains only the
> successful map/reduce task counts. The number of failed (or killed) tasks is
> not stored.
--
This message was sent by Atlassian JIRA
(v6.4.14#64029)