[ https://issues.apache.org/jira/browse/MAPREDUCE-6892?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16123608#comment-16123608 ]
Haibo Chen commented on MAPREDUCE-6892:
---------------------------------------

Thanks [~pbacsko] for updating the patch! A few more comments:

1) In JobFinishedEvent, can we add code in getDatum() and setDatum() to handle the newly added fields?
2) In UnparsedJob, let's return -1 to be consistent with getCompletedMaps() and getCompletedReduce(). Similarly for PartialJob, let's also return -1 to indicate that the info is not available.
3) JobImpl.getCompletedMaps() returns successMap + killedMap + FailedMap, whereas CompletedJob.getCompletedMaps() returns only successMap. Let's do the same in CompletedJob.getCompletedMaps() as well as in CompletedJob.getCompletedReduce().
4) In Job20LineHistoryEventEmitter, how much work is it to also parse the failed/killed map/reducer counters (I am not familiar with this code)? I am OK to leave it if it is too much.
5) Not an issue with this patch, but let's also set the killed/failed counters in JobHistoryParser.handleJobFinishedEvent().
6) CompletedJob.getKillReduces() should return (int) jobInfo.getKilledReduces();
7) Rename JobSummary.getNumFinishedMaps() to getNumSucceededMaps(). Also, let's add summary.setNumKilled[Map/Reduce] in TestJobSummary.before() as well.

Can you look into the test failure and fix it if possible?

> Issues with the count of failed/killed tasks in the jhist file
> --------------------------------------------------------------
>
>                 Key: MAPREDUCE-6892
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-6892
>             Project: Hadoop Map/Reduce
>          Issue Type: Bug
>          Components: client, jobhistoryserver
>            Reporter: Peter Bacsko
>            Assignee: Peter Bacsko
>         Attachments: MAPREDUCE-6892-001.patch, MAPREDUCE-6892-002.PATCH,
> MAPREDUCE-6892-003.patch
>
>
> Recently we encountered some issues with the value of failed tasks. After
> parsing the jhist file, {{JobInfo.getFailedMaps()}} returned 0, but actually
> there were failures.
> Another minor thing is that you cannot get the number of killed tasks
> (although this can be calculated).
> The root cause is that {{JobUnsuccessfulCompletionEvent}} contains only the
> successful map/reduce task counts. The numbers of failed (or killed) tasks
> are not stored.
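
As a rough illustration of review comments 2 and 3 above, the sketch below shows a "completed" count computed as succeeded + failed + killed, and a -1 sentinel returned when a count is not available. The class TaskCountSketch and all of its members are hypothetical names made up for this example; they are not the Hadoop classes discussed in the comment, and the actual patch may be structured differently.

```java
// Illustrative sketch only: TaskCountSketch and its members are hypothetical
// names, not classes or methods from the Hadoop code base.
class TaskCountSketch {
    // Sentinel meaning "count not available" (review comment 2).
    static final int UNAVAILABLE = -1;

    // Completed tasks counted as succeeded + failed + killed (review comment 3).
    static int completed(int succeeded, int failed, int killed) {
        return succeeded + failed + killed;
    }

    // A job view without parsed history reports the sentinel instead of 0,
    // so callers can tell "no failures" apart from "unknown".
    static int failedOrUnavailable(Integer parsedFailed) {
        return parsedFailed == null ? UNAVAILABLE : parsedFailed;
    }

    public static void main(String[] args) {
        System.out.println(completed(8, 1, 1));
        System.out.println(failedOrUnavailable(null));
        System.out.println(failedOrUnavailable(0));
    }
}
```

Returning -1 rather than 0 for an unknown count avoids the situation described in the issue, where a reported 0 could not be distinguished from a count that was never stored.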