[
https://issues.apache.org/jira/browse/MAPREDUCE-6892?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16116729#comment-16116729
]
Haibo Chen edited comment on MAPREDUCE-6892 at 8/7/17 3:18 PM:
---------------------------------------------------------------
bq. Shall we go ahead and do the rename where it's necessary?
I think yes for all source code references whenever the field name does not
correspond to what it really represents, but not for the schema, for
compatibility purposes. For the schema, if possible, we can add a comment there
for clarity.
bq. I'm not exactly sure what it parses but I doubt that would contain strings
like KILLED_MAPS or KILLED_REDUCES.
-1 in this context means unavailable or unknown. Job20LineHistoryEventEmitter
is where we generate some of the events. I guess it really does not matter, so
let's keep -1s.
bq. That was a copy-paste from an existing testcase
testHistoryParsingForFailedAttempts(). You sure we don't need that?
MRAppWithHistoryWithFailedAttempt always kills the 1st map task and fails the
1st reduce task. We want to test whether the failed/killed/succeeded task
counters are correct. The number of failed/killed task attempts is not directly
related to those counters, and in addition, the number of attempts each task
can have is configurable and is not necessarily 2. Thus, I suggested getting
rid of noOffailedAttempts and adding an assertion on the number of succeeded
tasks.
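
To illustrate why the test should assert on task counters rather than on attempt counts, here is a minimal, self-contained sketch. It does not use any Hadoop classes: TaskCounterSketch, Task, sampleJob and the counting helpers are hypothetical names made up for this example, and the model simply assumes that a failed task uses up all of its allowed attempts.

```java
import java.util.ArrayList;
import java.util.List;

public class TaskCounterSketch {
    enum State { SUCCEEDED, FAILED, KILLED }

    // Hypothetical stand-in for a finished task: its final state and the
    // number of attempts that failed before it reached that state.
    static final class Task {
        final State finalState;
        final int failedAttempts;

        Task(State finalState, int failedAttempts) {
            this.finalState = finalState;
            this.failedAttempts = failedAttempts;
        }
    }

    // Simplified model of a job with one failed, one killed and two
    // succeeded tasks. Assumption: the failed task uses up all of its
    // allowed attempts, so its failed-attempt count equals maxAttempts.
    static List<Task> sampleJob(int maxAttempts) {
        List<Task> tasks = new ArrayList<>();
        tasks.add(new Task(State.FAILED, maxAttempts));
        tasks.add(new Task(State.KILLED, 0));
        tasks.add(new Task(State.SUCCEEDED, 0));
        tasks.add(new Task(State.SUCCEEDED, 0));
        return tasks;
    }

    // Task-level counter: how many tasks ended in the given state.
    static long countTasks(List<Task> tasks, State state) {
        return tasks.stream().filter(t -> t.finalState == state).count();
    }

    // Attempt-level counter: total failed attempts across all tasks.
    static int countFailedAttempts(List<Task> tasks) {
        return tasks.stream().mapToInt(t -> t.failedAttempts).sum();
    }

    public static void main(String[] args) {
        for (int maxAttempts : new int[] {2, 4}) {
            List<Task> job = sampleJob(maxAttempts);
            System.out.println("maxAttempts=" + maxAttempts
                + " failedTasks=" + countTasks(job, State.FAILED)
                + " killedTasks=" + countTasks(job, State.KILLED)
                + " succeededTasks=" + countTasks(job, State.SUCCEEDED)
                + " failedAttempts=" + countFailedAttempts(job));
        }
    }
}
```

In this model the task counters stay the same for both settings, while the failed-attempt total follows the configured limit, so an assertion that hard-codes the number of failed attempts would be tied to one particular configuration.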
> Issues with the count of failed/killed tasks in the jhist file
> --------------------------------------------------------------
>
> Key: MAPREDUCE-6892
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-6892
> Project: Hadoop Map/Reduce
> Issue Type: Bug
> Components: client, jobhistoryserver
> Reporter: Peter Bacsko
> Assignee: Peter Bacsko
> Attachments: MAPREDUCE-6892-001.patch, MAPREDUCE-6892-002.PATCH
>
>
> Recently we encountered some issues with the value of failed tasks. After
> parsing the jhist file, {{JobInfo.getFailedMaps()}} returned 0, but actually
> there were failures.
> Another minor thing is that you cannot get the number of killed tasks
> (although this can be calculated).
> The root cause is that {{JobUnsuccessfulCompletionEvent}} contains only the
> successful map/reduce task counts. The number of failed (or killed) tasks is
> not stored.
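
The description notes that the number of killed tasks can be calculated. One way this might be done is sketched below, assuming every task ends in exactly one of the succeeded, failed or killed states. KilledTaskCountSketch and derivedKilled are hypothetical names for illustration, not part of the Hadoop API, and -1 is used to mean "unavailable or unknown", as in the comment above.

```java
public class KilledTaskCountSketch {
    // -1 means "unavailable or unknown", following the convention
    // mentioned in the discussion.
    static final int UNKNOWN = -1;

    // Derives the killed-task count from the other counters.
    // Assumption: each task ends as exactly one of succeeded, failed
    // or killed, so the three counts add up to the total.
    static int derivedKilled(int total, int succeeded, int failed) {
        if (total < 0 || succeeded < 0 || failed < 0) {
            return UNKNOWN; // cannot derive anything from an unknown input
        }
        return total - succeeded - failed;
    }

    public static void main(String[] args) {
        System.out.println(derivedKilled(10, 7, 2));
        System.out.println(derivedKilled(10, UNKNOWN, 2));
    }
}
```

The calculation only works when the failed count is itself available, which is why the missing failed-task count is the more important gap.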