Maysam Yabandeh created MAPREDUCE-5954:
------------------------------------------
Summary: Optional exclusion of counters from getTaskReports
Key: MAPREDUCE-5954
URL: https://issues.apache.org/jira/browse/MAPREDUCE-5954
Project: Hadoop Map/Reduce
Issue Type: Improvement
Reporter: Maysam Yabandeh
Assignee: Maysam Yabandeh
MRClientService.getTaskReports returns the set of map or reduce tasks along with
their counters, which are quite large. For big jobs, the response could be as
large as 0.5 GB. This has a negative impact on both the MRAppMaster and the
monitoring tool that invokes getTaskReports. This problem has led Pig users to
disable getTaskReports entirely for big jobs:
https://issues.apache.org/jira/browse/PIG-4043
Many monitoring tools, including ours, do not need the task counters when
invoking getTaskReports. Pig does not use the task counters either. Here
are the usages of task reports in Pig:
{code}
protected void getErrorMessages(TaskReport reports[], String type,
String msgs[] = reports[i].getDiagnostics();
if (HadoopShims.isJobFailed(reports[i])) {
{code}
and
{code}
protected long computeTimeSpent(TaskReport[] taskReports) {
long timeSpent = 0;
for (TaskReport r : taskReports) {
timeSpent += (r.getFinishTime() - r.getStartTime());
}
return timeSpent;
}
{code}
GetTaskReportsRequest can be augmented with an optional boolean with which the
monitoring tool can request that the counters be excluded from the response.
The change is small, yet it would make many existing monitoring tools more
efficient.
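
A minimal sketch of the idea, not the actual patch. The classes below
(ExcludeCountersSketch, Report, Request) and the excludeCounters field are
hypothetical stand-ins and are not the Hadoop TaskReport or
GetTaskReportsRequest types. The sketch only shows that a flag defaulting to
false keeps the current behavior, while setting it drops the counters and
preserves the fields Pig reads (start and finish times):

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class ExcludeCountersSketch {

    /** Hypothetical stand-in for a task report; NOT the Hadoop TaskReport class. */
    static final class Report {
        final String taskId;
        final long startTime;
        final long finishTime;
        final Map<String, Long> counters; // the potentially large part of the response

        Report(String taskId, long startTime, long finishTime, Map<String, Long> counters) {
            this.taskId = taskId;
            this.startTime = startTime;
            this.finishTime = finishTime;
            this.counters = counters;
        }
    }

    /** Hypothetical request carrying the proposed optional flag (assumed name). */
    static final class Request {
        final boolean excludeCounters;

        /** Default request: counters are included, as they are today. */
        Request() {
            this(false);
        }

        Request(boolean excludeCounters) {
            this.excludeCounters = excludeCounters;
        }
    }

    /** Returns the reports, replacing counters with an empty map when asked to. */
    static List<Report> getTaskReports(List<Report> all, Request request) {
        if (!request.excludeCounters) {
            return all; // unchanged behavior for existing callers
        }
        List<Report> slim = new ArrayList<>(all.size());
        for (Report r : all) {
            slim.add(new Report(r.taskId, r.startTime, r.finishTime,
                    Collections.<String, Long>emptyMap()));
        }
        return slim;
    }

    /** Same computation as the Pig snippet above; it needs no counters. */
    static long computeTimeSpent(List<Report> reports) {
        long timeSpent = 0;
        for (Report r : reports) {
            timeSpent += (r.finishTime - r.startTime);
        }
        return timeSpent;
    }

    public static void main(String[] args) {
        Map<String, Long> counters = new HashMap<>();
        counters.put("HYPOTHETICAL_COUNTER", 42L); // made-up counter name
        List<Report> all = new ArrayList<>();
        all.add(new Report("task_1", 100L, 250L, counters));
        all.add(new Report("task_2", 120L, 300L, counters));

        List<Report> slim = getTaskReports(all, new Request(true));
        System.out.println("counters in slim report: " + slim.get(0).counters.size());
        System.out.println("time spent: " + computeTimeSpent(slim));
    }
}
```

Making the flag optional with a false default means clients that do not set
it keep receiving counters, so the change would stay backward compatible.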
--
This message was sent by Atlassian JIRA
(v6.2#6252)