[ http://issues.apache.org/jira/browse/HADOOP-92?page=comments#action_12371457 ]
Doug Cutting commented on HADOOP-92:
------------------------------------

Another approach to this would be to expose this through the JobClient API. Job-related events can be reported to the job client. Events can be queued in the job tracker and the JobClient can retrieve them as it polls for job status. Then the JobClient can decide where to log them. By default they can be logged to standard error.

The events I think one might care about are:
 - task start (task_id, type & host)
 - task completion (task_id)
 - task failure (task_id, error message)

The jobtracker already tracks most of this, so I don't think this places a huge new burden on the jobtracker.

I don't like polluting the job's output directory with log data since it would require changes to the InputFormat implementations and other code to make them skip this specially named sub-directory (unless the name begins with a dot, which the fs code already ignores). In any case, we could add an option to JobClient to log to the job's output fs.

> Error Reporting/logging in MapReduce
> ------------------------------------
>
>          Key: HADOOP-92
>          URL: http://issues.apache.org/jira/browse/HADOOP-92
>      Project: Hadoop
>         Type: Bug
>   Components: mapred
>     Reporter: Mahadev konar
>     Priority: Minor
>
> Currently Mapreduce does not tell you which machine failed to execute the
> task. Also, it would be nice to have features wherein there is a log report
> with each job, saying the number of tasks it ran (reporting which one failed
> and on which machine, listing any error information it can) with the
> start/end/execute time of each task.
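To make the queue-and-poll idea from the comment above concrete, here is a minimal Java sketch. It is only an illustration under stated assumptions: `TaskEvent`, `EventQueue`, and `PollingClient` are hypothetical names invented for this example and are not part of Hadoop's JobTracker or JobClient code. The sketch shows a tracker-side queue of the three event kinds listed in the comment and a client that fetches only the events it has not yet seen on each poll, logging them to standard error unless told otherwise.

```java
import java.io.PrintStream;
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch: none of these types exist in Hadoop under these names.
final class TaskEvent {
    enum Kind { START, COMPLETION, FAILURE }

    final Kind kind;
    final String taskId;
    final String detail; // type and host for START, error message for FAILURE, empty otherwise

    TaskEvent(Kind kind, String taskId, String detail) {
        this.kind = kind;
        this.taskId = taskId;
        this.detail = detail;
    }

    @Override
    public String toString() {
        return detail.isEmpty() ? kind + " " + taskId : kind + " " + taskId + " " + detail;
    }
}

// Stand-in for the job tracker side: queues events until a client asks for them.
final class EventQueue {
    private final List<TaskEvent> events = new ArrayList<>();

    synchronized void report(TaskEvent event) {
        events.add(event);
    }

    // Returns a copy of the events from index `from` onward, so a polling
    // client can resume where its previous poll stopped.
    synchronized List<TaskEvent> eventsSince(int from) {
        int start = Math.min(from, events.size());
        return new ArrayList<>(events.subList(start, events.size()));
    }
}

// Stand-in for the client side: fetches new events on each status poll and
// decides where to log them (standard error by default).
final class PollingClient {
    private final EventQueue tracker;
    private final PrintStream log;
    private int next = 0; // index of the first event not yet retrieved

    PollingClient(EventQueue tracker) {
        this(tracker, System.err);
    }

    PollingClient(EventQueue tracker, PrintStream log) {
        this.tracker = tracker;
        this.log = log;
    }

    // Logs every event reported since the previous poll; returns how many there were.
    int poll() {
        List<TaskEvent> fresh = tracker.eventsSince(next);
        for (TaskEvent event : fresh) {
            log.println(event);
        }
        next += fresh.size();
        return fresh.size();
    }
}
```

Passing a different `PrintStream` to `PollingClient` is one way the "option to log to the job's output fs" could be modeled: the client, not the tracker, chooses the destination. A real tracker would also need to bound or expire the queue, which this sketch leaves out.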
