It would be a good idea to have Mapper and Reducer expose a getLogger() method, perhaps via a separate interface like Loggable. The logger would be initialized when the map and reduce tasks are initialized, and named with the job id as the suffix - like hadoop.mapred.jobs.<jobid>. This lets user-written map/reduce code log to a common logger for the job, and Hadoop code can log to the same logger in case of failures etc.
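Just to make the idea concrete, here is a minimal sketch. The Loggable interface and the loggerNameFor() helper are hypothetical names for illustration, and java.util.logging stands in for whatever logging framework we'd actually use:

```java
import java.util.logging.Logger;

// Hypothetical interface - not an existing Hadoop API.
// Mapper and Reducer would extend/implement this.
interface Loggable {
    Logger getLogger();
}

public class JobLoggerSketch {
    // Builds the per-job logger name with the job id as the suffix,
    // e.g. "hadoop.mapred.jobs.job_0001". Helper name is made up.
    static String loggerNameFor(String jobId) {
        return "hadoop.mapred.jobs." + jobId;
    }

    public static void main(String[] args) {
        // The framework would do this at task initialization and hand
        // the logger to user code via getLogger().
        Logger jobLogger = Logger.getLogger(loggerNameFor("job_0001"));
        jobLogger.info("map task started");
    }
}
```

Since the logger name is derived purely from the job id, every task of the same job - on any node - resolves to the same logical logger.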

These logs can then be redirected to a job-specific hadoop directory, so logs from all nodes running tasks of the same MR job end up in a single directory in DFS. This would also separate hadoop's internal logs from the user logs without any logging configuration on the user's part.
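The redirection could be driven entirely by framework-side logging config keyed on the common logger name prefix. A rough log4j-style sketch (appender name and file path are placeholders, and <jobid> would be substituted by the framework at task startup, not by log4j):

```
# Route all per-job loggers to a dedicated appender, not the main hadoop log.
log4j.logger.hadoop.mapred.jobs=INFO, jobfile
log4j.additivity.hadoop.mapred.jobs=false

log4j.appender.jobfile=org.apache.log4j.FileAppender
log4j.appender.jobfile.File=${hadoop.log.dir}/jobs/<jobid>.log
log4j.appender.jobfile.layout=org.apache.log4j.PatternLayout
log4j.appender.jobfile.layout.ConversionPattern=%d %p %c: %m%n
```

The per-node job log files would then be copied/aggregated into the job's directory in DFS.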

thoughts?

~Sanjay
