Hi all,

I'm a bit confused by the way logging works on Hadoop.
In short, my question is: where does the log output from my Nutch plugins end up when running on Hadoop?

I'm running Nutch 0.9 on Hadoop 0.12.2.
When I run my code on a single machine, the log output ends up in ${hadoop.log.dir}/${hadoop.log.file}, as defined in the log4j.properties file (Nutch and my plugins use commons-logging). But when I run on Hadoop, I can't find any log file containing entries generated by the map or reduce tasks. I do, however, find log files containing output from Hadoop's own classes (the tasktracker, jobtracker, etc.).

At first I thought it had something to do with HADOOP-406 (http://issues.apache.org/jira/browse/HADOOP-406), which is about the fact that parameters passed to the parent JVM (like 'hadoop.log.file') are not passed on to the child JVMs. But even when I specify my log file explicitly in log4j.properties (without using environment variables), I still see no log entries other than those from the Hadoop classes.
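
To make "explicitly" concrete, here is a rough sketch of what a log4j.properties with a hard-coded file looks like. This is an illustration only, not a copy of the actual configuration: the appender name FILE and the path /tmp/nutch-plugins.log are assumptions chosen for the example, and the point is simply that nothing in it depends on ${hadoop.log.dir} or ${hadoop.log.file}.

```properties
# Sketch only: appender name and path are illustrative assumptions.
# Route everything at INFO and above to a single file appender.
log4j.rootLogger=INFO, FILE

# Plain log4j 1.2 file appender with an absolute, hard-coded path,
# so no ${hadoop.log.dir} / ${hadoop.log.file} substitution is needed.
log4j.appender.FILE=org.apache.log4j.FileAppender
log4j.appender.FILE.File=/tmp/nutch-plugins.log
log4j.appender.FILE.layout=org.apache.log4j.PatternLayout
log4j.appender.FILE.layout.ConversionPattern=%d{ISO8601} %-5p %c - %m%n
```

With a configuration along these lines, the path would be resolved on whichever node runs the task, so any such file would have to be looked for on the tasktracker machines rather than on the machine that submitted the job.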

Any clues?
Mathijs
