On Saturday, 2012-09-22, Josh Wills wrote:
[...]
> Perhaps I'm being dense-- once we remove log4j.properties from the
> core, won't we stop logging the Crunch status information unless the
> developer explicitly configures it in their own log4j.properties?
> That may of course be what the developer desires, but my thought was
> that we typically do want that information logged, and that
> repeating it in every log4j.properties file for every project would
> be tedious.

At second glance it seems there are a few misconceptions about how logging works with Hadoop.

When running from the IDE with LocalJobRunner, log4j.properties has no effect because there's no log4j on the classpath; commons-logging just uses java.util.logging, which logs at INFO level. Users who have log4j on the classpath will see different results, but they will have to add it themselves, as neither hadoop-client nor hadoop-core will provide it for them.

When running jobs using "hadoop jar", our log4j.properties doesn't have an effect either, because Hadoop's conf/log4j.properties takes precedence (even with HADOOP_USER_CLASSPATH_FIRST). This surprises me, to be honest, but Crunch's INFO messages are still printed, which is good.

What also surprises me is that the log4j setup code in enableDebug() doesn't seem to have any effect; AFAICS, Hadoop INFO messages already appear on the console. All I get when calling enableDebug() is the log message "Could not find console appender 'A'". Is there something broken? What's the intention behind the log4j code in enableDebug()? Unless I'm overlooking something (quite possible), this seems like a no-op.

> Assuming that is the case, perhaps we could add a default
> log4j.properties file to the maven archetype that generates Crunch
> projects?

Ah, that reminds me: We haven't decided yet if we want an archetype in Crunch.

Regards,
Matthias
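
P.S. For anyone who wants to check the java.util.logging fallback described above, here is a minimal sketch. It uses only the JDK and assumes the stock JDK logging configuration (no custom logging.properties); the class name JulDefaultsCheck and the logger name "example.crunch.status" are made up for illustration and are not part of Crunch or Hadoop. It does not involve commons-logging itself; it only shows what the underlying java.util.logging defaults look like.

```java
import java.util.logging.Level;
import java.util.logging.Logger;

// Hypothetical helper, not part of Crunch: inspects java.util.logging defaults.
public class JulDefaultsCheck {

    // Walks up the logger hierarchy until a logger with an explicitly
    // set level is found; child loggers inherit that level.
    static Level effectiveLevel(Logger logger) {
        Logger current = logger;
        while (current != null) {
            Level level = current.getLevel();
            if (level != null) {
                return level;
            }
            current = current.getParent();
        }
        return null; // not expected: the root logger normally has a level
    }

    public static void main(String[] args) {
        // Logger name is an assumption chosen for this example.
        Logger log = Logger.getLogger("example.crunch.status");

        System.out.println("effective level: " + effectiveLevel(log));
        System.out.println("INFO enabled: " + log.isLoggable(Level.INFO));
        System.out.println("FINE enabled: " + log.isLoggable(Level.FINE));

        // With an unmodified JDK configuration, only the first of these
        // two messages should reach the console.
        log.info("status message at INFO");
        log.fine("debug message at FINE");
    }
}
```

Running this from the IDE without log4j on the classpath should give a rough picture of what a user sees by default, independent of any log4j.properties we ship.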
