Hi,

I'd like to discuss two things regarding logging and debugging:

1) Crunch currently ships a log4j.properties which can take precedence
over users' own log4j.properties, depending on classpath order.
Libraries should never ship logging config, because it forces users to
repackage the library if they want to use their own. Our Nexus at work
has a nice collection of libs repackaged for exactly this reason.

2) Discussion about Pipeline.enableDebug() came up in CRUNCH-70. I
believe it really shouldn't mess with logging configuration. Right now
it bypasses the commons-logging facade and accesses log4j directly,
causing a compile-time dependency on log4j. It changes VM-wide state
beyond Crunch, as other Hadoop-related code executed afterwards will
see the changed logging config, too. And, most importantly, configuring
logging is the responsibility of the operations team, not the
developer. Admins are used to log4j.properties; we shouldn't invent
another non-standard way of doing things that overrides the usual
way.

My vote for 1) would be to remove our log4j.properties.

For 2) I think the best solution would be to offer an example
log4j.properties in our documentation (section "Debugging Your
Pipelines" or something) that has the effect Pipeline.enableDebug()
has now.
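
To make this concrete, such an example might look roughly like the
sketch below. It uses plain log4j 1.x properties syntax. The logger
names (org.apache.crunch, org.apache.hadoop), the levels, and the
conversion pattern are assumptions for illustration and would need to
be checked against what Pipeline.enableDebug() actually sets.

```properties
# Sketch only: logger names, levels and pattern are assumed, not taken
# from the current enableDebug() implementation.

# Keep everything quiet by default and log to the console.
log4j.rootLogger=WARN, console

log4j.appender.console=org.apache.log4j.ConsoleAppender
log4j.appender.console.layout=org.apache.log4j.PatternLayout
log4j.appender.console.layout.ConversionPattern=%d{ISO8601} %-5p [%t] %c{2}: %m%n

# Verbose output for Crunch itself (assumed package name).
log4j.logger.org.apache.crunch=DEBUG

# Surface Hadoop warnings without the usual INFO noise.
log4j.logger.org.apache.hadoop=WARN
```

Users would put this on their own classpath or point to it with
-Dlog4j.configuration, so nothing changes in code and admins stay in
control.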

Regards,
  Matthias
