got it. that sounds reasonable
On Fri, Feb 7, 2014 at 2:31 PM, Patrick Wendell <pwend...@gmail.com> wrote:
> Koert - my suggestion was this. We let users use any slf4j backend
> they want. If we detect that they are using the log4j backend and
> *also* they didn't configure any log4j appenders, we set up some nice
> defaults for them. If they are using another backend, Spark doesn't
> try to modify the configuration at all.
>
> On Fri, Feb 7, 2014 at 11:14 AM, Koert Kuipers <ko...@tresata.com> wrote:
> > well "static binding" is probably the wrong terminology, but you get
> > the idea. multiple backends are not allowed and cause an even uglier
> > warning...
> >
> > see also here:
> > https://github.com/twitter/scalding/pull/636
> > and here:
> > https://groups.google.com/forum/#!topic/cascading-user/vYvnnN_15ls
> > all me being annoying and complaining about slf4j-log4j12 dependencies
> > (which did get removed).
> >
> > On Fri, Feb 7, 2014 at 2:09 PM, Koert Kuipers <ko...@tresata.com> wrote:
> >
> >> the issue is that slf4j uses static binding. you can put only one
> >> slf4j backend on the classpath, and that's what it uses. more than
> >> one is not allowed.
> >>
> >> so you either keep the slf4j-log4j12 dependency for spark, and then
> >> you took away people's choice of slf4j backend, which is considered
> >> bad form for a library, or you do not include it, and then people
> >> will always get the big fat ugly warning and slf4j logging will not
> >> flow to log4j.
> >>
> >> including log4j itself is not necessarily a problem, i think?
> >>
> >> On Fri, Feb 7, 2014 at 1:11 PM, Patrick Wendell <pwend...@gmail.com> wrote:
> >>
> >>> This also seems relevant - but not my area of expertise (whether
> >>> this is a valid way to check this).
> >>>
> >>> http://stackoverflow.com/questions/10505418/how-to-find-which-library-slf4j-has-bound-itself-to
> >>>
> >>> On Fri, Feb 7, 2014 at 10:08 AM, Patrick Wendell <pwend...@gmail.com> wrote:
> >>> > Hey Guys,
> >>> >
> >>> > Thanks for explaining. Yeah, this is a problem - we didn't really
> >>> > know that people are using other slf4j backends. slf4j is in there
> >>> > for historical reasons, but I think we may assume in a few places
> >>> > that log4j is being used, and we should minimize those.
> >>> >
> >>> > We should patch this and get a fix into 0.9.1. So some solutions I
> >>> > see are:
> >>> >
> >>> > (a) Add a SparkConf option to disable this. I'm fine with this one.
> >>> >
> >>> > (b) Ask slf4j which backend is active and only try to enforce this
> >>> > default if we know slf4j is using log4j. Do either of you know if
> >>> > this is possible? Not sure if slf4j exposes this.
> >>> >
> >>> > (c) Just remove this default stuff. We'd rather not do this. The
> >>> > goal of this thing is to provide good usability for people who have
> >>> > linked against Spark and haven't done anything to configure logging.
> >>> > For beginners we try to minimize the assumptions about what else
> >>> > they know about, and I've found log4j configuration is a huge mental
> >>> > barrier for people who are getting started.
> >>> >
> >>> > Paul, if you submit a patch doing (a), we can merge it in. If you
> >>> > have any idea if (b) is possible, I prefer that one, but it may not
> >>> > be possible or might be brittle.
> >>> >
> >>> > - Patrick
> >>> >
> >>> > On Fri, Feb 7, 2014 at 6:36 AM, Koert Kuipers <ko...@tresata.com> wrote:
> >>> >> Totally agree with Paul: a library should not pick the slf4j
> >>> >> backend. It defeats the purpose of slf4j. That big ugly warning is
> >>> >> there to alert people that it's their responsibility to pick the
> >>> >> backend...
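Option (b) - asking slf4j which backend it bound to - can be sketched along the lines of the StackOverflow link above. A minimal, hedged sketch, not Spark code: the factory class name `org.slf4j.impl.Log4jLoggerFactory` is the one the slf4j-log4j12 binding is known to use, and the reflection is only there so the snippet compiles and runs even without slf4j on the classpath.

```java
public class Slf4jBindingCheck {
    // Ask slf4j which backend it is bound to. Done via reflection so
    // this compiles without a compile-time slf4j dependency; if slf4j
    // is absent entirely, we conservatively report "not log4j".
    static boolean boundToLog4j() {
        try {
            Object factory = Class.forName("org.slf4j.LoggerFactory")
                    .getMethod("getILoggerFactory")
                    .invoke(null);
            return "org.slf4j.impl.Log4jLoggerFactory"
                    .equals(factory.getClass().getName());
        } catch (ReflectiveOperationException e) {
            return false; // no slf4j on the classpath at all
        }
    }

    public static void main(String[] args) {
        System.out.println(boundToLog4j());
    }
}
```

If this reports false (another backend, or the log4j-over-slf4j bridge), Spark could skip touching the log4j configuration entirely.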
> >>> >> On Feb 7, 2014 3:55 AM, "Paul Brown" <p...@mult.ifario.us> wrote:
> >>> >>
> >>> >>> Hi, Patrick --
> >>> >>>
> >>> >>> From slf4j, you can either backend it into log4j (which is the
> >>> >>> way that Spark is shipped) or you can route log4j through slf4j
> >>> >>> and then on to a different backend (e.g., logback). We're doing
> >>> >>> the latter and manipulating the dependencies in the build because
> >>> >>> that's the way the enclosing application is set up.
> >>> >>>
> >>> >>> The issue with the current situation is that there's no way for
> >>> >>> an end user to choose to *not* use the log4j backend. (My
> >>> >>> short-term solution was to use the Maven shade plugin to swap in
> >>> >>> a version of the Logging trait with the body of that method
> >>> >>> commented out.) In addition to the situation with
> >>> >>> log4j-over-slf4j and the empty enumeration of ROOT appenders, you
> >>> >>> might also run afoul of someone who intentionally configured
> >>> >>> log4j with an empty set of appenders at the time that Spark is
> >>> >>> initializing.
> >>> >>>
> >>> >>> I'd be happy with any implementation that lets me choose my
> >>> >>> logging backend: override default behavior via system property,
> >>> >>> plug-in architecture, etc. I do think it's reasonable to expect
> >>> >>> someone digesting a substantial JDK-based system like Spark to
> >>> >>> understand how to initialize logging -- surely they're using
> >>> >>> logging of some kind elsewhere in their application -- but if you
> >>> >>> want the default behavior there as a courtesy, it might be worth
> >>> >>> putting an INFO (versus the glaring log4j WARN) message on the
> >>> >>> output that says something like "Initialized default logging via
> >>> >>> Log4J; pass -Dspark.logging.loadDefaultLogger=false to disable
> >>> >>> this behavior." so that it's both convenient and explicit.
> >>> >>>
> >>> >>> Cheers.
> >>> >>> -- Paul
> >>> >>>
> >>> >>> --
> >>> >>> p...@mult.ifario.us | Multifarious, Inc. | http://mult.ifario.us/
> >>> >>>
> >>> >>> On Fri, Feb 7, 2014 at 12:05 AM, Patrick Wendell <pwend...@gmail.com> wrote:
> >>> >>>
> >>> >>> > A config option e.g. could just be to add:
> >>> >>> >
> >>> >>> > spark.logging.loadDefaultLogger (default true)
> >>> >>> > If set to true, Spark will try to initialize a log4j logger if
> >>> >>> > none is detected. Otherwise Spark will not modify logging
> >>> >>> > behavior.
> >>> >>> >
> >>> >>> > Then users could just set this to false if they have a logging
> >>> >>> > set-up that conflicts with this.
> >>> >>> >
> >>> >>> > Maybe there is a nicer fix...
> >>> >>> >
> >>> >>> > On Fri, Feb 7, 2014 at 12:03 AM, Patrick Wendell <pwend...@gmail.com> wrote:
> >>> >>> > > Hey Paul,
> >>> >>> > >
> >>> >>> > > Thanks for digging this up. I worked on this feature, and the
> >>> >>> > > intent was to give users good default behavior if they didn't
> >>> >>> > > include any logging configuration on the classpath.
> >>> >>> > >
> >>> >>> > > The problem with assuming that CL tooling is going to fix the
> >>> >>> > > job is that many people link against Spark as a library and
> >>> >>> > > run their application using their own scripts. In this case,
> >>> >>> > > the first thing people see when they run an application that
> >>> >>> > > links against Spark is a big ugly logging warning.
> >>> >>> > >
> >>> >>> > > I'm not super familiar with log4j-over-slf4j, but this
> >>> >>> > > behavior of returning an empty enumeration for the appenders
> >>> >>> > > seems a little weird. What is the use case for using this and
> >>> >>> > > not just directly using slf4j-log4j12 like Spark itself does?
> >>> >>> > >
> >>> >>> > > Did you have a more general fix for this in mind?
> >>> >>> > > Or was your plan to just revert the existing behavior... We
> >>> >>> > > might be able to add a configuration option to disable this
> >>> >>> > > logging default stuff. Or we could just rip it out - but I'd
> >>> >>> > > like to avoid that if possible.
> >>> >>> > >
> >>> >>> > > - Patrick
> >>> >>> > >
> >>> >>> > > On Thu, Feb 6, 2014 at 11:41 PM, Paul Brown <p...@mult.ifario.us> wrote:
> >>> >>> > >> We have a few applications that embed Spark, and in 0.8.0
> >>> >>> > >> and 0.8.1, we were able to use slf4j, but 0.9.0 broke that
> >>> >>> > >> and unintentionally forces direct use of log4j as the
> >>> >>> > >> logging backend.
> >>> >>> > >>
> >>> >>> > >> The issue is here in the org.apache.spark.Logging trait:
> >>> >>> > >>
> >>> >>> > >> https://github.com/apache/incubator-spark/blame/master/core/src/main/scala/org/apache/spark/Logging.scala#L107
> >>> >>> > >>
> >>> >>> > >> log4j-over-slf4j *always* returns an empty enumeration for
> >>> >>> > >> appenders to the ROOT logger:
> >>> >>> > >>
> >>> >>> > >> https://github.com/qos-ch/slf4j/blob/master/log4j-over-slf4j/src/main/java/org/apache/log4j/Category.java?source=c#L81
> >>> >>> > >>
> >>> >>> > >> And this causes an infinite loop and an eventual stack
> >>> >>> > >> overflow.
> >>> >>> > >>
> >>> >>> > >> I'm happy to submit a Jira and a patch, but it would be a
> >>> >>> > >> significant enough reversal of recent changes that it's
> >>> >>> > >> probably worth discussing before I sink a half hour into it.
> >>> >>> > >> My suggestion would be that initialization (or not) should
> >>> >>> > >> be left to the user, with reasonable default behavior
> >>> >>> > >> supplied by the Spark command-line tooling and not forced on
> >>> >>> > >> applications that incorporate Spark.
> >>> >>> > >>
> >>> >>> > >> Thoughts/opinions?
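The failure mode Paul describes can be made concrete without any log4j dependency: since log4j-over-slf4j's Category.getAllAppenders() hands back an empty enumeration, a "has log4j been configured?" probe of the ROOT logger can never succeed under that bridge. A hedged sketch - looksInitialized is an illustrative stand-in for the check in Logging.scala, not the actual code:

```java
import java.util.Collections;
import java.util.Enumeration;

public class AppenderCheck {
    // A probe in the style of "does the ROOT logger have appenders?".
    // Under the real log4j this distinguishes configured from
    // unconfigured; under log4j-over-slf4j it always says "no".
    static boolean looksInitialized(Enumeration<?> rootAppenders) {
        return rootAppenders.hasMoreElements();
    }

    public static void main(String[] args) {
        // What log4j-over-slf4j effectively returns for ROOT appenders:
        Enumeration<?> fromBridge = Collections.emptyEnumeration();
        System.out.println(looksInitialized(fromBridge)); // always false
    }
}
```

So the "uninitialized" branch is taken unconditionally, default setup is attempted through the bridge, and the re-entrant logging during that setup is what ends in the stack overflow.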
> >>> >>> > >> -- Paul
> >>> >>> > >>
> >>> >>> > >> --
> >>> >>> > >> p...@mult.ifario.us | Multifarious, Inc. | http://mult.ifario.us/
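For completeness, the opt-out that Patrick and Paul converge on above might look roughly like this. A hedged sketch: the property name spark.logging.loadDefaultLogger comes from the thread, but the helper class and method are hypothetical, not anything that shipped in Spark.

```java
public class LoggingDefaultsFlag {
    // Gate Spark's default log4j setup behind a system property,
    // defaulting to true so out-of-the-box behavior is unchanged.
    // Users with their own slf4j backend pass
    // -Dspark.logging.loadDefaultLogger=false to opt out.
    static boolean loadDefaultLogger() {
        return Boolean.parseBoolean(
                System.getProperty("spark.logging.loadDefaultLogger", "true"));
    }

    public static void main(String[] args) {
        System.out.println(loadDefaultLogger()); // true (default)
        System.setProperty("spark.logging.loadDefaultLogger", "false");
        System.out.println(loadDefaultLogger()); // false
    }
}
```

Combined with a binding check like option (b), the default setup would then only run when the flag is true and slf4j is actually bound to log4j.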