got it. that sounds reasonable
On Fri, Feb 7, 2014 at 2:31 PM, Patrick Wendell <pwend...@gmail.com> wrote:
> Koert - my suggestion was this. We let users use any slf4j backend
> they want. If we detect that they are using the log4j backend and
> *also* they didn't configure any log4j appenders, we set up some nice
> defaults for them. If they are using another backend, Spark doesn't
> try to modify the configuration at all.
>
> On Fri, Feb 7, 2014 at 11:14 AM, Koert Kuipers <ko...@tresata.com> wrote:
> > well "static binding" is probably the wrong terminology, but you get
> > the idea. multiple backends are not allowed and cause an even uglier
> > warning...
> >
> > see also here:
> > https://github.com/twitter/scalding/pull/636
> > and here:
> > https://groups.google.com/forum/#!topic/cascading-user/vYvnnN_15ls
> > all me being annoying and complaining about slf4j-log4j12 dependencies
> > (which did get removed).
> >
> > On Fri, Feb 7, 2014 at 2:09 PM, Koert Kuipers <ko...@tresata.com> wrote:
> >
> >> the issue is that slf4j uses static binding. you can put only one
> >> slf4j backend on the classpath, and that's what it uses. more than
> >> one is not allowed.
> >>
> >> so you either keep the slf4j-log4j12 dependency for spark, and then
> >> you took away people's choice of slf4j backend, which is considered
> >> bad form for a library, or you do not include it, and then people
> >> will always get the big fat ugly warning and slf4j logging will not
> >> flow to log4j.
> >>
> >> including log4j itself is not necessarily a problem, i think?
> >>
> >> On Fri, Feb 7, 2014 at 1:11 PM, Patrick Wendell <pwend...@gmail.com> wrote:
> >>
> >>> This also seems relevant - but not my area of expertise (whether
> >>> this is a valid way to check this).
> >>>
> >>> http://stackoverflow.com/questions/10505418/how-to-find-which-library-slf4j-has-bound-itself-to
> >>>
> >>> On Fri, Feb 7, 2014 at 10:08 AM, Patrick Wendell <pwend...@gmail.com> wrote:
> >>> > Hey Guys,
> >>> >
> >>> > Thanks for explaining. Yeah, this is a problem - we didn't really
> >>> > know that people are using other slf4j backends. slf4j is in there
> >>> > for historical reasons, but I think we may assume in a few places
> >>> > that log4j is being used, and we should minimize those.
> >>> >
> >>> > We should patch this and get a fix into 0.9.1. So some solutions I
> >>> > see are:
> >>> >
> >>> > (a) Add a SparkConf option to disable this. I'm fine with this one.
> >>> >
> >>> > (b) Ask slf4j which backend is active and only try to enforce this
> >>> > default if we know slf4j is using log4j. Do either of you know if
> >>> > this is possible? Not sure if slf4j exposes this.
> >>> >
> >>> > (c) Just remove this default stuff. We'd rather not do this. The
> >>> > goal of this thing is to provide good usability for people who have
> >>> > linked against Spark and haven't done anything to configure logging.
> >>> > For beginners we try to minimize the assumptions about what else
> >>> > they know about, and I've found log4j configuration is a huge mental
> >>> > barrier for people who are getting started.
> >>> >
> >>> > Paul, if you submit a patch doing (a), we can merge it in. If you
> >>> > have any idea if (b) is possible, I prefer that one, but it may not
> >>> > be possible or might be brittle.
> >>> >
> >>> > - Patrick
> >>> >
> >>> > On Fri, Feb 7, 2014 at 6:36 AM, Koert Kuipers <ko...@tresata.com> wrote:
> >>> >> Totally agree with Paul: a library should not pick the slf4j
> >>> >> backend. It defeats the purpose of slf4j. That big ugly warning is
> >>> >> there to alert people that it's their responsibility to pick the
> >>> >> backend...
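Option (b) - asking slf4j which backend it bound to - can be sketched along the lines of the StackOverflow link above. A minimal, hedged sketch, not Spark code: the factory class name `org.slf4j.impl.Log4jLoggerFactory` is the one the slf4j-log4j12 binding is known to use, and the reflection is only there so the snippet compiles and runs even without slf4j on the classpath.

```java
public class Slf4jBindingCheck {
    // Ask slf4j which backend it is bound to. Done via reflection so
    // this compiles without a compile-time slf4j dependency; if slf4j
    // is absent entirely, we conservatively report "not log4j".
    static boolean boundToLog4j() {
        try {
            Object factory = Class.forName("org.slf4j.LoggerFactory")
                    .getMethod("getILoggerFactory")
                    .invoke(null);
            return "org.slf4j.impl.Log4jLoggerFactory"
                    .equals(factory.getClass().getName());
        } catch (ReflectiveOperationException e) {
            return false; // no slf4j on the classpath at all
        }
    }

    public static void main(String[] args) {
        System.out.println(boundToLog4j());
    }
}
```

If this reports false (another backend, or the log4j-over-slf4j bridge), Spark could skip touching the log4j configuration entirely.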
> >>> >> On Feb 7, 2014 3:55 AM, "Paul Brown" <p...@mult.ifario.us> wrote:
> >>> >>
> >>> >>> Hi, Patrick --
> >>> >>>
> >>> >>> From slf4j, you can either backend it into log4j (which is the
> >>> >>> way that Spark is shipped) or you can route log4j through slf4j
> >>> >>> and then on to a different backend (e.g., logback). We're doing
> >>> >>> the latter and manipulating the dependencies in the build because
> >>> >>> that's the way the enclosing application is set up.
> >>> >>>
> >>> >>> The issue with the current situation is that there's no way for
> >>> >>> an end user to choose to *not* use the log4j backend. (My
> >>> >>> short-term solution was to use the Maven shade plugin to swap in
> >>> >>> a version of the Logging trait with the body of that method
> >>> >>> commented out.) In addition to the situation with
> >>> >>> log4j-over-slf4j and the empty enumeration of ROOT appenders, you
> >>> >>> might also run afoul of someone who intentionally configured
> >>> >>> log4j with an empty set of appenders at the time that Spark is
> >>> >>> initializing.
> >>> >>>
> >>> >>> I'd be happy with any implementation that lets me choose my
> >>> >>> logging backend: override default behavior via system property,
> >>> >>> plug-in architecture, etc. I do think it's reasonable to expect
> >>> >>> someone digesting a substantial JDK-based system like Spark to
> >>> >>> understand how to initialize logging -- surely they're using
> >>> >>> logging of some kind elsewhere in their application -- but if you
> >>> >>> want the default behavior there as a courtesy, it might be worth
> >>> >>> putting an INFO (versus the glaring log4j WARN) message on the
> >>> >>> output that says something like "Initialized default logging via
> >>> >>> Log4J; pass -Dspark.logging.loadDefaultLogger=false to disable
> >>> >>> this behavior." so that it's both convenient and explicit.
> >>> >>>
> >>> >>> Cheers.
> >>> >>> -- Paul
> >>> >>>
> >>> >>> --
> >>> >>> p...@mult.ifario.us | Multifarious, Inc. | http://mult.ifario.us/
> >>> >>>
> >>> >>> On Fri, Feb 7, 2014 at 12:05 AM, Patrick Wendell <pwend...@gmail.com> wrote:
> >>> >>>
> >>> >>> > A config option e.g. could just be to add:
> >>> >>> >
> >>> >>> > spark.logging.loadDefaultLogger (default true)
> >>> >>> > If set to true, Spark will try to initialize a log4j logger if
> >>> >>> > none is detected. Otherwise Spark will not modify logging
> >>> >>> > behavior.
> >>> >>> >
> >>> >>> > Then users could just set this to false if they have a logging
> >>> >>> > set-up that conflicts with this.
> >>> >>> >
> >>> >>> > Maybe there is a nicer fix...
> >>> >>> >
> >>> >>> > On Fri, Feb 7, 2014 at 12:03 AM, Patrick Wendell <pwend...@gmail.com> wrote:
> >>> >>> > > Hey Paul,
> >>> >>> > >
> >>> >>> > > Thanks for digging this up. I worked on this feature, and the
> >>> >>> > > intent was to give users good default behavior if they didn't
> >>> >>> > > include any logging configuration on the classpath.
> >>> >>> > >
> >>> >>> > > The problem with assuming that CL tooling is going to fix the
> >>> >>> > > job is that many people link against Spark as a library and
> >>> >>> > > run their application using their own scripts. In this case,
> >>> >>> > > the first thing people see when they run an application that
> >>> >>> > > links against Spark is a big ugly logging warning.
> >>> >>> > >
> >>> >>> > > I'm not super familiar with log4j-over-slf4j, but this
> >>> >>> > > behavior of returning an empty enumeration for the appenders
> >>> >>> > > seems a little weird. What is the use case for using this and
> >>> >>> > > not just directly using slf4j-log4j12 like Spark itself does?
> >>> >>> > >
> >>> >>> > > Did you have a more general fix for this in mind?
> >>> >>> > > Or was your plan to just revert the existing behavior... We
> >>> >>> > > might be able to add a configuration option to disable this
> >>> >>> > > logging default stuff. Or we could just rip it out - but I'd
> >>> >>> > > like to avoid that if possible.
> >>> >>> > >
> >>> >>> > > - Patrick
> >>> >>> > >
> >>> >>> > > On Thu, Feb 6, 2014 at 11:41 PM, Paul Brown <p...@mult.ifario.us> wrote:
> >>> >>> > >> We have a few applications that embed Spark, and in 0.8.0
> >>> >>> > >> and 0.8.1, we were able to use slf4j, but 0.9.0 broke that
> >>> >>> > >> and unintentionally forces direct use of log4j as the
> >>> >>> > >> logging backend.
> >>> >>> > >>
> >>> >>> > >> The issue is here in the org.apache.spark.Logging trait:
> >>> >>> > >>
> >>> >>> > >> https://github.com/apache/incubator-spark/blame/master/core/src/main/scala/org/apache/spark/Logging.scala#L107
> >>> >>> > >>
> >>> >>> > >> log4j-over-slf4j *always* returns an empty enumeration for
> >>> >>> > >> appenders to the ROOT logger:
> >>> >>> > >>
> >>> >>> > >> https://github.com/qos-ch/slf4j/blob/master/log4j-over-slf4j/src/main/java/org/apache/log4j/Category.java?source=c#L81
> >>> >>> > >>
> >>> >>> > >> And this causes an infinite loop and an eventual stack
> >>> >>> > >> overflow.
> >>> >>> > >>
> >>> >>> > >> I'm happy to submit a Jira and a patch, but it would be a
> >>> >>> > >> significant enough reversal of recent changes that it's
> >>> >>> > >> probably worth discussing before I sink a half hour into it.
> >>> >>> > >> My suggestion would be that initialization (or not) should
> >>> >>> > >> be left to the user, with reasonable default behavior
> >>> >>> > >> supplied by the Spark command-line tooling and not forced on
> >>> >>> > >> applications that incorporate Spark.
> >>> >>> > >>
> >>> >>> > >> Thoughts/opinions?
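The failure mode Paul describes can be made concrete without any log4j dependency: since log4j-over-slf4j's Category.getAllAppenders() hands back an empty enumeration, a "has log4j been configured?" probe of the ROOT logger can never succeed under that bridge. A hedged sketch - looksInitialized is an illustrative stand-in for the check in Logging.scala, not the actual code:

```java
import java.util.Collections;
import java.util.Enumeration;

public class AppenderCheck {
    // A probe in the style of "does the ROOT logger have appenders?".
    // Under the real log4j this distinguishes configured from
    // unconfigured; under log4j-over-slf4j it always says "no".
    static boolean looksInitialized(Enumeration<?> rootAppenders) {
        return rootAppenders.hasMoreElements();
    }

    public static void main(String[] args) {
        // What log4j-over-slf4j effectively returns for ROOT appenders:
        Enumeration<?> fromBridge = Collections.emptyEnumeration();
        System.out.println(looksInitialized(fromBridge)); // always false
    }
}
```

So the "uninitialized" branch is taken unconditionally, default setup is attempted through the bridge, and the re-entrant logging during that setup is what ends in the stack overflow.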
> >>> >>> > >> -- Paul
> >>> >>> > >>
> >>> >>> > >> --
> >>> >>> > >> p...@mult.ifario.us | Multifarious, Inc. | http://mult.ifario.us/
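For completeness, the opt-out that Patrick and Paul converge on above might look roughly like this. A hedged sketch: the property name spark.logging.loadDefaultLogger comes from the thread, but the helper class and method are hypothetical, not anything that shipped in Spark.

```java
public class LoggingDefaultsFlag {
    // Gate Spark's default log4j setup behind a system property,
    // defaulting to true so out-of-the-box behavior is unchanged.
    // Users with their own slf4j backend pass
    // -Dspark.logging.loadDefaultLogger=false to opt out.
    static boolean loadDefaultLogger() {
        return Boolean.parseBoolean(
                System.getProperty("spark.logging.loadDefaultLogger", "true"));
    }

    public static void main(String[] args) {
        System.out.println(loadDefaultLogger()); // true (default)
        System.setProperty("spark.logging.loadDefaultLogger", "false");
        System.out.println(loadDefaultLogger()); // false
    }
}
```

Combined with a binding check like option (b), the default setup would then only run when the flag is true and slf4j is actually bound to log4j.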