Thanks Paul - it isn't meant to be a "full solution", just a fix for
the 0.9 branch - for the full solution there is another PR by Sean
Owen.

On Sun, Feb 9, 2014 at 11:35 PM, Paul Brown <p...@mult.ifario.us> wrote:
> Hi, Patrick --
>
> I gave that a go locally, and it works as desired.
>
> Best.
> -- Paul
>
> --
> p...@mult.ifario.us | Multifarious, Inc. | http://mult.ifario.us/
>
>
> On Fri, Feb 7, 2014 at 6:10 PM, Patrick Wendell <pwend...@gmail.com> wrote:
>
>> Ah okay sounds good. This is what I meant earlier by "You have
>> some other application that directly calls log4j."... i.e. you have
>> installed log4j-over-slf4j for historical reasons.
>>
>> Would you mind trying out this fix and seeing if it works? This is
>> designed to be a hotfix for 0.9, not a general solution where we rip
>> out log4j from our published dependencies:
>>
>> https://github.com/apache/incubator-spark/pull/560/files
>>
>> - Patrick
>>
>> On Fri, Feb 7, 2014 at 5:57 PM, Paul Brown <p...@mult.ifario.us> wrote:
>> > Hi, Patrick --
>> >
>> > I forget which other component is responsible, but we're using the
>> > log4j-over-slf4j as part of an overall requirement to centralize logging,
>> > i.e., *someone* else is logging over log4j and we're pulling that in.
>> >  (There's also some jul logging from Jersey, etc.)
>> >
>> > Goals:
>> >
>> > - Fully control/capture all possible logging.  (God forbid we have to
>> > grab System.out/err, but we'd do it if needed.)
>> > - Use the backend we like best at the moment.  (Happens to be logback.)
>> >
>> > Possible cases:
>> >
>> > - If Spark used Log4j at all, we would pull in that logging via
>> > log4j-over-slf4j.
>> > - If Spark used only slf4j and referenced no backend, we would use it
>> > as-is, although we'd still have the log4j-over-slf4j because of other
>> > libraries.
>> > - If Spark used only slf4j and referenced the slf4j-log4j12 backend, we
>> > would exclude that one dependency (via our POM).
>> >
>> > Best.
>> > -- Paul
>> >
>> >
>> > --
>> > p...@mult.ifario.us | Multifarious, Inc. | http://mult.ifario.us/
>> >
>> >
>> > On Fri, Feb 7, 2014 at 5:38 PM, Patrick Wendell <pwend...@gmail.com>
>> > wrote:
>> >
>> >> Hey Paul,
>> >>
>> >> So if your goal is ultimately to output to logback, then why don't you
>> >> just use slf4j and logback-classic.jar as described here [1]? Why
>> >> involve log4j-over-slf4j at all?
>> >>
>> >> Let's say we refactored the spark build so it didn't advertise
>> >> slf4j-log4j12 as a dependency. Would you still be using
>> >> log4j-over-slf4j... or is this just a "fix" to deal with the fact that
>> >> Spark is somewhat log4j dependent at this point.
>> >>
>> >> [1] http://www.slf4j.org/manual.html
>> >>
>> >> - Patrick
>> >>
>> >> On Fri, Feb 7, 2014 at 5:14 PM, Paul Brown <p...@mult.ifario.us> wrote:
>> >> > Hi, Patrick --
>> >> >
>> >> > That's close but not quite it.
>> >> >
>> >> > The issue that occurs is not the delegation loop mentioned in slf4j
>> >> > documentation.  The stack overflow is entirely within the code in the
>> >> > Spark trait:
>> >> >
>> >> > at org.apache.spark.Logging$class.initializeLogging(Logging.scala:112)
>> >> > at org.apache.spark.Logging$class.initializeIfNecessary(Logging.scala:97)
>> >> > at org.apache.spark.Logging$class.log(Logging.scala:36)
>> >> > at org.apache.spark.SparkEnv$.log(SparkEnv.scala:94)
>> >> >
>> >> >
>> >> > And then that repeats.
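>> >> >
>> >> > A compressed paraphrase of the pattern (my sketch in Scala, not the
>> >> > literal Spark source) shows why it re-enters:
>> >> >
>> >> > import org.apache.log4j.LogManager
>> >> > import org.slf4j.{Logger, LoggerFactory}
>> >> >
>> >> > trait LoggingLoopSketch {
>> >> >   protected def log: Logger = {
>> >> >     initializeLogging() // runs before any logger is handed out
>> >> >     LoggerFactory.getLogger(getClass.getName)
>> >> >   }
>> >> >   private def initializeLogging(): Unit = {
>> >> >     // log4j-over-slf4j's root logger always reports zero appenders,
>> >> >     // so this branch is taken on every call...
>> >> >     if (!LogManager.getRootLogger.getAllAppenders.hasMoreElements) {
>> >> >       // ...and logging here re-enters log, hence the overflow.
>> >> >       log.info("Using Spark's default log4j profile")
>> >> >     }
>> >> >   }
>> >> > }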
>> >> >
>> >> > As for our situation, we exclude the slf4j-log4j12 dependency when we
>> >> > import the Spark library (because we don't want to use log4j) and have
>> >> > log4j-over-slf4j already in place to ensure that all of the logging in
>> >> > the overall application runs through slf4j and then out through
>> >> > logback.  (We also, as another poster mentioned, force jcl and jul
>> >> > through slf4j.)
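>> >> >
>> >> > In sbt terms the shape is roughly this (versions illustrative, not
>> >> > our exact build):
>> >> >
>> >> > libraryDependencies ++= Seq(
>> >> >   ("org.apache.spark" %% "spark-core" % "0.9.0")
>> >> >     .exclude("org.slf4j", "slf4j-log4j12"),       // drop Spark's backend
>> >> >   "org.slf4j" % "log4j-over-slf4j" % "1.7.5",     // log4j calls -> slf4j
>> >> >   "org.slf4j" % "jcl-over-slf4j" % "1.7.5",       // commons-logging -> slf4j
>> >> >   "org.slf4j" % "jul-to-slf4j" % "1.7.5",         // java.util.logging -> slf4j
>> >> >   "ch.qos.logback" % "logback-classic" % "1.0.13" // the one real backend
>> >> > )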
>> >> >
>> >> > The zen of slf4j for libraries is that the library uses the slf4j API
>> >> > and then the enclosing application can route logging as it sees fit.
>> >> > Spark master CLI would log via slf4j and include the slf4j-log4j12
>> >> > backend; same for Spark worker CLI.  Spark as a library (versus as a
>> >> > container) would not include any backend to the slf4j API and leave
>> >> > this up to the application.  (FWIW, this would also avoid your log4j
>> >> > warning message.)
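>> >> >
>> >> > Schematically, the library side is just this (illustrative class
>> >> > name, not anything in Spark):
>> >> >
>> >> > import org.slf4j.LoggerFactory
>> >> >
>> >> > class MyLibraryThing {
>> >> >   // depend only on the slf4j API; the application picks the backend
>> >> >   private val logger = LoggerFactory.getLogger(classOf[MyLibraryThing])
>> >> >   def doWork(): Unit = logger.info("doing work")
>> >> > }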
>> >> >
>> >> > But as I was saying before, I'd be happy with a situation where I can
>> >> > avoid log4j being enabled or configured, and I think you'll find an
>> >> > existing choice of logging framework to be a common scenario for those
>> >> > embedding Spark in other systems.
>> >> >
>> >> > Best.
>> >> > -- Paul
>> >> >
>> >> > --
>> >> > p...@mult.ifario.us | Multifarious, Inc. | http://mult.ifario.us/
>> >> >
>> >> >
>> >> > On Fri, Feb 7, 2014 at 3:01 PM, Patrick Wendell <pwend...@gmail.com>
>> >> > wrote:
>> >> >
>> >> >> Paul,
>> >> >>
>> >> >> Looking back at your problem, I think it's the one here:
>> >> >> http://www.slf4j.org/codes.html#log4jDelegationLoop
>> >> >>
>> >> >> So let me just be clear what you are doing so I understand. You have
>> >> >> some other application that directly calls log4j. So you have to
>> >> >> include log4j-over-slf4j to route those logs through slf4j to
>> >> >> logback.
>> >> >>
>> >> >> At the same time you embed Spark in this application. In the past it
>> >> >> was fine, but now that Spark programmatically initializes log4j, it
>> >> >> screws up your application because log4j-over-slf4j doesn't work with
>> >> >> applications that do this explicitly, as discussed here:
>> >> >> http://www.slf4j.org/legacy.html
>> >> >>
>> >> >> Correct?
>> >> >>
>> >> >> - Patrick
>> >> >>
>> >> >> On Fri, Feb 7, 2014 at 2:02 PM, Koert Kuipers <ko...@tresata.com>
>> >> >> wrote:
>> >> >> > got it. that sounds reasonable
>> >> >> >
>> >> >> >
>> >> >> > On Fri, Feb 7, 2014 at 2:31 PM, Patrick Wendell <pwend...@gmail.com>
>> >> >> > wrote:
>> >> >> >
>> >> >> >> Koert - my suggestion was this. We let users use any slf4j backend
>> >> >> >> they want. If we detect that they are using the log4j backend and
>> >> >> >> *also* they didn't configure any log4j appenders, we set up some
>> >> >> >> nice defaults for them. If they are using another backend, Spark
>> >> >> >> doesn't try to modify the configuration at all.
>> >> >> >>
>> >> >> >> On Fri, Feb 7, 2014 at 11:14 AM, Koert Kuipers <ko...@tresata.com>
>> >> >> >> wrote:
>> >> >> >> > well "static binding" is probably the wrong terminology but you
>> get
>> >> >> the
>> >> >> >> > idea. multiple backends are not allowed and cause an even uglier
>> >> >> >> warning...
>> >> >> >> >
>> >> >> >> > see also here:
>> >> >> >> > https://github.com/twitter/scalding/pull/636
>> >> >> >> > and here:
>> >> >> >> > https://groups.google.com/forum/#!topic/cascading-user/vYvnnN_15ls
>> >> >> >> > all me being annoying and complaining about slf4j-log4j12
>> >> >> >> > dependencies (which did get removed).
>> >> >> >> >
>> >> >> >> >
>> >> >> >> > On Fri, Feb 7, 2014 at 2:09 PM, Koert Kuipers <ko...@tresata.com>
>> >> >> >> > wrote:
>> >> >> >> >
>> >> >> >> >> the issue is that slf4j uses static binding. you can put only
>> >> >> >> >> one slf4j backend on the classpath, and that's what it uses.
>> >> >> >> >> more than one is not allowed.
>> >> >> >> >>
>> >> >> >> >> so you either keep the slf4j-log4j12 dependency for spark, and
>> >> >> >> >> then you took away people's choice of slf4j backend, which is
>> >> >> >> >> considered bad form for a library, or you do not include it and
>> >> >> >> >> then people will always get the big fat ugly warning and slf4j
>> >> >> >> >> logging will not flow to log4j.
>> >> >> >> >>
>> >> >> >> >> including log4j itself is not necessarily a problem i think?
>> >> >> >> >>
>> >> >> >> >>
>> >> >> >> >> On Fri, Feb 7, 2014 at 1:11 PM, Patrick Wendell <pwend...@gmail.com>
>> >> >> >> >> wrote:
>> >> >> >> >>
>> >> >> >> >>> This also seems relevant - but not my area of expertise
>> >> >> >> >>> (whether this is a valid way to check this).
>> >> >> >> >>>
>> >> >> >> >>> http://stackoverflow.com/questions/10505418/how-to-find-which-library-slf4j-has-bound-itself-to
>> >> >> >> >>>
>> >> >> >> >>> On Fri, Feb 7, 2014 at 10:08 AM, Patrick Wendell <pwend...@gmail.com>
>> >> >> >> >>> wrote:
>> >> >> >> >>> > Hey Guys,
>> >> >> >> >>> >
>> >> >> >> >>> > Thanks for explaining. Ya this is a problem - we didn't really
>> >> >> >> >>> > know that people are using other slf4j backends. slf4j is in
>> >> >> >> >>> > there for historical reasons, but I think we may assume in a
>> >> >> >> >>> > few places that log4j is being used, and we should minimize
>> >> >> >> >>> > those.
>> >> >> >> >>> >
>> >> >> >> >>> > We should patch this and get a fix into 0.9.1. So some
>> >> >> >> >>> > solutions I see are:
>> >> >> >> >>> >
>> >> >> >> >>> > (a) Add SparkConf option to disable this. I'm fine with this
>> >> >> >> >>> > one.
>> >> >> >> >>> >
>> >> >> >> >>> > (b) Ask slf4j which backend is active and only try to enforce
>> >> >> >> >>> > this default if we know slf4j is using log4j. Do either of you
>> >> >> >> >>> > know if this is possible? Not sure if slf4j exposes this.
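>> >> >> >> >>> >
>> >> >> >> >>> > Maybe something like this would work, if slf4j's static binder
>> >> >> >> >>> > can be interrogated (untested sketch, Scala):
>> >> >> >> >>> >
>> >> >> >> >>> > import org.slf4j.impl.StaticLoggerBinder
>> >> >> >> >>> >
>> >> >> >> >>> > // slf4j-log4j12's factory class is
>> >> >> >> >>> > // "org.slf4j.impl.Log4jLoggerFactory"; other backends report
>> >> >> >> >>> > // different factory class names here.
>> >> >> >> >>> > val usingLog4j = StaticLoggerBinder.getSingleton
>> >> >> >> >>> >   .getLoggerFactoryClassName.contains("Log4jLoggerFactory")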
>> >> >> >> >>> >
>> >> >> >> >>> > (c) Just remove this default stuff. We'd rather not do this.
>> >> >> >> >>> > The goal of this thing is to provide good usability for people
>> >> >> >> >>> > who have linked against Spark and haven't done anything to
>> >> >> >> >>> > configure logging. For beginners we try to minimize the
>> >> >> >> >>> > assumptions about what else they know about, and I've found
>> >> >> >> >>> > log4j configuration is a huge mental barrier for people who
>> >> >> >> >>> > are getting started.
>> >> >> >> >>> >
>> >> >> >> >>> > Paul if you submit a patch doing (a) we can merge it in. If
>> >> >> >> >>> > you have any idea if (b) is possible I prefer that one, but it
>> >> >> >> >>> > may not be possible or might be brittle.
>> >> >> >> >>> >
>> >> >> >> >>> > - Patrick
>> >> >> >> >>> >
>> >> >> >> >>> > On Fri, Feb 7, 2014 at 6:36 AM, Koert Kuipers <ko...@tresata.com>
>> >> >> >> >>> > wrote:
>> >> >> >> >>> >> Totally agree with Paul: a library should not pick the slf4j
>> >> >> >> >>> >> backend. It defeats the purpose of slf4j. That big ugly
>> >> >> >> >>> >> warning is there to alert people that it's their
>> >> >> >> >>> >> responsibility to pick the backend...
>> >> >> >> >>> >> On Feb 7, 2014 3:55 AM, "Paul Brown" <p...@mult.ifario.us>
>> >> >> >> >>> >> wrote:
>> >> >> >> >>> >>
>> >> >> >> >>> >>> Hi, Patrick --
>> >> >> >> >>> >>>
>> >> >> >> >>> >>> From slf4j, you can either backend it into log4j (which is
>> >> >> >> >>> >>> the way that Spark is shipped) or you can route log4j
>> >> >> >> >>> >>> through slf4j and then on to a different backend (e.g.,
>> >> >> >> >>> >>> logback).  We're doing the latter and manipulating the
>> >> >> >> >>> >>> dependencies in the build because that's the way the
>> >> >> >> >>> >>> enclosing application is set up.
>> >> >> >> >>> >>>
>> >> >> >> >>> >>> The issue with the current situation is that there's no way
>> >> >> >> >>> >>> for an end user to choose to *not* use the log4j backend.
>> >> >> >> >>> >>> (My short-term solution was to use the Maven shade plugin to
>> >> >> >> >>> >>> swap in a version of the Logging trait with the body of that
>> >> >> >> >>> >>> method commented out.)  In addition to the situation with
>> >> >> >> >>> >>> log4j-over-slf4j and the empty enumeration of ROOT
>> >> >> >> >>> >>> appenders, you might also run afoul of someone who
>> >> >> >> >>> >>> intentionally configured log4j with an empty set of
>> >> >> >> >>> >>> appenders at the time that Spark is initializing.
>> >> >> >> >>> >>>
>> >> >> >> >>> >>> I'd be happy with any implementation that lets me choose my
>> >> >> >> >>> >>> logging backend: override default behavior via system
>> >> >> >> >>> >>> property, plug-in architecture, etc.  I do think it's
>> >> >> >> >>> >>> reasonable to expect someone digesting a substantial
>> >> >> >> >>> >>> JDK-based system like Spark to understand how to initialize
>> >> >> >> >>> >>> logging -- surely they're using logging of some kind
>> >> >> >> >>> >>> elsewhere in their application -- but if you want the
>> >> >> >> >>> >>> default behavior there as a courtesy, it might be worth
>> >> >> >> >>> >>> putting an INFO (versus the glaring log4j WARN) message on
>> >> >> >> >>> >>> the output that says something like "Initialized default
>> >> >> >> >>> >>> logging via Log4J; pass
>> >> >> >> >>> >>> -Dspark.logging.loadDefaultLogger=false to disable this
>> >> >> >> >>> >>> behavior." so that it's both convenient and explicit.
>> >> >> >> >>> >>>
>> >> >> >> >>> >>> Cheers.
>> >> >> >> >>> >>> -- Paul
>> >> >> >> >>> >>>
>> >> >> >> >>> >>> --
>> >> >> >> >>> >>> p...@mult.ifario.us | Multifarious, Inc. | http://mult.ifario.us/
>> >> >> >> >>> >>>
>> >> >> >> >>> >>>
>> >> >> >> >>> >>> On Fri, Feb 7, 2014 at 12:05 AM, Patrick Wendell <pwend...@gmail.com>
>> >> >> >> >>> >>> wrote:
>> >> >> >> >>> >>>
>> >> >> >> >>> >>> > A config option e.g. could just be to add:
>> >> >> >> >>> >>> >
>> >> >> >> >>> >>> > spark.logging.loadDefaultLogger (default true)
>> >> >> >> >>> >>> > If set to true, Spark will try to initialize a log4j
>> >> >> >> >>> >>> > logger if none is detected. Otherwise Spark will not
>> >> >> >> >>> >>> > modify logging behavior.
>> >> >> >> >>> >>> >
>> >> >> >> >>> >>> > Then users could just set this to false if they have a
>> >> >> >> >>> >>> > logging set-up that conflicts with this.
>> >> >> >> >>> >>> >
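>> >> >> >> >>> >>> > Roughly this shape, say (property name and defaults file
>> >> >> >> >>> >>> > purely illustrative, not a committed API):
>> >> >> >> >>> >>> >
>> >> >> >> >>> >>> > import org.apache.log4j.{LogManager, PropertyConfigurator}
>> >> >> >> >>> >>> >
>> >> >> >> >>> >>> > val loadDefaults = System.getProperty(
>> >> >> >> >>> >>> >   "spark.logging.loadDefaultLogger", "true").toBoolean
>> >> >> >> >>> >>> > val noAppenders =
>> >> >> >> >>> >>> >   !LogManager.getRootLogger.getAllAppenders.hasMoreElements
>> >> >> >> >>> >>> > // only touch log4j when defaults were requested and the
>> >> >> >> >>> >>> > // user hasn't configured anything themselves
>> >> >> >> >>> >>> > if (loadDefaults && noAppenders) {
>> >> >> >> >>> >>> >   PropertyConfigurator.configure(
>> >> >> >> >>> >>> >     getClass.getResource("/log4j-defaults.properties"))
>> >> >> >> >>> >>> > }
>> >> >> >> >>> >>> >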
>> >> >> >> >>> >>> > Maybe there is a nicer fix...
>> >> >> >> >>> >>> >
>> >> >> >> >>> >>> > On Fri, Feb 7, 2014 at 12:03 AM, Patrick Wendell <pwend...@gmail.com>
>> >> >> >> >>> >>> > wrote:
>> >> >> >> >>> >>> > > Hey Paul,
>> >> >> >> >>> >>> > >
>> >> >> >> >>> >>> > > Thanks for digging this up. I worked on this feature and
>> >> >> >> >>> >>> > > the intent was to give users good default behavior if
>> >> >> >> >>> >>> > > they didn't include any logging configuration on the
>> >> >> >> >>> >>> > > classpath.
>> >> >> >> >>> >>> > >
>> >> >> >> >>> >>> > > The problem with assuming that CL tooling is going to
>> >> >> >> >>> >>> > > fix the job is that many people link against Spark as a
>> >> >> >> >>> >>> > > library and run their application using their own
>> >> >> >> >>> >>> > > scripts. In this case the first thing people see when
>> >> >> >> >>> >>> > > they run an application that links against Spark is a
>> >> >> >> >>> >>> > > big ugly logging warning.
>> >> >> >> >>> >>> > >
>> >> >> >> >>> >>> > > I'm not super familiar with log4j-over-slf4j, but this
>> >> >> >> >>> >>> > > behavior of returning null for the appenders seems a
>> >> >> >> >>> >>> > > little weird. What is the use case for using this rather
>> >> >> >> >>> >>> > > than using slf4j-log4j12 directly, like Spark itself
>> >> >> >> >>> >>> > > does?
>> >> >> >> >>> >>> > >
>> >> >> >> >>> >>> > > Did you have a more general fix for this in mind? Or was
>> >> >> >> >>> >>> > > your plan to just revert the existing behavior... We
>> >> >> >> >>> >>> > > might be able to add a configuration option to disable
>> >> >> >> >>> >>> > > this logging default stuff. Or we could just rip it out
>> >> >> >> >>> >>> > > - but I'd like to avoid that if possible.
>> >> >> >> >>> >>> > >
>> >> >> >> >>> >>> > > - Patrick
>> >> >> >> >>> >>> > >
>> >> >> >> >>> >>> > > On Thu, Feb 6, 2014 at 11:41 PM, Paul Brown <p...@mult.ifario.us>
>> >> >> >> >>> >>> > > wrote:
>> >> >> >> >>> >>> > >> We have a few applications that embed Spark, and in
>> >> >> >> >>> >>> > >> 0.8.0 and 0.8.1, we were able to use slf4j, but 0.9.0
>> >> >> >> >>> >>> > >> broke that and unintentionally forces direct use of
>> >> >> >> >>> >>> > >> log4j as the logging backend.
>> >> >> >> >>> >>> > >>
>> >> >> >> >>> >>> > >> The issue is here in the org.apache.spark.Logging
>> >> >> >> >>> >>> > >> trait:
>> >> >> >> >>> >>> > >>
>> >> >> >> >>> >>> > >> https://github.com/apache/incubator-spark/blame/master/core/src/main/scala/org/apache/spark/Logging.scala#L107
>> >> >> >> >>> >>> > >>
>> >> >> >> >>> >>> > >> log4j-over-slf4j *always* returns an empty enumeration
>> >> >> >> >>> >>> > >> for appenders to the ROOT logger:
>> >> >> >> >>> >>> > >>
>> >> >> >> >>> >>> > >> https://github.com/qos-ch/slf4j/blob/master/log4j-over-slf4j/src/main/java/org/apache/log4j/Category.java?source=c#L81
>> >> >> >> >>> >>> > >>
>> >> >> >> >>> >>> > >> And this causes an infinite loop and an eventual stack
>> >> >> >> >>> >>> > >> overflow.
>> >> >> >> >>> >>> > >>
>> >> >> >> >>> >>> > >> I'm happy to submit a Jira and a patch, but it would be
>> >> >> >> >>> >>> > >> a significant enough reversal of recent changes that
>> >> >> >> >>> >>> > >> it's probably worth discussing before I sink a half
>> >> >> >> >>> >>> > >> hour into it.  My suggestion would be that
>> >> >> >> >>> >>> > >> initialization (or not) should be left to the user,
>> >> >> >> >>> >>> > >> with reasonable default behavior supplied by the Spark
>> >> >> >> >>> >>> > >> command-line tooling and not forced on applications
>> >> >> >> >>> >>> > >> that incorporate Spark.
>> >> >> >> >>> >>> > >>
>> >> >> >> >>> >>> > >> Thoughts/opinions?
>> >> >> >> >>> >>> > >>
>> >> >> >> >>> >>> > >> -- Paul
>> >> >> >> >>> >>> > >> --
>> >> >> >> >>> >>> > >> p...@mult.ifario.us | Multifarious, Inc. | http://mult.ifario.us/
