GitHub user srowen opened a pull request:

    https://github.com/apache/spark/pull/1547

    SPARK-2646. log4j initialization not quite compatible with log4j 2.x

    The logging code that handles log4j initialization leads to a stack 
overflow error when used with log4j 2.x, which has just been released. This 
occurs even when a downstream project has correctly adjusted its SLF4J 
bindings, which is the right thing to do for log4j 2.x, since it is 
effectively a separate project from 1.x.
    
    Here is the relevant bit of Logging.scala:
    
    ```
      private def initializeLogging() {
        // If Log4j is being used, but is not initialized, load a default 
properties file
        val binder = StaticLoggerBinder.getSingleton
        val usingLog4j = 
binder.getLoggerFactoryClassStr.endsWith("Log4jLoggerFactory")
        val log4jInitialized = 
LogManager.getRootLogger.getAllAppenders.hasMoreElements
        if (!log4jInitialized && usingLog4j) {
          val defaultLogProps = "org/apache/spark/log4j-defaults.properties"
          Option(Utils.getSparkClassLoader.getResource(defaultLogProps)) match {
            case Some(url) =>
              PropertyConfigurator.configure(url)
              log.info(s"Using Spark's default log4j profile: $defaultLogProps")
            case None =>
              System.err.println(s"Spark was unable to load $defaultLogProps")
          }
        }
        Logging.initialized = true
    
    // Force a call into slf4j to initialize it. Avoids this happening from 
multiple threads
        // and triggering this: 
http://mailman.qos.ch/pipermail/slf4j-dev/2010-April/002956.html
        log
      }
    ```
    
    The first minor issue is that there is a call to a logger inside this 
method, which is itself initializing logging. That call ends up invoking the 
initialization recursively until the stack overflows. It would be slightly 
tidier to log this only after Logging.initialized = true, or not at all. But 
this is not the root problem; otherwise it would not work at all today.
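    To make the re-entrancy concrete, here is a self-contained sketch (all 
names here are mine, not Spark's) of why the ordering of the flag assignment 
matters: if the guard flag were set only after the logger is touched, the 
first logger call would re-enter initialization and recurse.

    ```scala
    object InitSketch {
      private var initialized = false
      private var configured = 0

      // Stand-in for a logger access: lazily initializes on first use.
      def log(): Unit = {
        if (!initialized) initializeLogging()
      }

      private def initializeLogging(): Unit = {
        configured += 1
        // Set the flag *before* touching the logger, so the call below
        // sees initialized = true and returns instead of recursing.
        initialized = true
        log()
      }

      def configureCount: Int = { log(); configured }
    }
    ```

    With the flag set first, configuration runs exactly once; with the two 
lines swapped, log() would re-enter initializeLogging() without bound.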
    
    The calls to log4j classes here always reference the log4j 1.2 API, no 
matter which version is present. For example, there is no getAllAppenders in 
log4j 2.x. That's fine; really, "usingLog4j" means "using log4j 1.2" and 
"log4jInitialized" means "log4j 1.2 is initialized".
    
    usingLog4j should be false for log4j 2.x, because the initialization only 
matters for log4j 1.2. But, it's true, and that's the real issue. And 
log4jInitialized is always false, since calls to the log4j 1.2 API are stubs 
and no-ops in this setup, where the caller has swapped in log4j 2.x. Hence the 
loop.
    
    This is fixed, I believe, if "usingLog4j" can be false for log4j 2.x. 
Unfortunately, the SLF4J static binding class has the same name for both 
versions, which is what causes the issue. However, they live in different 
packages. For example, if the test also required "... and begins with 
org.slf4j", it should work, since the SLF4J binding for log4j 2.x is 
currently provided by log4j 2.x itself, in package org.apache.logging.slf4j.
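    A sketch of the adjusted check (the helper name is mine; the two binding 
class strings are SLF4J's own log4j 1.2 binding and log4j 2.x's binding as 
described above):

    ```scala
    object BinderCheck {
      // log4j 1.2's SLF4J binding is org.slf4j.impl.Log4jLoggerFactory,
      // while log4j 2.x ships its own binding in org.apache.logging.slf4j.
      // Both class names end in "Log4jLoggerFactory", so additionally
      // requiring the org.slf4j package excludes the log4j 2.x binding.
      def isLog4j12Binding(binderClass: String): Boolean =
        binderClass.startsWith("org.slf4j") &&
          binderClass.endsWith("Log4jLoggerFactory")
    }
    ```

    In Logging.scala the string would come from 
StaticLoggerBinder.getSingleton.getLoggerFactoryClassStr, as in the quoted 
code above.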
    
    Of course, I assume that SLF4J will eventually offer its own binding for 
log4j 2.x. I hope they at least name the binding class differently, or else 
this check will break again. But then some other check can probably be made.

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/srowen/spark SPARK-2646

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/spark/pull/1547.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #1547
    
----
commit 6f3c1d3269f0e3083aea538fb21cb50fbefe4588
Author: Sean Owen <[email protected]>
Date:   2014-07-23T13:59:10Z

    Remove log statement in logging initialization, and distinguish log4j 1.2 
from 2.0, to avoid stack overflow in initialization

----

