GitHub user vanzin opened a pull request:
https://github.com/apache/spark/pull/19013
[SPARK-21728][core] Allow SparkSubmit to use Logging.
This change initializes logging when SparkSubmit runs, using
a configuration that avoids printing log messages as far as
possible under most setups, and adds code to restore the Spark
logging system as closely as possible to its initial state, so
that the Spark application being run can re-initialize logging
with its own configuration.
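For context, the save/restore idea looks roughly like the sketch
below, written against the log4j 1.x API that Spark used at the
time. This is an illustrative outline in Scala, not the actual
code in this PR; the object and method names are made up.

    import org.apache.log4j.{Appender, Level, Logger}
    import scala.collection.JavaConverters._

    // Hypothetical helper: snapshot the root logger before SparkSubmit
    // configures logging, and restore it before handing control to the
    // user's application so it can set up logging itself.
    object LoggingState {
      private var savedLevel: Level = _
      private var savedAppenders: Seq[Appender] = Nil

      def save(): Unit = {
        val root = Logger.getRootLogger
        savedLevel = root.getLevel
        savedAppenders =
          root.getAllAppenders.asScala.map(_.asInstanceOf[Appender]).toSeq
      }

      def restore(): Unit = {
        val root = Logger.getRootLogger
        root.removeAllAppenders()
        savedAppenders.foreach(root.addAppender)
        root.setLevel(savedLevel)
      }
    }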
With that feature, some duplicate code in SparkSubmit can now
be replaced with the existing methods in the Utils class, which
could not be used before because they initialized logging. As part
of that I also did some minor refactoring, moving some methods
into DependencyUtils, where they really belong.
The change also moves some code around in SparkHadoopUtil so that
SparkSubmit can create a Hadoop configuration the same way the
rest of the Spark code does, respecting the user's Spark
configuration.
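The intended usage pattern is roughly the sketch below.
SparkHadoopUtil.get.newConfiguration is an existing Spark API
that copies "spark.hadoop.*" entries into the Hadoop conf with
the prefix stripped; the specific property is only an example.

    import org.apache.hadoop.conf.Configuration
    import org.apache.spark.SparkConf
    import org.apache.spark.deploy.SparkHadoopUtil

    // Build a Hadoop Configuration that honors the user's Spark settings.
    val sparkConf = new SparkConf()
      .set("spark.hadoop.fs.s3a.connection.maximum", "100")
    val hadoopConf: Configuration =
      SparkHadoopUtil.get.newConfiguration(sparkConf)
    // The spark.hadoop. prefix is stripped when copying the setting over.
    assert(hadoopConf.get("fs.s3a.connection.maximum") == "100")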
The change was verified by running spark-shell, pyspark and
normal applications, then checking the logging behavior, with
and without dependency downloads.
You can merge this pull request into a Git repository by running:
$ git pull https://github.com/vanzin/spark SPARK-21728
Alternatively you can review and apply these changes as the patch at:
https://github.com/apache/spark/pull/19013.patch
To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:
This closes #19013
----
commit 400f0ad620e126fcba5ffdfe9979643980687ce4
Author: Marcelo Vanzin <[email protected]>
Date: 2017-08-15T23:44:54Z
[SPARK-21728][core] Allow SparkSubmit to use Logging.
----