This is an automated email from the ASF dual-hosted git repository.

dongjoon pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git
The following commit(s) were added to refs/heads/master by this push:
     new ca8c269a1503 [SPARK-48124][CORE] Disable structured logging for Interpreters by default
ca8c269a1503 is described below

commit ca8c269a15037ce716449b5bba581e46aa8d7fea
Author: Gengliang Wang <gengli...@apache.org>
AuthorDate: Sat May 4 11:48:08 2024 -0700

    [SPARK-48124][CORE] Disable structured logging for Interpreters by default

    ### What changes were proposed in this pull request?

    For interpreters, structured logging should be disabled by default to avoid generating mixed plain text and structured logs on the same console.

    spark-shell output with mixed plain text and structured logs:
    ```
    Using Scala version 2.13.13 (OpenJDK 64-Bit Server VM, Java 17.0.9)
    Type in expressions to have them evaluated.
    Type :help for more information.
    {"ts":"2024-05-04T01:11:03.797Z","level":"WARN","msg":"Unable to load native-hadoop library for your platform... using builtin-java classes where applicable","logger":"NativeCodeLoader"}
    {"ts":"2024-05-04T01:11:04.104Z","level":"WARN","msg":"Service 'SparkUI' could not bind on port 4040. Attempting port 4041.","logger":"Utils"}
    Spark context Web UI available at http://10.10.114.155:4041/
    Spark context available as 'sc' (master = local[*], app id = local-1714785064155).
    Spark session available as 'spark'.
    ```

    After the changes, all the output is plain text:
    ```
    Type :help for more information.
    24/05/03 18:11:35 WARN NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
    24/05/03 18:11:35 WARN Utils: Service 'SparkUI' could not bind on port 4040. Attempting port 4041.
    Spark context Web UI available at http://10.10.114.155:4041/
    Spark context available as 'sc' (master = local[*], app id = local-1714785095892).
    Spark session available as 'spark'.
    ```

    Note that submitting a Spark application using `spark-submit` will still generate structured logs.

    ### Why are the changes needed?
    To avoid generating mixed plain text and structured logs on the same console when using the interpreters.

    ### Does this PR introduce _any_ user-facing change?

    No, this reverts to the behavior of Spark 3.5.

    ### How was this patch tested?

    Manual test.

    ### Was this patch authored or co-authored using generative AI tooling?

    No

    Closes #46383 from gengliangwang/disableStructuredLogInRepl.

    Authored-by: Gengliang Wang <gengli...@apache.org>
    Signed-off-by: Dongjoon Hyun <dh...@apple.com>
---
 core/src/main/scala/org/apache/spark/deploy/SparkSubmit.scala | 7 ++++++-
 docs/configuration.md                                         | 2 +-
 docs/core-migration-guide.md                                  | 2 +-
 3 files changed, 8 insertions(+), 3 deletions(-)

diff --git a/core/src/main/scala/org/apache/spark/deploy/SparkSubmit.scala b/core/src/main/scala/org/apache/spark/deploy/SparkSubmit.scala
index 9d93d91b7d2e..076aa8387dc5 100644
--- a/core/src/main/scala/org/apache/spark/deploy/SparkSubmit.scala
+++ b/core/src/main/scala/org/apache/spark/deploy/SparkSubmit.scala
@@ -68,11 +68,16 @@ private[spark] class SparkSubmit extends Logging {
   import SparkSubmit._
 
   def doSubmit(args: Array[String]): Unit = {
+    val appArgs = parseArguments(args)
+    // For interpreters, structured logging is disabled by default to avoid generating mixed
+    // plain text and structured logs on the same console.
+    if (isShell(appArgs.primaryResource) || isSqlShell(appArgs.mainClass)) {
+      Logging.disableStructuredLogging()
+    }
     // Initialize logging if it hasn't been done yet. Keep track of whether logging needs to
     // be reset before the application starts.
     val uninitLog = initializeLogIfNecessary(true, silent = true)
 
-    val appArgs = parseArguments(args)
     if (appArgs.verbose) {
       logInfo(appArgs.toString)
     }
diff --git a/docs/configuration.md b/docs/configuration.md
index 7966aceccdea..d07decf02505 100644
--- a/docs/configuration.md
+++ b/docs/configuration.md
@@ -3673,7 +3673,7 @@ Spark uses [log4j](http://logging.apache.org/log4j/) for logging.
 You can config `log4j2.properties` file in the `conf` directory. One way to start is to copy the existing templates `log4j2.properties.template` or `log4j2.properties.pattern-layout-template` located there.
 
 ## Structured Logging
-Starting from version 4.0.0, Spark has adopted the [JSON Template Layout](https://logging.apache.org/log4j/2.x/manual/json-template-layout.html) for logging, which outputs logs in JSON format. This format facilitates querying logs using Spark SQL with the JSON data source. Additionally, the logs include all Mapped Diagnostic Context (MDC) information for search and debugging purposes.
+Starting from version 4.0.0, `spark-submit` has adopted the [JSON Template Layout](https://logging.apache.org/log4j/2.x/manual/json-template-layout.html) for logging, which outputs logs in JSON format. This format facilitates querying logs using Spark SQL with the JSON data source. Additionally, the logs include all Mapped Diagnostic Context (MDC) information for search and debugging purposes.
 
 To configure the layout of structured logging, start with the `log4j2.properties.template` file.
diff --git a/docs/core-migration-guide.md b/docs/core-migration-guide.md
index 5f3560883e59..597900630b3f 100644
--- a/docs/core-migration-guide.md
+++ b/docs/core-migration-guide.md
@@ -42,7 +42,7 @@ license: |
 
 - Since Spark 4.0, Spark uses the external shuffle service for deleting shuffle blocks for deallocated executors when the shuffle is no longer needed. To restore the legacy behavior, you can set `spark.shuffle.service.removeShuffle` to `false`.
 
-- Since Spark 4.0, the default log4j output has shifted from plain text to JSON lines to enhance analyzability. To revert to plain text output, you can either set `spark.log.structuredLogging.enabled` to `false`, or use a custom log4j configuration.
+- Since Spark 4.0, the default log4j output of `spark-submit` has shifted from plain text to JSON lines to enhance analyzability. To revert to plain text output, you can rename the file `conf/log4j2.properties.pattern-layout-template` as `conf/log4j2.properties`, or use a custom log4j configuration file.
 
 - Since Spark 4.0, Spark performs speculative executions less agressively with `spark.speculation.multiplier=3` and `spark.speculation.quantile=0.9`. To restore the legacy behavior, you can set `spark.speculation.multiplier=1.5` and `spark.speculation.quantile=0.75`.

---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org
For additional commands, e-mail: commits-h...@spark.apache.org
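As background for the mixed-output problem this commit addresses: each structured log line in the spark-shell sample is a self-contained JSON object with `ts`, `level`, `msg`, and `logger` fields (field names taken directly from the sample output above), so any JSON-lines reader can consume it. A minimal sketch, assuming that field layout:

```python
import json

# One structured log line, copied verbatim from the spark-shell sample above.
line = ('{"ts":"2024-05-04T01:11:04.104Z","level":"WARN",'
        "\"msg\":\"Service 'SparkUI' could not bind on port 4040. Attempting port 4041.\","
        '"logger":"Utils"}')

# Each line parses independently, which is what makes JSON-lines logs
# queryable with Spark SQL's JSON data source.
record = json.loads(line)
print(record["level"], record["logger"])  # WARN Utils
```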
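The migration-guide hunk above describes the revert as a file rename in prose; concretely, assuming a default Spark distribution layout (paths relative to `SPARK_HOME`), that is:

```shell
# Activate the pattern-layout template so log4j2 emits plain text again.
cp conf/log4j2.properties.pattern-layout-template conf/log4j2.properties
```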