This is an automated email from the ASF dual-hosted git repository.

dongjoon pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git
The following commit(s) were added to refs/heads/master by this push:
     new ca8c269a1503 [SPARK-48124][CORE] Disable structured logging for Interpreters by default
ca8c269a1503 is described below

commit ca8c269a15037ce716449b5bba581e46aa8d7fea
Author: Gengliang Wang <gengli...@apache.org>
AuthorDate: Sat May 4 11:48:08 2024 -0700

    [SPARK-48124][CORE] Disable structured logging for Interpreters by default

    ### What changes were proposed in this pull request?

    For interpreters, structured logging should be disabled by default to avoid generating mixed plain text and structured logs on the same console.

    spark-shell output with mixed plain text and structured logs:
    ```
    Using Scala version 2.13.13 (OpenJDK 64-Bit Server VM, Java 17.0.9)
    Type in expressions to have them evaluated.
    Type :help for more information.
    {"ts":"2024-05-04T01:11:03.797Z","level":"WARN","msg":"Unable to load native-hadoop library for your platform... using builtin-java classes where applicable","logger":"NativeCodeLoader"}
    {"ts":"2024-05-04T01:11:04.104Z","level":"WARN","msg":"Service 'SparkUI' could not bind on port 4040. Attempting port 4041.","logger":"Utils"}
    Spark context Web UI available at http://10.10.114.155:4041/
    Spark context available as 'sc' (master = local[*], app id = local-1714785064155).
    Spark session available as 'spark'.
    ```

    After the changes, all the output is plain text:
    ```
    Type :help for more information.
    24/05/03 18:11:35 WARN NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
    24/05/03 18:11:35 WARN Utils: Service 'SparkUI' could not bind on port 4040. Attempting port 4041.
    Spark context Web UI available at http://10.10.114.155:4041/
    Spark context available as 'sc' (master = local[*], app id = local-1714785095892).
    Spark session available as 'spark'.
    ```

    Note that submitting a Spark application using `spark-submit` will still generate structured logs.

    ### Why are the changes needed?
    To avoid generating mixed plain text and structured logs on the same console when using the interpreters.

    ### Does this PR introduce _any_ user-facing change?

    No, this reverts to the behavior of Spark 3.5.

    ### How was this patch tested?

    Manual test.

    ### Was this patch authored or co-authored using generative AI tooling?

    No

    Closes #46383 from gengliangwang/disableStructuredLogInRepl.

    Authored-by: Gengliang Wang <gengli...@apache.org>
    Signed-off-by: Dongjoon Hyun <dh...@apple.com>
---
 core/src/main/scala/org/apache/spark/deploy/SparkSubmit.scala | 7 ++++++-
 docs/configuration.md                                         | 2 +-
 docs/core-migration-guide.md                                  | 2 +-
 3 files changed, 8 insertions(+), 3 deletions(-)

diff --git a/core/src/main/scala/org/apache/spark/deploy/SparkSubmit.scala b/core/src/main/scala/org/apache/spark/deploy/SparkSubmit.scala
index 9d93d91b7d2e..076aa8387dc5 100644
--- a/core/src/main/scala/org/apache/spark/deploy/SparkSubmit.scala
+++ b/core/src/main/scala/org/apache/spark/deploy/SparkSubmit.scala
@@ -68,11 +68,16 @@ private[spark] class SparkSubmit extends Logging {
   import SparkSubmit._
 
   def doSubmit(args: Array[String]): Unit = {
+    val appArgs = parseArguments(args)
+    // For interpreters, structured logging is disabled by default to avoid generating mixed
+    // plain text and structured logs on the same console.
+    if (isShell(appArgs.primaryResource) || isSqlShell(appArgs.mainClass)) {
+      Logging.disableStructuredLogging()
+    }
     // Initialize logging if it hasn't been done yet. Keep track of whether logging needs to
     // be reset before the application starts.
     val uninitLog = initializeLogIfNecessary(true, silent = true)
 
-    val appArgs = parseArguments(args)
     if (appArgs.verbose) {
       logInfo(appArgs.toString)
     }
diff --git a/docs/configuration.md b/docs/configuration.md
index 7966aceccdea..d07decf02505 100644
--- a/docs/configuration.md
+++ b/docs/configuration.md
@@ -3673,7 +3673,7 @@ Spark uses [log4j](http://logging.apache.org/log4j/) for logging.
 You can config `log4j2.properties` file in the `conf` directory. One way to start is to copy the existing templates `log4j2.properties.template` or `log4j2.properties.pattern-layout-template` located there.
 
 ## Structured Logging
-Starting from version 4.0.0, Spark has adopted the [JSON Template Layout](https://logging.apache.org/log4j/2.x/manual/json-template-layout.html) for logging, which outputs logs in JSON format. This format facilitates querying logs using Spark SQL with the JSON data source. Additionally, the logs include all Mapped Diagnostic Context (MDC) information for search and debugging purposes.
+Starting from version 4.0.0, `spark-submit` has adopted the [JSON Template Layout](https://logging.apache.org/log4j/2.x/manual/json-template-layout.html) for logging, which outputs logs in JSON format. This format facilitates querying logs using Spark SQL with the JSON data source. Additionally, the logs include all Mapped Diagnostic Context (MDC) information for search and debugging purposes.
 
 To configure the layout of structured logging, start with the `log4j2.properties.template` file.
diff --git a/docs/core-migration-guide.md b/docs/core-migration-guide.md
index 5f3560883e59..597900630b3f 100644
--- a/docs/core-migration-guide.md
+++ b/docs/core-migration-guide.md
@@ -42,7 +42,7 @@ license: |
 
 - Since Spark 4.0, Spark uses the external shuffle service for deleting shuffle blocks for deallocated executors when the shuffle is no longer needed. To restore the legacy behavior, you can set `spark.shuffle.service.removeShuffle` to `false`.
 
-- Since Spark 4.0, the default log4j output has shifted from plain text to JSON lines to enhance analyzability. To revert to plain text output, you can either set `spark.log.structuredLogging.enabled` to `false`, or use a custom log4j configuration.
+- Since Spark 4.0, the default log4j output of `spark-submit` has shifted from plain text to JSON lines to enhance analyzability. To revert to plain text output, you can rename the file `conf/log4j2.properties.pattern-layout-template` as `conf/log4j2.properties`, or use a custom log4j configuration file.
 
 - Since Spark 4.0, Spark performs speculative executions less agressively with `spark.speculation.multiplier=3` and `spark.speculation.quantile=0.9`. To restore the legacy behavior, you can set `spark.speculation.multiplier=1.5` and `spark.speculation.quantile=0.75`.

---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org
For additional commands, e-mail: commits-h...@spark.apache.org
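As background for the mixed-output problem this commit addresses: each structured log line in the spark-shell sample is a self-contained JSON object with `ts`, `level`, `msg`, and `logger` fields (field names taken directly from the sample output above), so any JSON-lines reader can consume it. A minimal sketch, assuming that field layout:

```python
import json

# One structured log line, copied verbatim from the spark-shell sample above.
line = ('{"ts":"2024-05-04T01:11:04.104Z","level":"WARN",'
        "\"msg\":\"Service 'SparkUI' could not bind on port 4040. Attempting port 4041.\","
        '"logger":"Utils"}')

# Each line parses independently, which is what makes JSON-lines logs
# queryable with Spark SQL's JSON data source.
record = json.loads(line)
print(record["level"], record["logger"])  # WARN Utils
```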
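The migration-guide hunk above describes the revert as a file rename in prose; concretely, assuming a default Spark distribution layout (paths relative to `SPARK_HOME`), that is:

```shell
# Activate the pattern-layout template so log4j2 emits plain text again.
cp conf/log4j2.properties.pattern-layout-template conf/log4j2.properties
```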