[spark] branch master updated: [SPARK-32772][SQL] Reduce log messages for spark-sql CLI

dongjoon Wed, 02 Sep 2020 13:34:14 -0700

This is an automated email from the ASF dual-hosted git repository.

dongjoon pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git



The following commit(s) were added to refs/heads/master by this push:
     new ad6b887  [SPARK-32772][SQL] Reduce log messages for spark-sql CLI
ad6b887 is described below

commit ad6b887541bf90cc3ea830a1a3322b71ccdd80ee
Author: Kousuke Saruta <saru...@oss.nttdata.com>
AuthorDate: Wed Sep 2 13:31:06 2020 -0700

    [SPARK-32772][SQL] Reduce log messages for spark-sql CLI
    
    ### What changes were proposed in this pull request?
    
    This PR reduces the log messages printed by the spark-sql CLI, matching the behavior of the spark-shell and pyspark CLIs.
    
    ### Why are the changes needed?
    
    When we launch the spark-sql CLI, so many log messages are shown that it is sometimes difficult to find the result of a query.
    ```
    spark-sql> SELECT now();
    20/09/02 00:11:45 INFO CodeGenerator: Code generated in 10.121625 ms
    20/09/02 00:11:45 INFO SparkContext: Starting job: main at NativeMethodAccessorImpl.java:0
    20/09/02 00:11:45 INFO DAGScheduler: Got job 0 (main at NativeMethodAccessorImpl.java:0) with 1 output partitions
    20/09/02 00:11:45 INFO DAGScheduler: Final stage: ResultStage 0 (main at NativeMethodAccessorImpl.java:0)
    20/09/02 00:11:45 INFO DAGScheduler: Parents of final stage: List()
    20/09/02 00:11:45 INFO DAGScheduler: Missing parents: List()
    20/09/02 00:11:45 INFO DAGScheduler: Submitting ResultStage 0 (MapPartitionsRDD[2] at main at NativeMethodAccessorImpl.java:0), which has no missing parents
    20/09/02 00:11:45 INFO MemoryStore: Block broadcast_0 stored as values in memory (estimated size 6.3 KiB, free 366.3 MiB)
    20/09/02 00:11:45 INFO MemoryStore: Block broadcast_0_piece0 stored as bytes in memory (estimated size 3.2 KiB, free 366.3 MiB)
    20/09/02 00:11:45 INFO BlockManagerInfo: Added broadcast_0_piece0 in memory on 192.168.1.204:42615 (size: 3.2 KiB, free: 366.3 MiB)
    20/09/02 00:11:45 INFO SparkContext: Created broadcast 0 from broadcast at DAGScheduler.scala:1348
    20/09/02 00:11:45 INFO DAGScheduler: Submitting 1 missing tasks from ResultStage 0 (MapPartitionsRDD[2] at main at NativeMethodAccessorImpl.java:0) (first 15 tasks are for partitions Vector(0))
    20/09/02 00:11:45 INFO TaskSchedulerImpl: Adding task set 0.0 with 1 tasks resource profile 0
    20/09/02 00:11:45 INFO TaskSetManager: Starting task 0.0 in stage 0.0 (TID 0) (192.168.1.204, executor driver, partition 0, PROCESS_LOCAL, 7561 bytes) taskResourceAssignments Map()
    20/09/02 00:11:45 INFO Executor: Running task 0.0 in stage 0.0 (TID 0)
    20/09/02 00:11:45 INFO Executor: Finished task 0.0 in stage 0.0 (TID 0). 1446 bytes result sent to driver
    20/09/02 00:11:45 INFO TaskSetManager: Finished task 0.0 in stage 0.0 (TID 0) in 238 ms on 192.168.1.204 (executor driver) (1/1)
    20/09/02 00:11:45 INFO TaskSchedulerImpl: Removed TaskSet 0.0, whose tasks have all completed, from pool
    20/09/02 00:11:45 INFO DAGScheduler: ResultStage 0 (main at NativeMethodAccessorImpl.java:0) finished in 0.343 s
    20/09/02 00:11:45 INFO DAGScheduler: Job 0 is finished. Cancelling potential speculative or zombie tasks for this job
    20/09/02 00:11:45 INFO TaskSchedulerImpl: Killing all running tasks in stage 0: Stage finished
    20/09/02 00:11:45 INFO DAGScheduler: Job 0 finished: main at NativeMethodAccessorImpl.java:0, took 0.377489 s
    2020-09-02 00:11:45.07
    Time taken: 0.704 seconds, Fetched 1 row(s)
    20/09/02 00:11:45 INFO SparkSQLCLIDriver: Time taken: 0.704 seconds, Fetched 1 row(s)
    ```
    
    ### Does this PR introduce _any_ user-facing change?
    
    Yes. Log messages for the spark-sql CLI are reduced as follows.
    ```
    20/09/02 00:34:51 WARN NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
    Using Spark's default log4j profile: org/apache/spark/log4j-defaults.properties
    Setting default log level to "WARN".
    To adjust logging level use sc.setLogLevel(newLevel). For SparkR, use setLogLevel(newLevel).
    20/09/02 00:34:53 WARN HiveConf: HiveConf of name hive.stats.jdbc.timeout does not exist
    20/09/02 00:34:53 WARN HiveConf: HiveConf of name hive.stats.retries.wait does not exist
    20/09/02 00:34:55 WARN ObjectStore: Version information not found in metastore. hive.metastore.schema.verification is not enabled so recording the schema version 2.3.0
    20/09/02 00:34:55 WARN ObjectStore: setMetaStoreSchemaVersion called but recording version is disabled: version = 2.3.0, comment = Set by MetaStore kou192.168.1.204
    Spark master: local[*], Application Id: local-1598974492822
    spark-sql> SELECT now();
    2020-09-02 00:35:05.258
    Time taken: 2.299 seconds, Fetched 1 row(s)
    ```
    
    ### How was this patch tested?
    
    Launched the spark-sql CLI and confirmed that log messages are reduced as shown above.
    
    Closes #29619 from sarutak/suppress-log-for-spark-sql.
    
    Authored-by: Kousuke Saruta <saru...@oss.nttdata.com>
    Signed-off-by: Dongjoon Hyun <dongj...@apache.org>
---
 .../scala/org/apache/spark/sql/hive/thriftserver/SparkSQLCLIDriver.scala | 1 +
 1 file changed, 1 insertion(+)

diff --git a/sql/hive-thriftserver/src/main/scala/org/apache/spark/sql/hive/thriftserver/SparkSQLCLIDriver.scala b/sql/hive-thriftserver/src/main/scala/org/apache/spark/sql/hive/thriftserver/SparkSQLCLIDriver.scala
index c7848af..6676223 100644
--- a/sql/hive-thriftserver/src/main/scala/org/apache/spark/sql/hive/thriftserver/SparkSQLCLIDriver.scala
+++ b/sql/hive-thriftserver/src/main/scala/org/apache/spark/sql/hive/thriftserver/SparkSQLCLIDriver.scala
@@ -59,6 +59,7 @@ private[hive] object SparkSQLCLIDriver extends Logging {
   private var transport: TSocket = _
   private final val SPARK_HADOOP_PROP_PREFIX = "spark.hadoop."
 
+  initializeLogIfNecessary(true)
   installSignalHandler()
 
   /**
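
For readers unfamiliar with the call added above, here is a minimal, self-contained sketch of the idea, not Spark's actual implementation. It assumes, based on the output quoted in the commit message, that passing `true` marks the process as an interactive front end and lowers the default root log level to WARN. `SketchLogging`, `RootLogger`, and `SketchCliDriver` are hypothetical names invented for illustration; they are not Spark or log4j classes.

```scala
// Illustrative sketch only: these types are hypothetical stand-ins,
// not Spark's Logging trait or a log4j logger.
object RootLogger {
  // "INFO" stands in for the verbose default seen before the change.
  var level: String = "INFO"
}

object SketchLogging {
  // Global flag so the one-time setup runs at most once per process.
  @volatile var initialized: Boolean = false
}

trait SketchLogging {
  // Assumed behavior: when called from an interactive front end,
  // quiet the root logger so query results are easy to spot.
  def initializeLogIfNecessary(isInterpreter: Boolean): Unit =
    SketchLogging.synchronized {
      if (!SketchLogging.initialized) {
        if (isInterpreter) {
          RootLogger.level = "WARN"
          println("Setting default log level to \"WARN\".")
        }
        SketchLogging.initialized = true
      }
    }

  // INFO messages are printed only while the root level is still INFO.
  def logInfo(msg: => String): Unit =
    if (RootLogger.level == "INFO") println(s"INFO $msg")

  // WARN messages are printed at both levels used in this sketch.
  def logWarning(msg: => String): Unit = println(s"WARN $msg")
}

object SketchCliDriver extends SketchLogging {
  // Mirrors the one-line change: runs during object initialization,
  // before anything else in the driver has a chance to log.
  initializeLogIfNecessary(true)

  def main(args: Array[String]): Unit = {
    logInfo("Starting job")        // suppressed once the level is WARN
    logWarning("example warning")  // still printed
    println("query result goes here")
  }
}
```

Running `SketchCliDriver.main` in this sketch prints the warning and the result line but not the INFO message, which is the effect the commit message describes for the real CLI.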


---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org
For additional commands, e-mail: commits-h...@spark.apache.org
