Gengliang Wang created SPARK-47240:
--------------------------------------

             Summary: SPIP: Structured Spark Logging
                 Key: SPARK-47240
                 URL: https://issues.apache.org/jira/browse/SPARK-47240
             Project: Spark
          Issue Type: New Feature
          Components: Project Infra
    Affects Versions: 4.0.0
            Reporter: Gengliang Wang


This proposal aims to enhance Apache Spark's logging system by implementing 
structured logging. The default log files will change from plain text to JSON, 
making them easier to parse and query. The new logs will include key 
identifiers such as worker, executor, query, job, stage, and task IDs, so that 
related entries can be searched and analyzed by field rather than by text 
matching.
h2. Current Logging Format

The current format of Spark logs is plain text, which can be challenging to 
parse and analyze efficiently. An example of the current log format is as 
follows:
{code:java}
23/11/29 17:53:44 ERROR TaskSchedulerImpl: Lost executor 289 on 100.116.29.4: 
Executor heartbeat timed out after 150300 ms{code}
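To illustrate why the plain-text format is awkward to analyze, here is a minimal sketch of what a consumer has to do today to recover fields from the line above. The regular expressions and the helper names (`parse_plain_line`, `extract_executor_id`) are hypothetical and assume the exact layout shown in the example; a deployment with a different log4j pattern, or a message worded differently, would need different patterns.

```python
import re

# Hypothetical pattern for the plain-text layout shown above. Real Spark
# deployments can configure a different log pattern, so this is an assumption.
LINE_RE = re.compile(
    r"^(?P<ts>\d{2}/\d{2}/\d{2} \d{2}:\d{2}:\d{2}) "
    r"(?P<level>[A-Z]+) "
    r"(?P<logger>[^:]+): "
    r"(?P<message>.*)$"
)


def parse_plain_line(line):
    """Split one plain-text log line into fields, or return None if it does not match."""
    match = LINE_RE.match(line)
    return match.groupdict() if match else None


def extract_executor_id(message):
    """Pull the executor ID out of a 'Lost executor' message, if present."""
    match = re.search(r"Lost executor (\d+)", message)
    return int(match.group(1)) if match else None


sample = (
    "23/11/29 17:53:44 ERROR TaskSchedulerImpl: Lost executor 289 on "
    "100.116.29.4: Executor heartbeat timed out after 150300 ms"
)
record = parse_plain_line(sample)
executor_id = extract_executor_id(record["message"])
```

Identifiers such as the executor ID and host are only available by pattern-matching the free-form message, one message type at a time.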
h2. Proposed Structured Logging Format

The proposed change involves structuring the logs in JSON format, which 
organizes the log information into easily identifiable fields. Here is how the 
new structured log format would look:

 
{code:java}
{"ts": "23/11/29 17:53:44", "level": "ERROR", "message": "Lost executor 289 on 100.116.29.4: Executor heartbeat timed out after 150300 ms", "logger": "TaskSchedulerImpl", "executor_id": 289, "host": "100.116.29.4"}
{code}
This format will enable users to upload and directly query 
driver/executor/master/worker log files using Spark SQL for more effective 
problem-solving and analysis, such as tracking executor losses or identifying 
faulty tasks:


 
{code:java}
/* Load the JSON log files and register them as a temporary view */
spark.read.json("hdfs://hdfs_host/logs").createOrReplaceTempView("logs")
/* To get all the executor lost logs */
SELECT * FROM logs WHERE contains(message, 'Lost executor');
/* To get all the distributed logs about executor 289 */
SELECT * FROM logs WHERE executor_id = 289;
/* To get all the errors on host 100.116.29.4 */
SELECT * FROM logs WHERE host = '100.116.29.4' AND level = 'ERROR';
{code}
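For readers without a Spark session at hand, the same three lookups can be sketched over JSON-lines text with only the Python standard library. This is an illustration, not part of the proposal: `load_records` and `where` are hypothetical helpers, the field names (`level`, `executor_id`, `host`) follow the JSON example above, and the second sample line is made-up data added only so the filters have something to exclude.

```python
import json


def load_records(lines):
    """Parse JSON-lines log text, skipping blank lines and lines that are not JSON objects."""
    records = []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        try:
            parsed = json.loads(line)
        except json.JSONDecodeError:
            continue  # e.g. a stray plain-text line mixed into the file
        if isinstance(parsed, dict):
            records.append(parsed)
    return records


def where(records, **conditions):
    """Keep records whose fields equal every given value (like WHERE a = x AND b = y)."""
    return [r for r in records if all(r.get(k) == v for k, v in conditions.items())]


sample_lines = [
    # Taken from the structured log example in this proposal.
    '{"ts": "23/11/29 17:53:44", "level": "ERROR", "message": "Lost executor 289 '
    'on 100.116.29.4: Executor heartbeat timed out after 150300 ms", '
    '"logger": "TaskSchedulerImpl", "executor_id": 289, "host": "100.116.29.4"}',
    # Made-up entry for illustration only.
    '{"ts": "23/11/29 17:53:45", "level": "INFO", "message": "Illustrative entry", '
    '"logger": "ExampleLogger", "executor_id": 7, "host": "10.0.0.1"}',
    "this line is not JSON",
]

records = load_records(sample_lines)

# All the executor lost logs
lost = [r for r in records if "Lost executor" in r.get("message", "")]
# All the logs about executor 289
executor_289 = where(records, executor_id=289)
# All the errors on host 100.116.29.4
host_errors = where(records, host="100.116.29.4", level="ERROR")
```

Because every identifier is a named field, each lookup is an equality test on a key instead of a message-specific regular expression.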

SPIP doc: 
https://docs.google.com/document/d/1rATVGmFLNVLmtxSpWrEceYm7d-ocgu8ofhryVs4g3XU/edit?usp=sharing



