Hi Gengliang,

Thanks for taking the initiative to improve the Spark logging system. Transitioning to structured logs seems like a worthwhile way to make Spark jobs easier to analyze and troubleshoot, and hopefully to ease future integration with cloud logging systems.

While "Structured Spark Logging" sounds good, I was wondering if we could consider an alternative name. Since we already use "Spark Structured Streaming", the terminology might cause some initial confusion. I must confess that was my own first reaction.

Here are a few alternative names I came up with, if I may:

- Spark Log Schema Initiative
- Centralized Logging with Structured Data for Spark
- Enhanced Spark Logging with Queryable Format

These options all highlight the key aspects of your proposal, namely schema, centralized logging and queryability, and might be clearer for everyone at first glance.

Cheers

Mich Talebzadeh,
Dad | Technologist | Solutions Architect | Engineer
London
United Kingdom

view my LinkedIn profile <https://www.linkedin.com/in/mich-talebzadeh-ph-d-5205b2/>

https://en.everybodywiki.com/Mich_Talebzadeh

*Disclaimer:* The information provided is correct to the best of my knowledge but of course cannot be guaranteed. It is essential to note that, as with any advice, "one test result is worth one-thousand expert opinions" (Wernher von Braun <https://en.wikipedia.org/wiki/Wernher_von_Braun>).

On Fri, 1 Mar 2024 at 10:07, Gengliang Wang <ltn...@gmail.com> wrote:

> Hi All,
>
> I propose to enhance our logging system by transitioning to structured
> logs. This initiative is designed to tackle the challenges of analyzing
> distributed logs from drivers, workers, and executors by allowing them to
> be queried using a fixed schema. The goal is to improve the informativeness
> and accessibility of logs, making it significantly easier to diagnose
> issues.
>
> Key benefits include:
>
> - Clarity and queryability of distributed log files.
> - Continued support for log4j, allowing users to switch back to
>   traditional text logging if preferred.
>
> The improvement will simplify debugging and enhance productivity without
> disrupting existing logging practices. The implementation is estimated to
> take around 3 months.
>
> *SPIP*:
> https://docs.google.com/document/d/1rATVGmFLNVLmtxSpWrEceYm7d-ocgu8ofhryVs4g3XU/edit?usp=sharing
> *JIRA*: SPARK-47240 <https://issues.apache.org/jira/browse/SPARK-47240>
>
> Your comments and feedback would be greatly appreciated.
>