Hi Gengliang,

Thanks for taking the initiative to improve the Spark logging system.
Transitioning to structured logs seems like a worthwhile way to make Spark
jobs easier to analyze and troubleshoot and, hopefully, to ease future
integration with cloud logging systems. While "Structured Spark Logging"
sounds good, I was wondering if we could consider an alternative name.
Since we already use "Spark Structured Streaming", there might be some
initial confusion with the terminology. I must confess that was my own
initial reaction.

Here are a few alternative names I came up with, if I may:

   - Spark Log Schema Initiative
   - Centralized Logging with Structured Data for Spark
   - Enhanced Spark Logging with Queryable Format

These options all highlight the key aspects of your proposal, namely
schema, centralized logging and queryability, and might be even clearer for
everyone at first glance.

Cheers

Mich Talebzadeh,
Dad | Technologist | Solutions Architect | Engineer
London
United Kingdom


   View my LinkedIn profile
<https://www.linkedin.com/in/mich-talebzadeh-ph-d-5205b2/>


 https://en.everybodywiki.com/Mich_Talebzadeh



*Disclaimer:* The information provided is correct to the best of my
knowledge but of course cannot be guaranteed. It is essential to note
that, as with any advice, "one test result is worth one-thousand
expert opinions" (Wernher von Braun
<https://en.wikipedia.org/wiki/Wernher_von_Braun>).


On Fri, 1 Mar 2024 at 10:07, Gengliang Wang <ltn...@gmail.com> wrote:

> Hi All,
>
> I propose to enhance our logging system by transitioning to structured
> logs. This initiative is designed to tackle the challenges of analyzing
> distributed logs from drivers, workers, and executors by allowing them to
> be queried using a fixed schema. The goal is to improve the informativeness
> and accessibility of logs, making it significantly easier to diagnose
> issues.
>
> Key benefits include:
>
>    - Clarity and queryability of distributed log files.
>    - Continued support for log4j, allowing users to switch back to
>    traditional text logging if preferred.
>
> The improvement will simplify debugging and enhance productivity without
> disrupting existing logging practices. The implementation is estimated to
> take around 3 months.
>
> *SPIP*:
> https://docs.google.com/document/d/1rATVGmFLNVLmtxSpWrEceYm7d-ocgu8ofhryVs4g3XU/edit?usp=sharing
> *JIRA*: SPARK-47240 <https://issues.apache.org/jira/browse/SPARK-47240>
>
> Your comments and feedback would be greatly appreciated.
>
