[VOTE] SPIP: Structured Logging Framework for Apache Spark

2024-03-10 Thread Gengliang Wang
Hi all,

I'd like to start the vote for SPIP: Structured Logging Framework for
Apache Spark.

References:

   - JIRA ticket
   - SPIP doc
   - Discussion thread

Please vote on the SPIP for the next 72 hours:

[ ] +1: Accept the proposal as an official SPIP
[ ] +0
[ ] -1: I don’t think this is a good idea because …

Thanks!
Gengliang Wang


Re: [DISCUSS] SPIP: Structured Spark Logging

2024-03-10 Thread Gengliang Wang
Thanks everyone for the valuable feedback!

Given the generally positive feedback received, I plan to move forward by
initiating the voting thread. I encourage you to participate in the
upcoming thread.

Warm regards,
Gengliang

On Sat, Mar 9, 2024 at 12:55 PM Mich Talebzadeh wrote:

> Splendid. Thanks Gengliang
>
> Mich Talebzadeh,
> Dad | Technologist | Solutions Architect | Engineer
> London
> United Kingdom
>
>
> view my Linkedin profile
>
> https://en.everybodywiki.com/Mich_Talebzadeh
>
> *Disclaimer:* The information provided is correct to the best of my
> knowledge but of course cannot be guaranteed. It is essential to note
> that, as with any advice, quote "one test result is worth one-thousand
> expert opinions (Werner Von Braun)".
>
>
> On Sat, 9 Mar 2024 at 18:10, Gengliang Wang wrote:
>
>> Hi Mich,
>>
>> Thanks for your suggestions. I agree that we should avoid confusion with
>> Spark Structured Streaming.
>>
>> So, I'll go with "Structured Logging Framework for Apache Spark". This
>> keeps the standard term "Structured Logging" and distinguishes it from
>> "Structured Streaming" clearly.
>>
>> Thanks for helping shape this!
>>
>> Best,
>> Gengliang
>>
>> On Sat, Mar 2, 2024 at 12:19 PM Mich Talebzadeh <
>> mich.talebza...@gmail.com> wrote:
>>
>>> Hi Gengliang,
>>>
>>> Thanks for taking the initiative to improve the Spark logging system.
>>> Transitioning to structured logs seems like a worthy way to enhance the
>>> ability to analyze and troubleshoot Spark jobs and, hopefully, to ease
>>> future integration with cloud logging systems. While "Structured Spark Logging"
>>> sounds good, I was wondering if we could consider an alternative name.
>>> Since we already use "Spark Structured Streaming", there might be a slight
>>> initial confusion with the terminology. I must confess it was my initial
>>> reaction so to speak.
>>>
>>> Here are a few alternative names I came up with, if I may:
>>>
>>>    - Spark Log Schema Initiative
>>>    - Centralized Logging with Structured Data for Spark
>>>    - Enhanced Spark Logging with Queryable Format
>>>
>>> These options all highlight the key aspects of your proposal, namely
>>> schema, centralized logging, and queryability, and might be even clearer for
>>> everyone at first glance.
>>>
>>> Cheers
>>>
>>> Mich Talebzadeh,
>>> Dad | Technologist | Solutions Architect | Engineer
>>> London
>>> United Kingdom
>>>
>>>
>>> On Fri, 1 Mar 2024 at 10:07, Gengliang Wang wrote:
>>>
>>>> Hi All,
>>>>
>>>> I propose to enhance our logging system by transitioning to structured
>>>> logs. This initiative is designed to tackle the challenges of analyzing
>>>> distributed logs from drivers, workers, and executors by allowing them to
>>>> be queried using a fixed schema. The goal is to improve the informativeness
>>>> and accessibility of logs, making it significantly easier to diagnose
>>>> issues.
>>>>
>>>> Key benefits include:
>>>>
>>>>    - Clarity and queryability of distributed log files.
>>>>    - Continued support for log4j, allowing users to switch back to
>>>>    traditional text logging if preferred.
>>>>
>>>> The improvement will simplify debugging and enhance productivity
>>>> without disrupting existing logging practices. The implementation is
>>>> estimated to take around 3 months.
>>>>
>>>> *SPIP*:
>>>> https://docs.google.com/document/d/1rATVGmFLNVLmtxSpWrEceYm7d-ocgu8ofhryVs4g3XU/edit?usp=sharing
>>>> *JIRA*: SPARK-47240
>>>>
>>>> Your comments and feedback would be greatly appreciated.
>>>>
>>>