Hi All,

I propose enhancing our logging system by transitioning to structured logs. This initiative addresses the difficulty of analyzing distributed logs from drivers, workers, and executors by making them queryable against a fixed schema. The goal is to make logs more informative and accessible, so that issues are significantly easier to diagnose.
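
To illustrate what querying logs against a fixed schema could look like, here is a minimal sketch in plain Python. The field names (`ts`, `level`, `msg`, `context`, `executor_id`) and the sample records are assumptions made up for this illustration; they are not the schema defined in the SPIP.

```python
import json

# Hypothetical structured log lines, one JSON object per line.
# The field names here are illustrative assumptions, not the proposed schema.
LOG_LINES = [
    '{"ts": "2024-03-01T10:00:00Z", "level": "INFO", "msg": "Task started", "context": {"executor_id": "1"}}',
    '{"ts": "2024-03-01T10:00:05Z", "level": "ERROR", "msg": "Lost executor", "context": {"executor_id": "2"}}',
    '{"ts": "2024-03-01T10:00:07Z", "level": "ERROR", "msg": "Fetch failed", "context": {"executor_id": "2"}}',
]


def parse_logs(lines):
    """Parse each non-empty line as a JSON record."""
    return [json.loads(line) for line in lines if line.strip()]


def errors_by_executor(records):
    """Count ERROR-level records per executor id."""
    counts = {}
    for record in records:
        if record.get("level") == "ERROR":
            executor = record.get("context", {}).get("executor_id", "unknown")
            counts[executor] = counts.get(executor, 0) + 1
    return counts


records = parse_logs(LOG_LINES)
print(errors_by_executor(records))
```

Because every record carries the same fields, a question like "which executor produced the most errors" becomes a simple filter and group-by rather than a regular-expression search over free text.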

Key benefits include:
- Clarity and queryability of distributed log files.
- Continued support for log4j, allowing users to switch back to traditional text logging if preferred.

The improvement will simplify debugging and enhance productivity without disrupting existing logging practices. The implementation is estimated to take around 3 months.

*SPIP*: https://docs.google.com/document/d/1rATVGmFLNVLmtxSpWrEceYm7d-ocgu8ofhryVs4g3XU/edit?usp=sharing
*JIRA*: SPARK-47240 <https://issues.apache.org/jira/browse/SPARK-47240>

Your comments and feedback would be greatly appreciated.