pan3793 opened a new pull request, #38357: URL: https://github.com/apache/spark/pull/38357
### What changes were proposed in this pull request?

Provide a flexible way, on K8s, to configure external log service links (patterns) and attributes for the Driver and Executors through env vars, on both the live Spark UI and the SHS.

The full design doc is https://docs.google.com/document/d/1MfB39LD4B4Rp7MDRxZbMKMbdNSe6V6mBmMQ-gkCnM-0/edit?usp=sharing

1. Expose general attributes on K8s, for both Driver and Executor, which can be referenced in log URL patterns and will be persisted into the event log. My proposed generic attributes are
   - APP_ID
   - KUBENETES_POD_NAME
   - KUBENETES_NAMESPACE
2. Allow using env vars to add custom log URLs and attributes, for both Driver and Executor.
   - Driver log URL: env vars with prefix SPARK_DRIVER_LOG_URL_
   - Driver attribute: env vars with prefix SPARK_DRIVER_ATTRIBUTE_
   - Executor log URL: env vars with prefix SPARK_LOG_URL_
   - Executor attribute: env vars with prefix SPARK_EXECUTOR_ATTRIBUTE_
3. Always do log URL replacement for the Driver before sending SparkListenerApplicationStart into the LiveListenerBus, so that the Driver has the same log URL replacement ability on the live UI as the Executor does.
4. Always do log URL replacement for the Executor:
   - if spark.history.custom.executor.log.url is provided, keep the current behavior as-is;
   - otherwise, use the value of each log URL as the pattern, in case the user-provided log URL refers to the attributes.

### Why are the changes needed?

Currently, there is no out-of-the-box log solution for Spark on K8s. For Spark on Yarn, Spark provides stdout/stderr log links on the Spark UI for the Driver and each Executor, which redirect to the Yarn log pages. For resource managers that do not provide out-of-the-box log services, such as K8s, Spark has no log links on the Spark UI.

### Does this PR introduce _any_ user-facing change?

Yes, users can add custom log links to the Spark UI through configuration in Spark on K8s.

### How was this patch tested?

<!-- If tests were added, say they were added here.
Please make sure to add some test cases that check the changes thoroughly including negative and positive cases if possible. If it was tested in a way different from regular unit tests, please clarify how you tested step by step, ideally copy and paste-able, so that other reviewers can test and check, and descendants can verify in the future. If tests were not added, please describe why they were not added and/or why it was difficult to add. If benchmark tests were added, please run the benchmarks in GitHub Actions for the consistent environment, and the instructions could accord to: https://spark.apache.org/developer-tools.html#github-workflow-benchmarks. -->
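
For reviewers who want a concrete picture of the env var convention and the "use the log URL value as the pattern" fallback described in the proposal, here is a minimal Python sketch. It is not the code in this PR. The `{{NAME}}` placeholder syntax, the helper names, the `CLUSTER` attribute, the example URL, and the choice to keep the original URL when an attribute is missing are all assumptions made for illustration; only the two prefixes and `APP_ID` come from the description above.

```python
import re

# Prefixes named in the PR description.
DRIVER_LOG_URL_PREFIX = "SPARK_DRIVER_LOG_URL_"
DRIVER_ATTRIBUTE_PREFIX = "SPARK_DRIVER_ATTRIBUTE_"

# Assumed placeholder syntax for attributes inside a pattern: {{NAME}}
PLACEHOLDER = re.compile(r"\{\{([A-Za-z0-9_\-]+)\}\}")


def extract_by_prefix(env, prefix):
    """Return {suffix: value} for every env var whose name starts with prefix."""
    return {k[len(prefix):]: v for k, v in env.items() if k.startswith(prefix)}


def resolve(pattern, attributes):
    """Substitute every placeholder; return None if any attribute is missing."""
    names = PLACEHOLDER.findall(pattern)
    if any(name not in attributes for name in names):
        return None
    return PLACEHOLDER.sub(lambda m: attributes[m.group(1)], pattern)


def resolve_log_urls(log_urls, attributes):
    """Treat each log URL value as its own pattern (the fallback case in item 4).

    Assumption: if a pattern cannot be fully resolved, keep the original URL.
    """
    resolved = {}
    for name, url in log_urls.items():
        result = resolve(url, attributes)
        resolved[name] = url if result is None else result
    return resolved


# Hypothetical driver pod environment.
env = {
    "SPARK_DRIVER_LOG_URL_STDOUT": "https://logs.example.com/{{CLUSTER}}/{{APP_ID}}/stdout",
    "SPARK_DRIVER_ATTRIBUTE_CLUSTER": "prod",
    "HOME": "/opt/spark",
}

log_urls = extract_by_prefix(env, DRIVER_LOG_URL_PREFIX)
# Generic attribute (APP_ID) merged with custom attributes taken from env vars.
attributes = {"APP_ID": "spark-123", **extract_by_prefix(env, DRIVER_ATTRIBUTE_PREFIX)}
resolved = resolve_log_urls(log_urls, attributes)
print(resolved)
```

The same two helpers would apply to the Executor side by swapping in the SPARK_LOG_URL_ and SPARK_EXECUTOR_ATTRIBUTE_ prefixes.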
