cloud-fan commented on pull request #28629: URL: https://github.com/apache/spark/pull/28629#issuecomment-652984201
Let's think about the use cases. Many users share one Spark application (e.g. a job server), and each user can set MDC properties so that they can collect the Spark logs for their own jobs only (correct me if I misunderstand the use case). If I were an end user, I'd want a single API to set MDC properties for both driver and executor logs. I don't see a use case that needs the driver and executors to have different MDC properties (except that the executor has an extra built-in MDC property: the task id), so I agree that it's better to just inherit the MDC properties. That should be the story for end users.

Internally, Spark should do two things:
1. Update the driver-side thread pools to inherit the MDC properties.
2. Use local properties to propagate the MDC properties to the executor JVM.

This PR does 1, but 2 is only partially done. We shouldn't ask users to manually set the local properties; we should do it for them: when submitting a job, read the MDC properties of the current thread and set them as local properties.

Another problem is the MDC property names. Since we just inherit them on the driver side, it's not possible to require the "mdc." prefix. Maybe we should also remove this prefix on the executor side.

cc @Ngone51
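To make point 1 concrete, here is a minimal Java sketch of the "inherit MDC across a thread pool" pattern: capture the submitting thread's context map and restore it inside the worker thread. The `Mdc` class below is a hypothetical stand-in for SLF4J's `MDC`, and `inheriting` is an illustrative helper, not Spark's actual implementation:

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

// Hypothetical stand-in for SLF4J's MDC: a per-thread map of logging context.
class Mdc {
    private static final ThreadLocal<Map<String, String>> ctx =
        ThreadLocal.withInitial(HashMap::new);
    static void put(String k, String v) { ctx.get().put(k, v); }
    static String get(String k) { return ctx.get().get(k); }
    static Map<String, String> copy() { return new HashMap<>(ctx.get()); }
    static void setAll(Map<String, String> m) { ctx.set(new HashMap<>(m)); }
}

public class MdcInheritDemo {
    // Wrap a task so the submitter's MDC is visible inside the pool thread.
    static Runnable inheriting(Runnable task) {
        Map<String, String> captured = Mdc.copy(); // read on the submitting thread
        return () -> {
            Mdc.setAll(captured);                  // restore on the worker thread
            task.run();
        };
    }

    public static void main(String[] args) throws Exception {
        Mdc.put("user", "alice");
        ExecutorService pool = Executors.newFixedThreadPool(1);
        // Without the wrapper, the pool thread would see an empty context.
        pool.submit(inheriting(
            () -> System.out.println("user=" + Mdc.get("user")))).get();
        pool.shutdown();
    }
}
```

Point 2 is the same capture/restore idea, except the captured map would travel to the executor JVM via job local properties instead of a thread-local.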
