cloud-fan commented on pull request #28629:
URL: https://github.com/apache/spark/pull/28629#issuecomment-652984201


   Let's think about the use cases. Many users may share one Spark 
application (e.g. a job server), and each user can set MDC properties so that 
they can collect the Spark logs for their own Spark jobs only. (Correct me if 
I misunderstand the use case.)
   
   If I were an end-user, I'd like to use one API to set MDC properties for 
both driver and executor logs. I don't see a use case where the driver and 
executor need different MDC properties (except that the executor has an extra 
built-in MDC property: the task ID).
   
   I agree that it's better to just inherit the MDC properties, and that 
should be the story for end-users. Internally, Spark should do two things:
   1. update the driver side thread pools to inherit the MDC properties.
   2. use local properties to propagate the MDC properties to the executor JVM.
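   
   Step 1 can be sketched as a capture-and-restore wrapper around work 
submitted to a thread pool. The `Mdc` object below is a simplified, 
self-contained stand-in for SLF4J's MDC (which Spark's logging actually 
uses), so this is an illustration of the pattern, not the PR's code:

```scala
import java.util.concurrent.{CountDownLatch, Executors}
import java.util.concurrent.atomic.AtomicReference

// Simplified stand-in for SLF4J's MDC: a thread-local property map.
object Mdc {
  private val ctx = new ThreadLocal[Map[String, String]] {
    override def initialValue(): Map[String, String] = Map.empty
  }
  def put(key: String, value: String): Unit = ctx.set(ctx.get + (key -> value))
  def copy: Map[String, String] = ctx.get
  def setAll(m: Map[String, String]): Unit = ctx.set(m)
}

// Wrap work so the pooled worker thread inherits the submitter's MDC
// instead of whatever a previous task left on that thread.
def withCallerMdc(body: => Unit): Runnable = {
  val captured = Mdc.copy            // read on the submitting thread
  new Runnable {
    def run(): Unit = {
      Mdc.setAll(captured)           // restore on the pooled worker thread
      body
    }
  }
}

Mdc.put("user", "alice")
val pool = Executors.newFixedThreadPool(1)
val done = new CountDownLatch(1)
val seenOnWorker = new AtomicReference(Map.empty[String, String])
pool.submit(withCallerMdc { seenOnWorker.set(Mdc.copy); done.countDown() })
done.await()
pool.shutdown()
```

   The wrapper matters because pooled threads are created once and reused, so 
an `InheritableThreadLocal` alone would not carry the submitter's context.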
   
   This PR does 1, but 2 is only partially done. We shouldn't ask users to 
manually set the local properties; we should do it for them: when 
submitting a job, read the MDC properties of the current thread and set them 
as local properties.
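   
   A minimal sketch of what "doing it for them" could look like at job 
submission. The `submitJob` helper, the mutable `localProperties` map, and 
the "mdc." key prefix are all assumptions for illustration, not the PR's 
actual names:

```scala
import scala.collection.mutable

// Hypothetical: on job submission, Spark (not the user) snapshots the
// submitting thread's MDC and copies it into the job's local properties,
// which are what actually travel to the executor JVM.
def submitJob(callerMdc: Map[String, String],
              localProperties: mutable.Map[String, String]): Unit = {
  for ((k, v) <- callerMdc) {
    localProperties("mdc." + k) = v   // done for the user, not by the user
  }
}

val props = mutable.Map[String, String]()
submitJob(Map("user" -> "alice"), props)
// props now holds "mdc.user" -> "alice", ready to ship with the job
```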
   
   Another problem is the MDC property names. Since we just inherit them on 
the driver side, it's not possible to require the "mdc." prefix there. Maybe 
we should also remove this prefix on the executor side. cc @Ngone51 
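   
   The executor side of that would be a small transformation: pick out the 
propagated properties and strip the prefix so the names match what the user 
set on the driver. The prefix convention here is again an assumption:

```scala
// Executor-side sketch: recover MDC entries from the propagated local
// properties, dropping the "mdc." prefix and ignoring unrelated properties.
val localProperties = Map(
  "mdc.user"              -> "alice",
  "spark.job.description" -> "etl"   // non-MDC local property, filtered out
)

val executorMdc: Map[String, String] = localProperties.collect {
  case (k, v) if k.startsWith("mdc.") => k.stripPrefix("mdc.") -> v
}
```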


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.