dasbh opened a new pull request #12472:
URL: https://github.com/apache/flink/pull/12472


   ## What is the purpose of the change
   
   * This pull request adds a jobName field to each Task instance and uses it 
to populate the MDC
   
   ## Problem statement:
   We use a shared session cluster (Standalone/Yarn) to execute jobs. All logs 
from the cluster are shipped through a log aggregation framework 
(Logstash/Splunk) to make application diagnostics easier.
   However, the log lines are missing one vital piece of information: the job 
name, which we need in order to filter the logs for a single job.
   
   
   ## Brief change log
     - *The Task class now includes a jobName field, populated from 
JobInformation on construction, similar to jobId*
     - *When TaskExecutor runs a Task, the run method puts the jobName into the 
MDC so that it can be used in the logging pattern as %X{jobName}*
     - *ExecutorThreadFactory passes the MDC context down to worker threads by 
wrapping the runnable, so the MDC can be used safely on a cached thread pool. 
This is the recommended way to pass the MDC from a parent thread to an executor 
thread pool*
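
   The runnable-wrapping idea in the last bullet can be sketched roughly as 
follows. This is a minimal illustration, not the code in this pull request: 
`MdcPropagationSketch`, `SimpleMdc`, `wrapWithMdc` and `jobNameSeenByWorker` 
are hypothetical names, and `SimpleMdc` is a stand-in for `org.slf4j.MDC` so 
that the example runs without any logging dependency.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

final class MdcPropagationSketch {

    /**
     * Hypothetical stand-in for org.slf4j.MDC: a per-thread key/value map.
     * Note: the real MDC.getCopyOfContextMap() may return null, so real code
     * should null-check before calling MDC.setContextMap(...).
     */
    static final class SimpleMdc {
        private static final ThreadLocal<Map<String, String>> CONTEXT =
                ThreadLocal.withInitial(HashMap::new);

        static void put(String key, String value) { CONTEXT.get().put(key, value); }
        static String get(String key) { return CONTEXT.get().get(key); }
        static Map<String, String> getCopyOfContextMap() { return new HashMap<>(CONTEXT.get()); }
        static void setContextMap(Map<String, String> map) { CONTEXT.set(new HashMap<>(map)); }
        static void clear() { CONTEXT.get().clear(); }
    }

    /**
     * Captures the calling thread's context at wrap time and installs it
     * around the task when the task later runs on a worker thread.
     */
    static Runnable wrapWithMdc(Runnable task) {
        final Map<String, String> captured = SimpleMdc.getCopyOfContextMap();
        return () -> {
            // Remember what the pooled thread had, so a reused thread
            // does not leak one job's context into the next task.
            final Map<String, String> previous = SimpleMdc.getCopyOfContextMap();
            SimpleMdc.setContextMap(captured);
            try {
                task.run();
            } finally {
                SimpleMdc.setContextMap(previous);
            }
        };
    }

    /** Sets jobName on the caller, then reports what a pool thread observes. */
    static String jobNameSeenByWorker(String jobName) {
        SimpleMdc.put("jobName", jobName);
        ExecutorService pool = Executors.newCachedThreadPool();
        try {
            final String[] seen = new String[1];
            Future<?> done = pool.submit(wrapWithMdc(() -> seen[0] = SimpleMdc.get("jobName")));
            done.get(); // wait for the worker; also makes seen[0] visible here
            return seen[0];
        } catch (Exception e) {
            throw new IllegalStateException(e);
        } finally {
            pool.shutdown();
            SimpleMdc.clear();
        }
    }

    public static void main(String[] args) {
        System.out.println(jobNameSeenByWorker("WordCount"));
    }
}
```

   Without the wrapper, the pooled thread would see no `jobName`, because the 
context is thread-local. Restoring the previous map in `finally` matters on a 
cached pool, where threads are reused across tasks.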
   
   
   ## Verifying this change
   
   This change added tests and can be verified as follows:
     - *Added a unit test for Task that verifies jobName and the MDC*
     - *Manually verified logging by changing the WordCount example to include 
%X{jobName}*
   
   ## Does this pull request potentially affect one of the following parts:
   
     - Dependencies (does it add or upgrade a dependency): no
     - The public API, i.e., is any changed class annotated with 
`@Public(Evolving)`: no
     - The serializers: no
     - The runtime per-record code paths (performance sensitive):  don't know
     - Anything that affects deployment or recovery: JobManager (and its 
components), Checkpointing, Kubernetes/Yarn/Mesos, ZooKeeper: no
     - The S3 file system connector: no
   
   ## Documentation
   
     - Does this pull request introduce a new feature?  yes
     - If yes, how is the feature documented?   not documented yet
   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[email protected]

