Hi, I would like to add the applicationId to every log line that Spark produces through Log4j. My cluster runs several jobs at the same time, so having the applicationId in each line would let me separate the logs by application.
I have found a partial solution. If I change the pattern of the PatternLayout, I can print the ThreadContext (see <https://logging.apache.org/log4j/2.x/manual/thread-context.html>), and the applicationId can be put into it through MDC (see <https://stackoverflow.com/questions/54706582/output-spark-application-id-in-the-logs-with-log4j>). This works for the driver, but I would like the value to be set at Spark application startup on both the driver and the workers. Note that I am working in a managed environment (Databricks), so my control over cluster management is limited.

One workaround for running the MDC put on every worker is to use a broadcast variable and perform an action with it, but I don't think this is reliable, because it would also have to work when a worker machine restarts or is replaced.

Thank you
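
For reference, the layout change I am describing looks roughly like the fragment below. This is only a sketch in Log4j 2 properties syntax: the appender name "console" and the key "applicationId" are my own choices, and the exact file that has to be edited depends on the environment. `%X{applicationId}` prints the value stored under that key in the ThreadContext, and prints nothing on threads where the key was never set, which is exactly what I see on the workers.

```properties
# Sketch only: "console" is an assumed appender name, and "applicationId"
# is whatever key was put into the ThreadContext (MDC) by the application.
appender.console.type = Console
appender.console.name = console
appender.console.layout.type = PatternLayout
# %X{applicationId} is replaced by the ThreadContext value for that key
appender.console.layout.pattern = %d{yy/MM/dd HH:mm:ss} %p %c{1}: [%X{applicationId}] %m%n

rootLogger.level = info
rootLogger.appenderRef.console.ref = console
```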