Github user sarutak commented on the pull request:

    https://github.com/apache/spark/pull/2250#issuecomment-55940146

@JoshRosen Thanks for your advice. I tried using the application id in the metrics name and ran into some difficulties.

Problem 1. We need the application id before creating SparkEnv

On the driver, we need the application id before creating SparkEnv because some metrics sources are loaded and registered within SparkEnv.create. To be exact, SparkEnv.create creates an instance of MetricsSystem, and the MetricsSystem constructor invokes the registerSource method, which loads sources from metrics.properties.

Unfortunately, SparkEnv cannot be created after getting the application id. The application id is obtained from SchedulerBackend (or its subclasses), but instances of SchedulerBackend cannot be created before SparkEnv is created. For instance, TaskSchedulerImpl needs SparkEnv, and TaskSchedulerImpl and SchedulerBackend are created at the same time.

Problem 2. It is difficult to pass the application id to Executors via SparkConf

Across all implementations of SchedulerBackend, the application id is available only after "taskScheduler.start()" has been invoked in SparkContext. But before "taskScheduler.start()" finishes, Executors are already launched and extract SparkConf from DriverActor. In other words, Executors extract SparkConf before the application id is set in SparkConf.

So I have 2 solutions.

The 1st is this PR. It is a compromise. In YARN cluster mode, we can get the application id with SparkConf.get("spark.yarn.app.id") before SparkEnv is created; in other modes, we use System.currentTimeMillis instead.

The 2nd is #2432. So that metrics sources are registered after the application id is known, SparkEnv#create does not register metrics sources or start MetricsSystem when the SparkEnv is being created for a driver; instead, the sources are registered and MetricsSystem is started once the application id has been obtained. This addresses problem 1.

For problem 2, the launcher passes the application id to ExecutorBackends when launching them. This does not cover Mesos, because MesosSchedulerBackend doesn't return an application id, so with Mesos, System.currentTimeMillis is used instead of the application id.
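
To make the two approaches concrete, here is a minimal sketch in plain Scala. It is not code from either PR. `MetricsNaming`, `metricsAppId`, and `DeferredMetrics` are hypothetical names, a `Map[String, String]` stands in for SparkConf, and the `appId.source` naming scheme is an assumption made only for illustration.

```scala
// Hypothetical stand-ins for illustration; none of these names are Spark APIs.
object MetricsNaming {
  // 1st approach: use "spark.yarn.app.id" when it is already set,
  // otherwise fall back to the current time in milliseconds.
  // `conf` stands in for SparkConf; `now` is injectable so the fallback can be tested.
  def metricsAppId(conf: Map[String, String],
                   now: () => Long = () => System.currentTimeMillis()): String =
    conf.getOrElse("spark.yarn.app.id", now().toString)
}

// 2nd approach: collect metrics sources first and register them only
// once the application id is known.
final class DeferredMetrics {
  private val pending = scala.collection.mutable.ArrayBuffer.empty[String]
  private var registered = Vector.empty[String]

  // Remember a source without registering it yet.
  def addSource(name: String): Unit = pending += name

  // Called after the application id has been obtained.
  // The "appId.source" name format is an assumption of this sketch.
  def start(appId: String): Unit = {
    registered = registered ++ pending.map(source => s"$appId.$source")
    pending.clear()
  }

  def registeredNames: Vector[String] = registered
}
```

In this sketch, the driver would call `addSource` while the environment is being built and `start` only after the scheduler has reported an application id, so no metric is named before the id exists.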