Hi, I'm running Spark(1.6.1) on YARN(2.5.1), cluster mode. It's taking 20+ seconds for application to move from ACCEPTED to RUNNING state, here's logs 16/04/21 09:06:56 INFO impl.YarnClientImpl: Submitted application application_1461229289298_0001 16/04/21 09:06:57 INFO yarn.Client: Application report for application_1461229289298_0001 (state: ACCEPTED) 16/04/21 09:06:57 INFO yarn.Client: client token: N/A diagnostics: N/A ApplicationMaster host: N/A ApplicationMaster RPC port: -1 queue: default start time: 1461229616315 final status: UNDEFINED tracking URL: http://test-spark-m2:8088/proxy/application_1461229289298_0001/ user: hadoop 16/04/21 09:06:58 INFO yarn.Client: Application report for application_1461229289298_0001 (state: ACCEPTED) ... 16/04/21 09:07:31 INFO yarn.Client: Application report for application_1461229289298_0001 (state: ACCEPTED) 16/04/21 09:07:32 INFO yarn.Client: Application report for application_1461229289298_0001 (state: RUNNING) 16/04/21 09:07:32 INFO yarn.Client: client token: N/A diagnostics: N/A ApplicationMaster host: 10.0.8.51 ApplicationMaster RPC port: 0 queue: default start time: 1461229616315 final status: UNDEFINED tracking URL: http://test-spark-m2:8088/proxy/application_1461229289298_0001/ user: hadoop 16/04/21 09:07:33 INFO yarn.Client: Application report for application_1461229289298_0001 (state: RUNNING)
And here is the ApplicationMaster logs 16/04/21 09:07:19 INFO yarn.ApplicationMaster: Registered signal handlers for [TERM, HUP, INT] 16/04/21 09:07:22 INFO yarn.ApplicationMaster: ApplicationAttemptId: appattempt_1461229289298_0001_000001 16/04/21 09:07:24 INFO spark.SecurityManager: Changing view acls to: hadoop 16/04/21 09:07:24 INFO spark.SecurityManager: Changing modify acls to: hadoop 16/04/21 09:07:24 INFO spark.SecurityManager: SecurityManager: authentication disabled; ui acls disabled; users with view permissions: Set(hadoop); users with modify permissions: Set(hadoop) 16/04/21 09:07:25 INFO yarn.ApplicationMaster: Starting the user application in a separate Thread 16/04/21 09:07:25 INFO yarn.ApplicationMaster: Waiting for spark context initialization 16/04/21 09:07:25 INFO yarn.ApplicationMaster: Waiting for spark context initialization ... 16/04/21 09:07:26 INFO spark.SparkContext: Running Spark version 1.6.1 16/04/21 09:07:26 INFO spark.SecurityManager: Changing view acls to: hadoop 16/04/21 09:07:26 INFO spark.SecurityManager: Changing modify acls to: hadoop 16/04/21 09:07:26 INFO spark.SecurityManager: SecurityManager: authentication disabled; ui acls disabled; users with view permissions: Set(hadoop); users with modify permissions: Set(hadoop) 16/04/21 09:07:27 INFO util.Utils: Successfully started service 'sparkDriver' on port 57808. 16/04/21 09:07:28 INFO slf4j.Slf4jLogger: Slf4jLogger started 16/04/21 09:07:28 INFO Remoting: Starting remoting 16/04/21 09:07:28 INFO Remoting: Remoting started; listening on addresses :[akka.tcp://sparkDriverActorSystem@10.0.8.51:41651] 16/04/21 09:07:28 INFO util.Utils: Successfully started service 'sparkDriverActorSystem' on port 41651. 16/04/21 09:07:29 INFO spark.SparkEnv: Registering MapOutputTracker 16/04/21 09:07:29 INFO spark.SparkEnv: Registering BlockManagerMaster 16/04/21 09:07:29 INFO storage.DiskBlockManager: Created local directory at /tmp/hadoop/nm-local-dir/usercache/hadoop/appcache/application_1461229289298_0001/blockmgr-84f8f2d5-e749-46d8-b605-2f615193421c 16/04/21 09:07:29 INFO storage.MemoryStore: MemoryStore started with capacity 1140.4 MB 16/04/21 09:07:29 INFO spark.SparkEnv: Registering OutputCommitCoordinator 16/04/21 09:07:29 INFO ui.JettyUtils: Adding filter: org.apache.hadoop.yarn.server.webproxy.amfilter.AmIpFilter 16/04/21 09:07:29 INFO server.Server: jetty-8.y.z-SNAPSHOT 16/04/21 09:07:30 INFO server.AbstractConnector: Started SelectChannelConnector@0.0.0.0:56199 16/04/21 09:07:30 INFO util.Utils: Successfully started service 'SparkUI' on port 56199. 16/04/21 09:07:30 INFO ui.SparkUI: Started SparkUI at http://10.0.8.51:56199 16/04/21 09:07:30 INFO cluster.YarnClusterScheduler: Created YarnClusterScheduler 16/04/21 09:07:30 INFO util.Utils: Successfully started service 'org.apache.spark.network.netty.NettyBlockTransferService' on port 51641. 16/04/21 09:07:30 INFO netty.NettyBlockTransferService: Server created on 51641 16/04/21 09:07:30 INFO storage.BlockManagerMaster: Trying to register BlockManager 16/04/21 09:07:30 INFO storage.BlockManagerMasterEndpoint: Registering block manager 10.0.8.51:51641 with 1140.4 MB RAM, BlockManagerId(driver, 10.0.8.51, 51641) 16/04/21 09:07:30 INFO storage.BlockManagerMaster: Registered BlockManager 16/04/21 09:07:31 INFO scheduler.EventLoggingListener: Logging events to hdfs:///spark-events/application_1461229289298_0001_1 16/04/21 09:07:31 INFO cluster.YarnSchedulerBackend$YarnSchedulerEndpoint: ApplicationMaster registered as NettyRpcEndpointRef(spark://YarnAM@10.0.8.51:57808) 16/04/21 09:07:32 INFO yarn.YarnRMClient: Registering the ApplicationMaster 16/04/21 09:07:32 INFO client.ConfiguredRMFailoverProxyProvider: Failing over to rm2 16/04/21 09:07:32 INFO yarn.YarnAllocator: Will request 2 executor containers, each with 2 cores and 2432 MB memory including 384 MB overhead 16/04/21 09:07:32 INFO yarn.YarnAllocator: Container request (host: Any, capability: <memory:2432, vCores:2>) 16/04/21 09:07:32 INFO yarn.YarnAllocator: Container request (host: Any, capability: <memory:2432, vCores:2>) 16/04/21 09:07:32 INFO yarn.ApplicationMaster: Started progress reporter thread with (heartbeat : 3000, initial allocation : 200) intervals 16/04/21 09:07:32 INFO impl.AMRMClientImpl: Received new token for : test-spark-w3:43905 16/04/21 09:07:32 INFO yarn.YarnAllocator: Launching container container_1461229289298_0001_01_000002 for on host test-spark-w3 16/04/21 09:07:32 INFO yarn.YarnAllocator: Launching ExecutorRunnable. driverUrl: spark://CoarseGrainedScheduler@10.0.8.51:57808, executorHostname: test-spark-w3 16/04/21 09:07:32 INFO yarn.YarnAllocator: Received 1 containers from YARN, launching executors on 1 of them. 16/04/21 09:07:32 INFO yarn.ExecutorRunnable: Starting Executor Container 16/04/21 09:07:32 INFO impl.ContainerManagementProtocolProxy: yarn.client.max-nodemanagers-proxies : 500 16/04/21 09:07:32 INFO yarn.ExecutorRunnable: Setting up ContainerLaunchContext 16/04/21 09:07:32 INFO yarn.ExecutorRunnable: Preparing Local resources YARN ResourceManager logs 2016-04-21 09:06:56,315 INFO org.apache.hadoop.yarn.server.resourcemanager.ClientRMService: Application with id 1 submitted by user hadoop 2016-04-21 09:06:56,316 INFO org.apache.hadoop.yarn.server.resourcemanager.rmapp.RMAppImpl: Storing application with id application_1461229289298_0001 2016-04-21 09:06:56,316 INFO org.apache.hadoop.yarn.server.resourcemanager.RMAuditLogger: USER=hadoop IP=10.0.8.49 OPERATION=Submit Application Request TARGET=ClientRMService RESULT=SUCCESS APPID=application_1461229289298_0001 2016-04-21 09:06:56,317 INFO org.apache.hadoop.yarn.server.resourcemanager.rmapp.RMAppImpl: application_1461229289298_0001 State change from NEW to NEW_SAVING 2016-04-21 09:06:56,321 INFO org.apache.hadoop.yarn.server.resourcemanager.recovery.RMStateStore: Storing info for app: application_1461229289298_0001 2016-04-21 09:06:56,356 INFO org.apache.hadoop.yarn.server.resourcemanager.rmapp.RMAppImpl: application_1461229289298_0001 State change from NEW_SAVING to SUBMITTED 2016-04-21 09:06:56,364 INFO org.apache.hadoop.yarn.server.resourcemanager.rmapp.RMAppImpl: application_1461229289298_0001 State change from SUBMITTED to ACCEPTED 2016-04-21 09:06:56,365 INFO org.apache.hadoop.yarn.server.resourcemanager.ApplicationMasterService: Registering app attempt : appattempt_1461229289298_0001_000001 2016-04-21 09:06:56,375 INFO org.apache.hadoop.yarn.server.resourcemanager.rmapp.attempt.RMAppAttemptImpl: appattempt_1461229289298_0001_000001 State change from NEW to SUBMITTED 2016-04-21 09:06:56,409 INFO org.apache.hadoop.yarn.server.resourcemanager.rmapp.attempt.RMAppAttemptImpl: appattempt_1461229289298_0001_000001 State change from SUBMITTED to SCHEDULED 2016-04-21 09:06:56,530 INFO org.apache.hadoop.yarn.server.resourcemanager.rmcontainer.RMContainerImpl: container_1461229289298_0001_01_000001 Container Transitioned from NEW to ALLOCATED 2016-04-21 09:06:56,541 INFO org.apache.hadoop.yarn.server.resourcemanager.rmcontainer.RMContainerImpl: container_1461229289298_0001_01_000001 Container Transitioned from ALLOCATED to ACQUIRED 2016-04-21 09:06:56,548 INFO org.apache.hadoop.yarn.server.resourcemanager.rmapp.attempt.RMAppAttemptImpl: appattempt_1461229289298_0001_000001 State change from SCHEDULED to ALLOCATED_SAVING 2016-04-21 09:06:56,558 INFO org.apache.hadoop.yarn.server.resourcemanager.rmapp.attempt.RMAppAttemptImpl: appattempt_1461229289298_0001_000001 State change from ALLOCATED_SAVING to ALLOCATED 2016-04-21 09:06:57,351 INFO org.apache.hadoop.yarn.server.resourcemanager.rmapp.attempt.RMAppAttemptImpl: appattempt_1461229289298_0001_000001 State change from ALLOCATED to LAUNCHED 2016-04-21 09:06:57,530 INFO org.apache.hadoop.yarn.server.resourcemanager.rmcontainer.RMContainerImpl: container_1461229289298_0001_01_000001 Container Transitioned from ACQUIRED to RUNNING From the logs: 16/04/21 09:06:56,364 INFO org.apache.hadoop.yarn.server.resourcemanager.rmapp.RMAppImpl: application_1461229289298_0001 State change from SUBMITTED to ACCEPTED 16/04/21 09:06:57 INFO yarn.Client: Application report for application_1461229289298_0001 (state: ACCEPTED) 16/04/21 09:06:57,351 INFO org.apache.hadoop.yarn.server.resourcemanager.rmapp.attempt.RMAppAttemptImpl: appattempt_1461229289298_0001_000001 State change from ALLOCATED to LAUNCHED 16/04/21 09:06:57,530 INFO org.apache.hadoop.yarn.server.resourcemanager.rmcontainer.RMContainerImpl: container_1461229289298_0001_01_000001 Container Transitioned from ACQUIRED to RUNNING 16/04/21 09:07:19 INFO yarn.ApplicationMaster: Registered signal handlers for [TERM, HUP, INT] 16/04/21 09:07:26 INFO spark.SparkContext: Running Spark version 1.6.1 16/04/21 09:07:32 INFO yarn.YarnRMClient: Registering the ApplicationMaster 16/04/21 09:07:32 INFO yarn.Client: Application report for application_1461229289298_0001 (state: RUNNING) According to above logs, container_1461229289298_0001_01_000001 for ApplicationMaster moved to RUNNING state at 09:06:57,530 , but container logs only appeared at 09:07:19 with yarn.ApplicationMaster: Registered signal handlers for [TERM, HUP, INT] message. So what is this delay, why it's not starting immediately after 09:06:57,530, when it was reported running in ResourceManager logs. SparkContext created only at 09:07:26, so it took almost 30 seconds, is it expected time to create SparkContext? The application above is the only one in the Spark cluster, and there is enough resources. Any ideas why it's happening? Thank you.