Hi all,

I submitted a spark_program.jar to a `spark on yarn cluster` from a driver machine in yarn-client mode. Here is the spark-submit command I used:

./spark-submit --master yarn-client --class com.charlie.spark.grax.OldFollowersExample --queue dt_spark ~/script/spark-flume-test-0.1-SNAPSHOT-hadoop2.0.0-mr1-cdh4.2.1.jar

The queue `dt_spark` was free, and the program was submitted successfully and ran on the cluster. But the console repeatedly showed:
14/11/18 15:11:48 WARN YarnClientClusterScheduler: Initial job has not accepted 
any resources; check your cluster UI to ensure that workers are registered and 
have sufficient memory
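In case the default executor resources are part of the problem, this is a variant of the submit command I could try, with the resources spelled out explicitly (a sketch only; the flag names are standard spark-submit options for YARN, but the values below are placeholders, not what my cluster necessarily needs):

```shell
# Same submit, with executor count/memory/cores made explicit.
# The values (2 executors, 1g each, 1 core) are placeholders.
./spark-submit \
  --master yarn-client \
  --class com.charlie.spark.grax.OldFollowersExample \
  --queue dt_spark \
  --num-executors 2 \
  --executor-memory 1g \
  --executor-cores 1 \
  ~/script/spark-flume-test-0.1-SNAPSHOT-hadoop2.0.0-mr1-cdh4.2.1.jar
```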
I checked the logs in the cluster UI and found no obvious errors:
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in 
[jar:file:/home/disk5/yarn/usercache/linqili/filecache/6957209742046754908/spark-assembly-1.0.2-hadoop2.0.0-cdh4.2.1.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in 
[jar:file:/home/hadoop/hadoop-2.0.0-cdh4.2.1/share/hadoop/common/lib/slf4j-log4j12-1.6.1.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.slf4j.impl.Log4jLoggerFactory]
14/11/18 14:28:16 INFO SecurityManager: Changing view acls to: hadoop,linqili
14/11/18 14:28:16 INFO SecurityManager: SecurityManager: authentication 
disabled; ui acls disabled; users with view permissions: Set(hadoop, linqili)
14/11/18 14:28:17 INFO Slf4jLogger: Slf4jLogger started
14/11/18 14:28:17 INFO Remoting: Starting remoting
14/11/18 14:28:17 INFO Remoting: Remoting started; listening on addresses 
:[akka.tcp://sparkyar...@longzhou-hdp3.lz.dscc:37187]
14/11/18 14:28:17 INFO Remoting: Remoting now listens on addresses: 
[akka.tcp://sparkyar...@longzhou-hdp3.lz.dscc:37187]
14/11/18 14:28:17 INFO ExecutorLauncher: ApplicationAttemptId: 
appattempt_1415961020140_0325_000001
14/11/18 14:28:17 INFO ExecutorLauncher: Connecting to ResourceManager at 
longzhou-hdpnn.lz.dscc/192.168.19.107:12032
14/11/18 14:28:17 INFO ExecutorLauncher: Registering the ApplicationMaster
14/11/18 14:28:18 INFO ExecutorLauncher: Waiting for spark driver to be 
reachable.
14/11/18 14:28:18 INFO ExecutorLauncher: Master now available: 
192.168.59.90:36691
14/11/18 14:28:18 INFO ExecutorLauncher: Listen to driver: 
akka.tcp://spark@192.168.59.90:36691/user/CoarseGrainedScheduler
14/11/18 14:28:18 INFO ExecutorLauncher: Allocating 1 executors.
14/11/18 14:28:18 INFO YarnAllocationHandler: Allocating 1 executor containers 
with 1408 of memory each.
14/11/18 14:28:18 INFO YarnAllocationHandler: ResourceRequest (host : *, num 
containers: 1, priority = 1 , capability : memory: 1408)
14/11/18 14:28:18 INFO YarnAllocationHandler: Allocating 1 executor containers 
with 1408 of memory each.
14/11/18 14:28:18 INFO YarnAllocationHandler: ResourceRequest (host : *, num 
containers: 1, priority = 1 , capability : memory: 1408)
14/11/18 14:28:18 INFO RackResolver: Resolved longzhou-hdp3.lz.dscc to /rack1
14/11/18 14:28:18 INFO YarnAllocationHandler: launching container on 
container_1415961020140_0325_01_000002 host longzhou-hdp3.lz.dscc
14/11/18 14:28:18 INFO ExecutorRunnable: Starting Executor Container
14/11/18 14:28:18 INFO ExecutorRunnable: Connecting to ContainerManager at 
longzhou-hdp3.lz.dscc:12040
14/11/18 14:28:18 INFO ExecutorRunnable: Setting up ContainerLaunchContext
14/11/18 14:28:18 INFO ExecutorRunnable: Preparing Local resources
14/11/18 14:28:18 INFO ExecutorLauncher: All executors have launched.
14/11/18 14:28:18 INFO ExecutorLauncher: Started progress reporter thread - 
sleep time : 5000
14/11/18 14:28:18 INFO YarnAllocationHandler: ResourceRequest (host : *, num 
containers: 0, priority = 1 , capability : memory: 1408)
14/11/18 14:28:18 INFO ExecutorRunnable: Prepared Local resources 
Map(__spark__.jar -> resource {, scheme: "hdfs", host: 
"longzhou-hdpnn.lz.dscc", port: 11000, file: 
"/user/linqili/.sparkStaging/application_1415961020140_0325/spark-assembly-1.0.2-hadoop2.0.0-cdh4.2.1.jar",
 }, size: 134859131, timestamp: 1416292093988, type: FILE, visibility: PRIVATE, 
)
14/11/18 14:28:18 INFO ExecutorRunnable: Setting up executor with commands: 
List($JAVA_HOME/bin/java, -server, -XX:OnOutOfMemoryError='kill %p', -Xms1024m 
-Xmx1024m , 
-Djava.security.krb5.conf=/home/linqili/proc/spark_client/hadoop/kerberos5-client/etc/krb5.conf
 
-Djava.library.path=/home/linqili/proc/spark_client/hadoop/lib/native/Linux-amd64-64,
 -Djava.io.tmpdir=$PWD/tmp,  
-Dlog4j.configuration=log4j-spark-container.properties, 
org.apache.spark.executor.CoarseGrainedExecutorBackend, 
akka.tcp://spark@192.168.59.90:36691/user/CoarseGrainedScheduler, 1, 
longzhou-hdp3.lz.dscc, 3, 1>, <LOG_DIR>/stdout, 2>, <LOG_DIR>/stderr)
14/11/18 14:28:23 INFO YarnAllocationHandler: ResourceRequest (host : *, num 
containers: 0, priority = 1 , capability : memory: 1408)
14/11/18 14:28:23 INFO YarnAllocationHandler: Completed container 
container_1415961020140_0325_01_000002 (state: COMPLETE, exit status: 1)
14/11/18 14:28:23 INFO YarnAllocationHandler: Container marked as failed: 
container_1415961020140_0325_01_000002
14/11/18 14:28:28 INFO ExecutorLauncher: Allocating 1 containers to make up for 
(potentially ?) lost containers
14/11/18 14:28:28 INFO YarnAllocationHandler: Allocating 1 executor containers 
with 1408 of memory each.
14/11/18 14:28:28 INFO YarnAllocationHandler: ResourceRequest (host : *, num 
containers: 1, priority = 1 , capability : memory: 1408)
14/11/18 14:28:33 INFO ExecutorLauncher: Allocating 1 containers to make up for 
(potentially ?) lost containers
14/11/18 14:28:33 INFO YarnAllocationHandler: Allocating 1 executor containers 
with 1408 of memory each.
14/11/18 14:28:33 INFO YarnAllocationHandler: ResourceRequest (host : *, num 
containers: 1, priority = 1 , capability : memory: 1408)
14/11/18 14:28:33 INFO RackResolver: Resolved longzhou-hdp2.lz.dscc to /rack1
14/11/18 14:28:33 INFO YarnAllocationHandler: launching container on 
container_1415961020140_0325_01_000003 host longzhou-hdp2.lz.dscc
14/11/18 14:28:33 INFO ExecutorRunnable: Starting Executor Container
14/11/18 14:28:33 INFO ExecutorRunnable: Connecting to ContainerManager at 
longzhou-hdp2.lz.dscc:12040
14/11/18 14:28:33 INFO ExecutorRunnable: Setting up ContainerLaunchContext
14/11/18 14:28:33 INFO ExecutorRunnable: Preparing Local resources
14/11/18 14:28:33 INFO ExecutorRunnable: Prepared Local resources 
Map(__spark__.jar -> resource {, scheme: "hdfs", host: 
"longzhou-hdpnn.lz.dscc", port: 11000, file: 
"/user/linqili/.sparkStaging/application_1415961020140_0325/spark-assembly-1.0.2-hadoop2.0.0-cdh4.2.1.jar",
 }, size: 134859131, timestamp: 1416292093988, type: FILE, visibility: PRIVATE, 
)
14/11/18 14:28:33 INFO ExecutorRunnable: Setting up executor with commands: 
List($JAVA_HOME/bin/java, -server, -XX:OnOutOfMemoryError='kill %p', -Xms1024m 
-Xmx1024m , 
-Djava.security.krb5.conf=/home/linqili/proc/spark_client/hadoop/kerberos5-client/etc/krb5.conf
 
-Djava.library.path=/home/linqili/proc/spark_client/hadoop/lib/native/Linux-amd64-64,
 -Djava.io.tmpdir=$PWD/tmp,  
-Dlog4j.configuration=log4j-spark-container.properties, 
org.apache.spark.executor.CoarseGrainedExecutorBackend, 
akka.tcp://spark@192.168.59.90:36691/user/CoarseGrainedScheduler, 2, 
longzhou-hdp2.lz.dscc, 3, 1>, <LOG_DIR>/stdout, 2>, <LOG_DIR>/stderr)
14/11/18 14:28:38 INFO YarnAllocationHandler: ResourceRequest (host : *, num 
containers: 0, priority = 1 , capability : memory: 1408)
14/11/18 14:28:38 INFO YarnAllocationHandler: Ignoring container 
container_1415961020140_0325_01_000004 at host longzhou-hdp2.lz.dscc, since we 
already have the required number of containers for it.
14/11/18 14:28:38 INFO YarnAllocationHandler: Completed container 
container_1415961020140_0325_01_000003 (state: COMPLETE, exit status: 1)
14/11/18 14:28:38 INFO YarnAllocationHandler: Container marked as failed: 
container_1415961020140_0325_01_000003
14/11/18 14:28:43 INFO ExecutorLauncher: Allocating 1 containers to make up for 
(potentially ?) lost containers
14/11/18 14:28:43 INFO YarnAllocationHandler: Releasing 1 containers. 
pendingReleaseContainers : {container_1415961020140_0325_01_000004=true}
14/11/18 14:28:43 INFO YarnAllocationHandler: Allocating 1 executor containers 
with 1408 of memory each.
14/11/18 14:28:43 INFO YarnAllocationHandler: ResourceRequest (host : *, num 
containers: 1, priority = 1 , capability : memory: 1408)
14/11/18 14:28:48 INFO ExecutorLauncher: Allocating 1 containers to make up for 
(potentially ?) lost containers
14/11/18 14:28:48 INFO YarnAllocationHandler: Allocating 1 executor containers 
with 1408 of memory each.
14/11/18 14:28:48 INFO YarnAllocationHandler: ResourceRequest (host : *, num 
containers: 1, priority = 1 , capability : memory: 1408)
14/11/18 14:28:48 INFO YarnAllocationHandler: launching container on 
container_1415961020140_0325_01_000005 host longzhou-hdp2.lz.dscc
14/11/18 14:28:48 INFO ExecutorRunnable: Starting Executor Container
14/11/18 14:28:48 INFO ExecutorRunnable: Connecting to ContainerManager at 
longzhou-hdp2.lz.dscc:12040
14/11/18 14:28:48 INFO ExecutorRunnable: Setting up ContainerLaunchContext
14/11/18 14:28:48 INFO ExecutorRunnable: Preparing Local resources
14/11/18 14:28:48 INFO ExecutorRunnable: Prepared Local resources 
Map(__spark__.jar -> resource {, scheme: "hdfs", host: 
"longzhou-hdpnn.lz.dscc", port: 11000, file: 
"/user/linqili/.sparkStaging/application_1415961020140_0325/spark-assembly-1.0.2-hadoop2.0.0-cdh4.2.1.jar",
 }, size: 134859131, timestamp: 1416292093988, type: FILE, visibility: PRIVATE, 
)
14/11/18 14:28:48 INFO ExecutorRunnable: Setting up executor with commands: 
List($JAVA_HOME/bin/java, -server, -XX:OnOutOfMemoryError='kill %p', -Xms1024m 
-Xmx1024m , 
-Djava.security.krb5.conf=/home/linqili/proc/spark_client/hadoop/kerberos5-client/etc/krb5.conf
 
-Djava.library.path=/home/linqili/proc/spark_client/hadoop/lib/native/Linux-amd64-64,
 -Djava.io.tmpdir=$PWD/tmp,  
-Dlog4j.configuration=log4j-spark-container.properties, 
org.apache.spark.executor.CoarseGrainedExecutorBackend, 
akka.tcp://spark@192.168.59.90:36691/user/CoarseGrainedScheduler, 3, 
longzhou-hdp2.lz.dscc, 3, 1>, <LOG_DIR>/stdout, 2>, <LOG_DIR>/stderr)
14/11/18 14:28:53 INFO YarnAllocationHandler: ResourceRequest (host : *, num 
containers: 0, priority = 1 , capability : memory: 1408)
14/11/18 14:28:53 INFO YarnAllocationHandler: Ignoring container 
container_1415961020140_0325_01_000006 at host longzhou-hdp2.lz.dscc, since we 
already have the required number of containers for it.

Is there any hint? Thanks.
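For what it's worth, since the containers above complete with exit status 1, the executor JVMs seem to start and then die, so the real exception should be in the containers' stderr rather than in the scheduler log. The commands I know of for pulling those logs (assuming log aggregation is enabled on the cluster) are:

```shell
# Fetch the aggregated logs for this application (requires
# yarn.log-aggregation-enable=true; otherwise the container logs stay
# on the NodeManager hosts under yarn.nodemanager.log-dirs).
yarn logs -applicationId application_1415961020140_0325

# The failed containers exited with status 1, so their <LOG_DIR>/stderr
# should show why CoarseGrainedExecutorBackend died right after launch.
```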
