liuzx32 opened a new issue #10984: Some wrong with training on yarn!?
URL: https://github.com/apache/incubator-mxnet/issues/10984
 
 
   Running with the local cluster succeeds:
   ../../tools/launch.py -n 2 -s 2 --cluster=local python train_mnist.py --network lenet --kv-store dist_sync
   So I switched to --cluster=yarn to run MXNet on YARN:
   ../../tools/launch.py -n 2 -s 2 --cluster=yarn python train_mnist.py --network lenet --kv-store dist_sync
   But the job failed with exit code 1:
   18/05/16 12:55:16 INFO dmlc.ApplicationMaster: onContainerStarted Invoked
   18/05/16 12:55:16 INFO dmlc.ApplicationMaster: onContainerStarted Invoked
   18/05/16 12:55:17 INFO dmlc.ApplicationMaster: [DMLC] Task 0 exited with status 1 Diagnostics:Exception from container-launch.
   Container id: container_1498108406715_361607_01_000002
   Exit code: 1
   Stack trace: ExitCodeException exitCode=1: 
        at org.apache.hadoop.util.Shell.runCommand(Shell.java:582)
        at org.apache.hadoop.util.Shell.run(Shell.java:479)
        at org.apache.hadoop.util.Shell$ShellCommandExecutor.execute(Shell.java:773)
        at org.apache.hadoop.yarn.server.nodemanager.DefaultContainerExecutor.launchContainer(DefaultContainerExecutor.java:212)
        at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:302)
        at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:82)
        at java.util.concurrent.FutureTask.run(FutureTask.java:266)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
        at java.lang.Thread.run(Thread.java:745)
   
   
   Container exited with a non-zero exit code 1
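
The trace above only records that the NodeManager's launcher saw the container exit with code 1; the underlying error should be in the container's own stdout/stderr. A minimal sketch for locating those logs, assuming log aggregation is enabled on the cluster and the `yarn` CLI is on the PATH: the helper below (`application_id_from_container` is a hypothetical name, not part of MXNet or dmlc-core) derives the application ID from the container ID in the log and prints the `yarn logs` command to run.

```python
import re


def application_id_from_container(container_id: str) -> str:
    """Derive the YARN application ID from a container ID.

    Container IDs look like container_<clusterTimestamp>_<appSeq>_<attempt>_<n>,
    optionally with an epoch field (container_e<epoch>_...) on newer clusters.
    """
    m = re.fullmatch(r"container_(?:e\d+_)?(\d+)_(\d+)_\d+_\d+", container_id)
    if m is None:
        raise ValueError(f"unrecognized container ID: {container_id!r}")
    cluster_ts, app_seq = m.groups()
    # Keep the fields as strings so zero padding is preserved.
    return f"application_{cluster_ts}_{app_seq}"


if __name__ == "__main__":
    # Container ID copied from the log above.
    container_id = "container_1498108406715_361607_01_000002"
    app_id = application_id_from_container(container_id)
    # Fetches aggregated stdout/stderr for every container of the application.
    print(f"yarn logs -applicationId {app_id}")
```

If log aggregation is not enabled, the same stdout/stderr files would have to be read from the NodeManager's local log directory on the host that ran the container.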

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services
