CanoeFZH edited a comment on issue #11011: ../../tools/launch.py -n 2 -s 2 --launcher yarn python train_mnist.py --network lenet --kv-store dist_sync
URL: https://github.com/apache/incubator-mxnet/issues/11011#issuecomment-399023623

I get the same error here:

```
LogType:stderr
Log Upload Time:Thu Jun 21 07:11:09 +0000 2018
LogLength:523
Log Contents:
Traceback (most recent call last):
  File "./launcher.py", line 85, in <module>
    main()
  File "./launcher.py", line 80, in main
    ret = subprocess.call(args=sys.argv[1:], env=env)
  File "/usr/lib64/python2.7/subprocess.py", line 168, in call
    return Popen(*popenargs, **kwargs).wait()
  File "/usr/lib64/python2.7/subprocess.py", line 390, in __init__
    errread, errwrite)
  File "/usr/lib64/python2.7/subprocess.py", line 1024, in _execute_child
    raise child_exception
OSError: [Errno 13] Permission denied
End of LogType:stderr

LogType:stdout
Log Upload Time:Thu Jun 21 07:11:09 +0000 2018
LogLength:0
Log Contents:
End of LogType:stdout
```

My submit script:

```
#!/bin/bash
worker=10
server=5
vcore=3
export HADOOP_HOME=/usr/
export HADOOP_HDFS_HOME=/usr/lib/hadoop-hdfs
export hdfs_home=/usr/lib/hadoop-hdfs
export hadoop_hdfs_home=/usr/lib/hadoop-hdfs
export DMLC_JOB_CLUSTER='127.0.0.1'
export DMLC_CPU_VCORES=2
export DMLC_MEMORY_MB=4096
export PS_VERBOSE=2
./tools/launch.py -s $server -n $worker --launcher yarn --sync-dst-dir /tmp/mxnet_job/ python example/gluon//image_classification.py --dataset cifar10 --model vgg11 --epochs 1 --kvstore dist_sync
```

I am sure that this "Permission denied" error occurs on the YARN task nodes.
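
One way to narrow this down: the traceback shows `subprocess.call` failing inside `_execute_child`, which is where the child program is exec'd, so `[Errno 13]` generally means the program named by the first argument could not be executed (for example, the file lacks the execute bit, or the path points at a directory). The following is only a sketch for checking that on a task node. `diagnose_command` is a hypothetical helper, not part of `launcher.py` or the MXNet tools, and it assumes it is given the same argument list that `launcher.py` passes to `subprocess.call`.

```python
import os
import shutil


def diagnose_command(argv):
    """Return a short status string for the program named by argv[0].

    Hypothetical helper: it only inspects the file, it does not run it.
    """
    prog = argv[0]
    if os.sep in prog:
        # A path (relative or absolute) is used exactly as given.
        path = prog
    else:
        # A bare name such as "python" is looked up on PATH.
        # shutil.which only returns files that are executable.
        path = shutil.which(prog)
        if path is None:
            return "not found on PATH"
    if not os.path.exists(path):
        return "missing"
    if os.path.isdir(path):
        return "is a directory"
    if not os.access(path, os.X_OK):
        # Executing such a file fails with EACCES (errno 13).
        return "not executable"
    return "ok"
```

Running it on a task node with the command from the submit script (the list starting with `python`) would show whether the program itself is the problem. It may not cover every cause of `Permission denied`, such as an unreadable working directory.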
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[email protected]

With regards,
Apache Git Services
