Jeffery4000 opened a new issue #10070: Distributed Training (Permission denied)
URL: https://github.com/apache/incubator-mxnet/issues/10070
 
 
   Hi, I am trying to run the provided distributed training example on 2 
different nodes with the following internal IP addresses: 
   
   ```
   user1@111.111.111.121
   user2@111.111.111.122
   ```
   
   I created a hosts file with these IP addresses, and ssh works without any 
issue from one machine to the other. When I launch the code:
   
   ```
   python ../../tools/launch.py -n 2 --launcher ssh -H hosts python train_mnist.py --network lenet --kv-store dist_device_sync
   ```
   
   It prints the following password prompts, all at the same time:
   ```
   user1@111.111.111.121's password: user1@111.111.111.121's password: 
user2@111.111.111.122's password: user2@111.111.111.122's password:
   ```
   
   I'm using the same admin password on both machines, but no matter how many 
times I try, it just prints "Permission denied, please try again."
   
   Is there any way I can get debug messages about what is really happening in 
the background?
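
   One way to get more detail, as a minimal sketch: probe each entry of the 
hosts file with the `ssh` client in verbose, non-interactive mode. This assumes 
the prompts above come from `ssh` being invoked without a terminal to answer 
them, which is not confirmed here. The helper names (`read_hosts`, 
`ssh_probe_command`, `probe`) are hypothetical and not part of MXNet; `-v` and 
`-o BatchMode=yes` are standard OpenSSH client options.

```python
import subprocess


def read_hosts(path):
    """Return the non-empty, stripped lines of a hosts file."""
    with open(path) as f:
        return [line.strip() for line in f if line.strip()]


def ssh_probe_command(host):
    """Build an ssh command that fails instead of asking for a password."""
    return [
        "ssh",
        "-v",                      # verbose: print authentication debug output
        "-o", "BatchMode=yes",     # never prompt for a password or passphrase
        "-o", "ConnectTimeout=5",  # give up quickly on unreachable hosts
        host,
        "true",                    # trivial remote command
    ]


def probe(host):
    """Run the probe and return (ok, debug_text); ssh writes -v output to stderr."""
    result = subprocess.run(
        ssh_probe_command(host),
        capture_output=True,
        text=True,
    )
    return result.returncode == 0, result.stderr
```

   Calling `probe(host)` for each entry returned by `read_hosts("hosts")` shows 
which hosts accept a login without a password prompt; for the ones that do not, 
the returned debug text lists the authentication methods that were tried.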
   
   
   
