sandeep-krishnamurthy commented on issue #11011: ../../tools/launch.py -n 2 -s 
2 --launcher yarn  python train_mnist.py --network lenet --kv-store dist_sync
URL: 
https://github.com/apache/incubator-mxnet/issues/11011#issuecomment-391759054
 
 
   Hello @liuzx32 - Here is an example of running distributed training with Yarn 
- https://dzone.com/articles/running-mxnet-on-hadoop-yarn
   
   If you are open to SSH-based distributed training, here is a very good 
example using an AWS CloudFormation template - 
https://github.com/awslabs/deeplearning-cfn#running-distributed-training-on-mxnet
   
   Also, it would be great if you could contribute a tutorial for using MXNet 
with Yarn.

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
[email protected]


With regards,
Apache Git Services
