[GitHub] solin319 commented on issue #8373: distribute training in fp16

2018-02-23 Thread GitBox
solin319 commented on issue #8373: distribute training in fp16 URL: https://github.com/apache/incubator-mxnet/pull/8373#issuecomment-368188760 @rahul003 For AlexNet, try using fp16 with GPU in kvstore_dist_server. For ResNet, try using dist_sync.

[GitHub] solin319 commented on issue #8373: distribute training in fp16

2018-02-23 Thread GitBox
solin319 commented on issue #8373: distribute training in fp16 URL: https://github.com/apache/incubator-mxnet/pull/8373#issuecomment-368185953 Which data type was used in training? We use fp16 in the training computation. @rahul003

[GitHub] solin319 commented on issue #8373: distribute training in fp16

2017-10-30 Thread GitBox
solin319 commented on issue #8373: distribute training in fp16 URL: https://github.com/apache/incubator-mxnet/pull/8373#issuecomment-340633802 1. In the current approach, I think the class kvstore_dist is just like the class DistServerWrapper mentioned above. We pass the data type to kvstore_dist

[GitHub] solin319 commented on issue #8373: distribute training in fp16

2017-10-30 Thread GitBox
solin319 commented on issue #8373: distribute training in fp16 URL: https://github.com/apache/incubator-mxnet/pull/8373#issuecomment-340627932 Yes, all keys use fp16, because the ps_worker_ used in the program is defined with a template argument. I think it's hard to define two different