[GitHub] solin319 commented on issue #10366: fix bug in sgd

2018-04-23 Thread GitBox
solin319 commented on issue #10366: fix bug in sgd URL: https://github.com/apache/incubator-mxnet/pull/10366#issuecomment-383771962 Batch-size=128. With the device kvstore, the performance is almost the same: both are about 110 samples/sec.

[GitHub] solin319 commented on issue #10366: fix bug in sgd

2018-04-23 Thread GitBox
solin319 commented on issue #10366: fix bug in sgd URL: https://github.com/apache/incubator-mxnet/pull/10366#issuecomment-383478020 @eric-haibin-lin gpus=2*k80 network=vgg16 data=imagenet kv-store=local The performance is 131 samples/sec when we remove the temp resource. If

[GitHub] solin319 commented on issue #10366: fix bug in sgd

2018-04-23 Thread GitBox
solin319 commented on issue #10366: fix bug in sgd URL: https://github.com/apache/incubator-mxnet/pull/10366#issuecomment-383478020 @eric-haibin-lin gpus=2*k80 network=vgg16 data=imagenet The performance is 131 samples/sec when we remove the temp resource. If not the

[GitHub] solin319 commented on issue #10366: fix bug in sgd

2018-04-08 Thread GitBox
solin319 commented on issue #10366: fix bug in sgd URL: https://github.com/apache/incubator-mxnet/pull/10366#issuecomment-379620820 The results above were obtained in multi-GPU training with kv_store='local'. The same problem occurs with kv_store='device' too. When training on multiple machines,

[GitHub] solin319 commented on issue #10366: fix bug in sgd

2018-04-03 Thread GitBox
solin319 commented on issue #10366: fix bug in sgd URL: https://github.com/apache/incubator-mxnet/pull/10366#issuecomment-378185953 Set MXNET_CPU_TEMP_COPY = 100. When training resnet-50, sgd_mom_update still can't start directly after the first backward computation.
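
For readers who want to reproduce the setting mentioned in this comment, the following is a minimal sketch of one way to set the variable from Python. The helper name `set_temp_copy_env` is hypothetical and not part of MXNet; the value 100 comes from the comment, and no GPU value is given in the thread, so that argument is left optional. It assumes the variables should be in the environment before `mxnet` is imported, which is the safe ordering.

```python
import os


def set_temp_copy_env(cpu_copies, gpu_copies=None):
    """Hypothetical helper: export MXNet temp-space copy settings.

    Call this before `import mxnet` so the values are already in the
    process environment when the library initializes.
    """
    os.environ["MXNET_CPU_TEMP_COPY"] = str(cpu_copies)
    if gpu_copies is not None:
        # No GPU value appears in the thread; the caller must choose one.
        os.environ["MXNET_GPU_TEMP_COPY"] = str(gpu_copies)
    names = ("MXNET_CPU_TEMP_COPY", "MXNET_GPU_TEMP_COPY")
    # Return whichever of the two variables are currently set.
    return {name: os.environ[name] for name in names if name in os.environ}


# Value taken from the comment above.
settings = set_temp_copy_env(100)
print(settings)
```

The same effect can be had by exporting the variable in the shell before launching the training script.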

[GitHub] solin319 commented on issue #10366: fix bug in sgd

2018-04-02 Thread GitBox
solin319 commented on issue #10366: fix bug in sgd URL: https://github.com/apache/incubator-mxnet/pull/10366#issuecomment-378098403 @eric-haibin-lin MXNET_EXEC_NUM_TEMP doesn't work. But making MXNET_CPU_TEMP_COPY and MXNET_GPU_TEMP_COPY larger can solve the overlap problem. It's