eric-haibin-lin commented on issue #10366: fix bug in sgd
URL: https://github.com/apache/incubator-mxnet/pull/10366#issuecomment-383655298
Thanks! Just curious - why were you not using device kvstore? Is it taking
too much memory?
Also, what's the batch size you're using?
eric-haibin-lin commented on issue #10366: fix bug in sgd
URL: https://github.com/apache/incubator-mxnet/pull/10366#issuecomment-383462666
@solin319 how much of a performance difference did you see when you removed
the temp resource?
eric-haibin-lin commented on issue #10366: fix bug in sgd
URL: https://github.com/apache/incubator-mxnet/pull/10366#issuecomment-380550452
Tracked in #10509
This is an automated message from the Apache Git Service.
To
eric-haibin-lin commented on issue #10366: fix bug in sgd
URL: https://github.com/apache/incubator-mxnet/pull/10366#issuecomment-380550452
Tracked in #10366
eric-haibin-lin commented on issue #10366: fix bug in sgd
URL: https://github.com/apache/incubator-mxnet/pull/10366#issuecomment-379610436
Are you training with multi-machine and multi-GPU? What type of kvstore are
you using?
I guess one workaround is to update the `FResourceRequest`
eric-haibin-lin commented on issue #10366: fix bug in sgd
URL: https://github.com/apache/incubator-mxnet/pull/10366#issuecomment-378062143
Would setting MXNET_EXEC_NUM_TEMP help? @solin319
https://github.com/apache/incubator-mxnet/blob/master/docs/faq/env_var.md#memory-options
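For reference, a minimal sketch of how the variable could be set from a Python training script. The helper name `configure_temp_workspaces` and the value `1` are placeholders for illustration, not a recommendation from this thread, and setting the variable before importing `mxnet` is an assumption made to be safe.

```python
import os


def configure_temp_workspaces(num_temp=1):
    """Set MXNET_EXEC_NUM_TEMP for the current process.

    `num_temp` is a placeholder value; pick one based on your own
    memory/parallelism measurements. Returns the string stored in
    the environment.
    """
    os.environ["MXNET_EXEC_NUM_TEMP"] = str(int(num_temp))
    return os.environ["MXNET_EXEC_NUM_TEMP"]


# Set the variable first, then import mxnet in the same process
# (the import is left commented out so the sketch runs without mxnet installed).
value = configure_temp_workspaces(1)
print("MXNET_EXEC_NUM_TEMP =", value)
# import mxnet as mx
```

The same effect can be had by exporting the variable in the shell before launching the training job.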