ThomasDelteil edited a comment on issue #14357: [Bug] Batchnorm running_var behaves differently when using gpu vs. cpu
URL: https://github.com/apache/incubator-mxnet/issues/14357#issuecomment-497487102

```python
import mxnet as mx
from mxnet import autograd, gluon

for ctx in [mx.cpu(), mx.gpu()]:
    layer = gluon.nn.BatchNorm()
    layer.initialize(ctx=ctx)
    for i in range(100):
        data = mx.nd.random.normal(loc=10, scale=2, shape=(1, 3, 224, 224), ctx=ctx)
        with autograd.record():
            out = layer(data)
    print(ctx, layer.running_var.data().asnumpy(), layer.running_mean.data().asnumpy())
```

```text
cpu(0) [1. 1. 1.] [0. 0. 0.]
gpu(0) [4.010342 4.002672 3.9972866] [10.002233 9.998462 10.000072]
```

As you can see, the running variance and mean are erroneous on CPU: the variance stays at its initial value of 1 and the mean at 0, no matter how many batches are processed.

edit: it seems that running_mean and running_var are updated during the forward pass on GPU but not on CPU. Adding a backward pass makes the CPU path update them too:

```python
for ctx in [mx.cpu(), mx.gpu()]:
    layer = gluon.nn.BatchNorm()
    layer.initialize(ctx=ctx)
    for i in range(100):
        data = mx.nd.random.normal(loc=10, scale=2, shape=(1, 3, 224, 224), ctx=ctx)
        with autograd.record():
            out = layer(data)
        out.backward()  # only difference from the snippet above
    print(ctx, layer.running_var.data().asnumpy(), layer.running_mean.data().asnumpy())
```

```text
cpu(0) [3.9925063 4.008966 3.9975138] [10.000259 10.001338 9.999197]
gpu(0) [3.9989533 3.995771 4.009656 ] [10.001726 9.998586 9.999382]
```
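For reference on what the correct values should be: the data is drawn from a normal with mean 10 and scale 2 (variance 4), and BatchNorm's running statistics are an exponential moving average of the per-batch statistics (`gluon.nn.BatchNorm` defaults to `momentum=0.9`), so after 100 batches they should sit close to 10 and 4, as the correct runs above do. A minimal sketch of the expected update rule, assuming the standard EMA formulation and, for illustration, per-batch statistics that match the distribution exactly:

```python
# Sketch of BatchNorm's running-statistics update (standard EMA formulation;
# momentum=0.9 matches gluon.nn.BatchNorm's default). Assumes each batch's
# statistics equal the true distribution's mean/variance, for illustration.
momentum = 0.9
running_mean, running_var = 0.0, 1.0  # BatchNorm's initial values
batch_mean, batch_var = 10.0, 4.0     # N(loc=10, scale=2) => variance 4
for _ in range(100):
    running_mean = momentum * running_mean + (1 - momentum) * batch_mean
    running_var = momentum * running_var + (1 - momentum) * batch_var
print(running_mean, running_var)      # ~10.0, ~4.0 after 100 updates
```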
