ThomasDelteil edited a comment on issue #14357: [Bug] Batchnorm running_var behaves differently when using gpu vs. cpu
URL: https://github.com/apache/incubator-mxnet/issues/14357#issuecomment-497487102

```python
import mxnet as mx
from mxnet import autograd, gluon

# Run 100 forward passes in training mode on each context and
# compare the resulting running statistics.
for ctx in [mx.cpu(), mx.gpu()]:
    layer = gluon.nn.BatchNorm()
    layer.initialize(ctx=ctx)
    for i in range(100):
        data = mx.nd.random.normal(loc=10, scale=2, shape=(1, 3, 224, 224), ctx=ctx)
        with autograd.record():  # training mode, so running stats should update
            out = layer(data)
    print(ctx, layer.running_var.data().asnumpy(), layer.running_mean.data().asnumpy())
```

```text
cpu(0) [1. 1. 1.] [0. 0. 0.]
gpu(0) [4.010342 4.002672 3.9972866] [10.002233  9.998462 10.000072]
```

As you can see, the running statistics are erroneous on CPU: the variance stays at its initial value of all ones and the mean at all zeros, whereas on GPU both converge to the expected values for this input distribution (variance ≈ 4, mean ≈ 10).

Can someone with more knowledge of the operator API shed some light on the following code path and check whether this behavior is expected? I see no equivalent of it in the GPU implementation:

https://github.com/apache/incubator-mxnet/blob/3f3ba92ae1468d08de088d2291ca14e2d5dc5515/src/operator/batch_norm_v1.cc#L101-L110
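For reference, here is a minimal sketch of what the running statistics should converge to, assuming BatchNorm applies the conventional exponential-moving-average update `running = momentum * running + (1 - momentum) * batch` with Gluon's default `momentum=0.9`; under that assumption the result matches the GPU output above:

```python
import mxnet as mx

# EMA reference for BatchNorm running statistics, assuming the
# conventional update rule and a momentum of 0.9.
# Statistics are computed over all elements here rather than per
# channel, which is equivalent for this synthetic input since every
# channel is drawn from the same distribution.
momentum = 0.9
running_mean, running_var = 0.0, 1.0  # BatchNorm's initial values
for i in range(100):
    data = mx.nd.random.normal(loc=10, scale=2, shape=(1, 3, 224, 224))
    batch_mean = data.mean().asscalar()
    # Biased variance; with ~150k elements the unbiased correction is negligible.
    batch_var = ((data - batch_mean) ** 2).mean().asscalar()
    running_mean = momentum * running_mean + (1 - momentum) * batch_mean
    running_var = momentum * running_var + (1 - momentum) * batch_var

print(running_mean, running_var)  # ~10 and ~4, matching the GPU output
```

After 100 iterations the weight of the initial values is 0.9^100 ≈ 2.6e-5, so the EMA is fully dominated by the batch statistics; the CPU output staying exactly at the initial [1, 1, 1] / [0, 0, 0] therefore suggests the update is never applied at all on that code path.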
