TristonC edited a comment on issue #18751: URL: https://github.com/apache/incubator-mxnet/issues/18751#issuecomment-662687910
@gilbertfrancois I did a quick test to answer your question:

> I don't understand why y_out from MyNet with BatchNorm on GPU still contains real numbers, given that the layer before outputs NaNs?

The `y_out` and `y_embedding` actually came from two separate forward runs, and `y_embedding` came from the second run. The first run was done in training mode (`with autograd.record(train_mode=True)`), and the second run was done without recording. The first run produces both `y_out` and `y_embedding` correctly. Unfortunately, the first run drove the first BatchNorm's running variance in `tail` to NaN, so the second run returns NaN for both `y_embedding` and `y_out` (the latter was not printed). I have not yet figured out why the NaNs happened only in `tail`, since there are many more BatchNorm layers in `features`. I will dig more.

BTW, if you print your embedding before `y_out`, you will get both without NaNs. Non-training mode will give you the correct results.
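The failure mode described above — finite outputs in training mode, all-NaN outputs at inference — can be sketched with a toy BatchNorm in plain Python. The moving-average update rule, the `momentum` value, and all names here are illustrative assumptions, not MXNet's actual implementation; the point is only that training mode normalizes with *batch* statistics while inference mode uses the *running* statistics, so a NaN running variance poisons the inference pass even when the training-mode pass looks fine:

```python
import math

def batchnorm(xs, running_mean, running_var, training, momentum=0.9, eps=1e-5):
    """Simplified 1-D BatchNorm without scale/shift (illustrative only)."""
    if training:
        # Training mode: normalize with the current batch's statistics...
        mean = sum(xs) / len(xs)
        var = sum((v - mean) ** 2 for v in xs) / len(xs)
        # ...and update the running statistics (hypothetical update rule).
        running_mean = momentum * running_mean + (1 - momentum) * mean
        running_var = momentum * running_var + (1 - momentum) * var
    else:
        # Inference mode: normalize with the running statistics.
        mean, var = running_mean, running_var
    ys = [(v - mean) / math.sqrt(var + eps) for v in xs]
    return ys, running_mean, running_var

xs = [1.0, 2.0, 3.0]
rm, rv = 0.0, float("nan")  # running var already corrupted earlier

# Training-mode pass: batch stats are used, so the output is finite,
# but the NaN in the running variance is never cleared.
y_train, rm, rv = batchnorm(xs, rm, rv, training=True)
assert all(math.isfinite(v) for v in y_train)

# Inference-mode pass: the NaN running variance poisons every output.
y_eval, _, _ = batchnorm(xs, rm, rv, training=False)
assert all(math.isnan(v) for v in y_eval)
```

This mirrors why printing the embedding during the first (recorded) run works, while the second (non-recorded) run returns NaN everywhere.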