TristonC edited a comment on issue #18751:
URL: 
https://github.com/apache/incubator-mxnet/issues/18751#issuecomment-662687910


   @gilbertfrancois I did a quick test, to answer your question:
   > I don't understand why y_out from MyNet with BatchNorm on GPU still contains real numbers, given that the layer before outputs NaNs?
   
   y_out and y_embedding actually came from two separate forward runs, and 
y_embedding came from the second one. The first run was done in training 
mode (inside autograd.record(train_mode=True)), and the second run was done 
without recording (not training). The first run should compute both y_out and 
y_embedding correctly. Unfortunately, the first run also left the running 
variance of the first BatchNorm in tail as NaN, so the second run gets NaN for 
both y_embedding and y_out (the latter was not printed). I have not figured out 
why the NaNs happen in tail only, since there are many more BN layers in 
features and those seem to be OK. I will dig more. BTW, if you print your 
embedding before y_out, you will get both without NaNs. Non-training mode 
gives you the correct answer.
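
To illustrate why a training-mode output can stay finite while a later non-training run returns NaN, here is a minimal pure-Python sketch. It is not MXNet's BatchNorm: `TinyBatchNorm`, its momentum and epsilon defaults, and the hand-injected NaN are all hypothetical stand-ins, since the real cause of the NaN in tail is still unknown. The only point is that training mode normalizes with batch statistics, while non-training mode reads the stored running statistics.

```python
import math


class TinyBatchNorm:
    """Hypothetical 1-D batch norm over a list of floats (not MXNet's code)."""

    def __init__(self, momentum=0.9, eps=1e-5):
        self.momentum = momentum
        self.eps = eps
        self.running_mean = 0.0
        self.running_var = 1.0

    def forward(self, xs, training):
        if training:
            # Training mode: normalize with the statistics of this batch.
            mean = sum(xs) / len(xs)
            var = sum((x - mean) ** 2 for x in xs) / len(xs)
            # Update the running statistics as a side effect.
            m = self.momentum
            self.running_mean = m * self.running_mean + (1 - m) * mean
            self.running_var = m * self.running_var + (1 - m) * var
        else:
            # Non-training mode: normalize with the stored running statistics.
            mean, var = self.running_mean, self.running_var
        return [(x - mean) / math.sqrt(var + self.eps) for x in xs]


bn = TinyBatchNorm()
batch = [1.0, 2.0, 3.0, 4.0]

# First run, training mode: the output depends only on batch statistics.
train_out = bn.forward(batch, training=True)

# Assumption: mimic the unexplained corruption by setting the running
# variance to NaN by hand after the training run.
bn.running_var = float("nan")

# Second run, non-training mode: the NaN running variance reaches the output.
eval_out = bn.forward(batch, training=False)

train_has_nan = any(math.isnan(v) for v in train_out)
eval_has_nan = any(math.isnan(v) for v in eval_out)
print("training-mode output has NaN:", train_has_nan)
print("non-training output has NaN:", eval_has_nan)
```

In this sketch the order of the two runs decides which printed value is affected, which matches the observation that printing the embedding before y_out avoids the NaNs.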


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

