TristonC edited a comment on issue #18751:
URL: 
https://github.com/apache/incubator-mxnet/issues/18751#issuecomment-662687910


   @gilbertfrancois I did a quick test to answer your question:
   > I don't understand why y_out from MyNet with BatchNorm on GPU still 
contains real numbers, given that the layer before outputs NaNs?
   
   The y_out and y_embedding values actually come from two separate forward 
runs, and y_embedding comes from the second run. The first run was done in 
training mode (with autograd.record(train_mode=True)), and the second run was 
done without recording. The first run should produce both y_out and 
y_embedding correctly. Unfortunately, the first run drove the running variance 
of the first BatchNorm in tail to NaN, so the second run yields NaN for both 
y_embedding and y_out (not printed out). I have not yet figured out why the 
NaNs happen only in tail, since there are many more BatchNorm layers in 
features; I will dig into it more. BTW, if you print your embedding before 
y_out, you will get both without NaNs: a non-training-mode run gives you the 
correct answer.
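
   To make the mechanism concrete, here is a minimal NumPy sketch (not 
MXNet's actual BatchNorm implementation, and the momentum value is just an 
assumption) showing why a training-mode forward can stay finite while the 
following inference-mode forward goes NaN once the running variance is 
corrupted:

```python
import numpy as np

def batchnorm(x, gamma, beta, running_mean, running_var,
              training, momentum=0.9, eps=1e-5):
    """Toy batch norm illustrating train vs. inference statistics."""
    if training:
        # Training mode: normalize with *batch* statistics,
        # then update the running statistics in place.
        mean, var = x.mean(axis=0), x.var(axis=0)
        running_mean[:] = momentum * running_mean + (1 - momentum) * mean
        running_var[:] = momentum * running_var + (1 - momentum) * var
    else:
        # Inference mode: normalize with the stored *running* statistics.
        mean, var = running_mean, running_var
    return gamma * (x - mean) / np.sqrt(var + eps) + beta

rng = np.random.default_rng(0)
x = rng.normal(size=(4, 3))
gamma, beta = np.ones(3), np.zeros(3)
running_mean, running_var = np.zeros(3), np.ones(3)

# Simulate the corrupted running variance observed in the issue.
running_var[0] = np.nan

# First (training-mode) run: output is finite, because only batch
# statistics are used; the NaN sits silently in running_var.
y_train = batchnorm(x, gamma, beta, running_mean, running_var, training=True)
print(np.isnan(y_train).any())   # False

# Second (non-recording / inference) run: the NaN running_var is now
# actually consumed, so the output picks up NaNs.
y_eval = batchnorm(x, gamma, beta, running_mean, running_var, training=False)
print(np.isnan(y_eval).any())    # True
```

   This mirrors what the quick test showed: the training-mode run produces 
real numbers because it never reads the running statistics, while any later 
inference-mode run through the same layer propagates the NaN.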


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org
