TristonC edited a comment on issue #18751:
URL: 
https://github.com/apache/incubator-mxnet/issues/18751#issuecomment-662687910


   @gilbertfrancois I did a quick test, to answer your question:
   > I don't understand why y_out from MyNet with BatchNorm on GPU still 
contains real numbers, given that the layer before outputs NaNs?
   
   The logged y_out and y_embedding actually came from two separate forward 
runs, and y_embedding came from the second run. The first run was done in 
training mode (with autograd.record(train_mode=True)), and the second run was 
done without recording (not training). The first run computes both y_out and 
y_embedding correctly. Unfortunately, the first run drove the first BatchNorm's 
running variance in tail to NaN, so the second run gets NaN for both 
y_embedding and y_out (the latter was not printed). I have not yet figured out 
why the NaNs happened only in tail, as there are also many more BatchNorm 
layers in features and those seem fine. Will dig more. BTW, if you print the 
embedding before y_out, you will get both without NaNs. Running in non-training 
mode will give you the correct answer.
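   The mechanism can be sketched in a few lines. This is a minimal toy 
batch-norm in numpy (not MXNet's actual implementation, and the function name 
and values are illustrative only): in training mode the layer normalizes with 
the current batch statistics, so the output stays finite even if the stored 
running variance has already gone to NaN; in inference mode it reads the 
running statistics, so the NaN poisons every output.

   ```python
   import numpy as np

   def batchnorm(x, gamma, beta, running_mean, running_var, training, eps=1e-5):
       """Minimal 1-D batch-norm sketch (illustrative, not MXNet's code)."""
       if training:
           # Training mode: normalize with the batch's own statistics.
           mean, var = x.mean(), x.var()
       else:
           # Inference mode: normalize with the stored running statistics.
           mean, var = running_mean, running_var
       return gamma * (x - mean) / np.sqrt(var + eps) + beta

   x = np.array([1.0, 2.0, 3.0])

   # First run (training mode): finite batch stats, so the output is finite
   # even though running_var is already NaN.
   y_train = batchnorm(x, 1.0, 0.0, running_mean=0.0, running_var=np.nan,
                       training=True)

   # Second run (inference mode): the NaN running variance makes every
   # output NaN, matching the NaN y_embedding seen on the second pass.
   y_eval = batchnorm(x, 1.0, 0.0, running_mean=0.0, running_var=np.nan,
                      training=False)
   ```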


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

