grygielski commented on issue #19218:
URL: https://github.com/apache/incubator-mxnet/issues/19218#issuecomment-710124169


   @buaalsy2003 Sorry for the late response; I somehow missed your question.
   As for how I figured out this problem: I had seen similar behavior in other 
frameworks, so denormal values were my initial guess. I didn't use any 
sophisticated debugging tool. To confirm the guess, I just checked for 
denormals in the C++ code (more precisely, with the `fpclassify` function). I 
added these checks on the convolution input values and built MXNet from source.
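   
   A minimal sketch of what such a check could look like. The helper name 
`count_denormals` and its signature are hypothetical and not part of MXNet; 
the only real API used is `std::fpclassify` from `<cmath>`, which returns 
`FP_SUBNORMAL` for denormal values.

```cpp
#include <cmath>
#include <cstddef>

// Hypothetical helper (not an MXNet function): counts the denormal
// (subnormal) values in a float buffer, e.g. the input of a convolution.
std::size_t count_denormals(const float* data, std::size_t size) {
  std::size_t count = 0;
  for (std::size_t i = 0; i < size; ++i) {
    // std::fpclassify reports FP_SUBNORMAL for non-zero values whose
    // magnitude is below the smallest normal float (about 1.18e-38).
    if (std::fpclassify(data[i]) == FP_SUBNORMAL) {
      ++count;
    }
  }
  return count;
}
```

   Calling such a helper on the convolution input buffer and logging a 
non-zero result would be one way to confirm the guess; where exactly to 
place the call inside MXNet is left open here.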
   
   However, my first step is always to check the output of running the MXNet 
code with the `MKLDNN_VERBOSE=1` environment variable set 
(`export MKLDNN_VERBOSE=1`). It prints the oneDNN (MKL-DNN) primitives in 
execution order, with the execution time at the end of each line. This way I 
can compare two runs, as in this case, and see whether any primitive differs 
significantly in execution time.
   
   I hope this sheds some light on my thought process and helps you in the 
future.


