KellenSunderland commented on issue #14725: Performance Regression on CUDA10
URL: 
https://github.com/apache/incubator-mxnet/issues/14725#issuecomment-485494410
 
 
   Good investigation so far. Your explanation makes sense, @anirudh2290 and 
@stu1130.  I think that's enough information for NVIDIA to start verifying 
the regression and looking for a root cause and a fix.
   
   If we wanted to investigate further, there is one more step we could 
run to pin down exactly why this kernel regressed.  I've found API logging 
from cuDNN and cuBLAS to be quite useful.  If you want to take a look, we 
could re-run without nvprof and set these env vars:
   
   ```bash
   export CUBLAS_LOGINFO_DBG=1
   export CUBLAS_LOGDEST_DBG=/tmp/cublas_api_logs.txt
   ```
   
   Then grep /tmp/cublas_api_logs.txt for volta_sgemm_128x64_nt calls and see 
how many distinct parameter combinations we call that kernel with.  We could 
then create a minimal reproducible call to cuBLAS that uses those params 
(hopefully there aren't many of them) and show the arguments for which the 
cuBLAS library regresses.
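   
   As a rough sketch of that grep step, something like the following could 
tally the distinct matching lines.  The helper name `count_variants` is 
hypothetical, and nothing here assumes a particular cuBLAS log layout: it only 
collapses lines that are byte-for-byte identical, so if the log lines carry 
timestamps or pointer values those would need to be stripped first.
   
   ```bash
   #!/usr/bin/env bash
   # Hypothetical helper: count distinct log lines containing a fixed string,
   # most frequent first.
   # Usage: count_variants <log_file> <pattern>
   count_variants() {
       local log_file="$1" pattern="$2"
       # -F matches the pattern literally; `|| true` keeps "no match" from
       # being treated as a failure under `set -e`.
       { grep -F -- "$pattern" "$log_file" || true; } \
           | sort | uniq -c | sort -rn
   }
   ```
   
   For example, `count_variants /tmp/cublas_api_logs.txt volta_sgemm_128x64_nt` 
would print one line per distinct match, prefixed by how often it occurred.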

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
[email protected]


With regards,
Apache Git Services
