KellenSunderland commented on issue #14725: Performance Regression on CUDA10
URL: https://github.com/apache/incubator-mxnet/issues/14725#issuecomment-485494410

Good investigation so far; your explanation makes sense, @anirudh2290 and @stu1130. I think that's enough information to get NVIDIA started on verifying the regression and looking for root causes and a fix.

If we wanted to investigate further, there is one more step we could run to pin down exactly why this kernel regressed. I've found API logging from cuDNN and cuBLAS to be quite useful. If you want to take a look at the logs, we could re-run without nvprof and add the env vars:

```bash
export CUBLAS_LOGINFO_DBG=1
export CUBLAS_LOGDEST_DBG=/tmp/cublas_api_logs.txt
```

Then grep /tmp/cublas_api_logs.txt for volta_sgemm_128x64_nt calls and see how many distinct parameter combinations we're calling that kernel with. We could then create a minimal reproducible cuBLAS call that uses those parameters (hopefully there aren't many of them) and show the arguments for which the cuBLAS library regresses.
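
The grep-and-count step could be sketched roughly as follows. This is only an illustration: `count_variants` is a hypothetical helper, not part of cuBLAS or MXNet, and it assumes each relevant log entry sits on a single line that contains the search string. The exact log layout is not shown in this thread, so if lines carry timestamps, handles, or pointer addresses, those fields would need to be stripped before counting; likewise, if the log records API calls rather than kernel names, a different search string would be needed.

```bash
#!/usr/bin/env bash
# Hypothetical helper: count distinct log lines that contain a fixed string,
# most frequent first. Assumes one log entry per line (an assumption about
# the log layout, not a documented guarantee).
# Usage: count_variants LOGFILE PATTERN
count_variants() {
  local logfile="$1" pattern="$2"
  # -F matches the pattern as a literal string rather than a regex.
  # sort | uniq -c collapses identical lines and prefixes each with its count.
  grep -F -- "$pattern" "$logfile" | sort | uniq -c | sort -rn
}

# Example, using the path and kernel name from the comment above:
# count_variants /tmp/cublas_api_logs.txt volta_sgemm_128x64_nt
```

Each output line is a count followed by one distinct matching log line, so the number of output lines is the number of parameter variants that a minimal repro would have to cover.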
