ptrendx commented on issue #16845: MXNet 1.6.0 performance regression
URL: https://github.com/apache/incubator-mxnet/issues/16845#issuecomment-555283280

Ok, so I looked into it and I can kind of see 1.6 being slower, but this script is really not a great way of testing GPU training performance. Because the kernels are tiny, the run time is dominated by gaps in GPU execution while the CPU is still busy launching the kernels. The command line you gave does not even use hybridization to offset that, and enabling hybridization improves performance by ~2x. Looking at the GPU kernel time, I do not see any real difference between the versions, so the slowdown is most probably due to an increase in the time spent launching the ops.
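To illustrate the reasoning above, here is a toy timeline model in plain Python (not MXNet code). It assumes the CPU issues launches one after another and the GPU runs each kernel as soon as it has been launched and the previous kernel has finished. The function name and all numbers (op count, per-op launch cost, kernel duration) are hypothetical and chosen only to show the effect, not measured from the script in this issue.

```python
def simulate_step(num_ops, launch_us, kernel_us):
    """Toy model of one training step; all times are in microseconds.

    Returns (wall_time, gpu_kernel_time). This is an illustrative
    sketch, not a measurement of MXNet.
    """
    cpu_time = 0   # time at which the CPU finishes issuing the current launch
    gpu_free = 0   # time at which the GPU finishes the previous kernel
    gpu_busy = 0   # total time the GPU spends inside kernels
    for _ in range(num_ops):
        cpu_time += launch_us             # CPU pays the launch cost for this op
        start = max(cpu_time, gpu_free)   # kernel waits for launch and for the GPU
        gpu_free = start + kernel_us
        gpu_busy += kernel_us
    return gpu_free, gpu_busy


if __name__ == "__main__":
    # Hypothetical numbers: 100 tiny kernels of 2 us each.
    for label, launch_us in [("cheaper launches", 5),
                             ("baseline launches", 10),
                             ("costlier launches", 12)]:
        wall, busy = simulate_step(num_ops=100, launch_us=launch_us, kernel_us=2)
        print(f"{label}: wall={wall} us, GPU kernel time={busy} us")
```

In this model the GPU kernel time is identical in all three cases, while the wall time tracks the per-op launch cost almost exactly. That is the pattern described above: unchanged kernel time with a slower step points at launch overhead, and anything that lowers the per-op launch cost (which is what hybridization is meant to help with) shows up directly in the wall time.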