pengzhao-intel commented on issue #14173: [WIP] MXNet AMP (automatic mixed precision)
URL: https://github.com/apache/incubator-mxnet/pull/14173#issuecomment-476435228

> We tested all of the networks from GluonCV (except for the newest pose estimation), and for all of them we saw speedups ranging from a few percent to over 100% running on V100, with the typical speedup being over 50%, depending on the relative cost of the convolutions/GEMMs compared to the rest of the network.

@ptrendx Tensor Cores with FP16 provide up to an 8X speedup over FP32. What accounts for the gap between the roughly 2X speedup seen in MXNet and the 8X peak from Tensor Cores?
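A rough Amdahl's-law sketch of why the end-to-end gain is smaller than the peak kernel speedup: only the conv/GEMM portion of the network runs on Tensor Cores, so the rest of the runtime caps the overall speedup. The fractions and the 8X kernel speedup below are illustrative assumptions, not measurements from this PR.

```python
def overall_speedup(tensor_core_fraction, kernel_speedup=8.0):
    """End-to-end speedup when only a fraction of runtime is accelerated (Amdahl's law)."""
    return 1.0 / ((1.0 - tensor_core_fraction) + tensor_core_fraction / kernel_speedup)

# Illustrative fractions of runtime spent in conv/GEMM kernels (assumed, not measured).
for frac in (0.5, 0.7, 0.9):
    print(f"{frac:.0%} of time in conv/GEMM -> {overall_speedup(frac):.2f}x overall")
# 50% -> 1.78x, 70% -> 2.58x, 90% -> 4.71x
```

Under these assumed numbers, a network spending about half to 70% of its time in conv/GEMM lands around the 2X figure, which is consistent with the "depending on the relative cost of convolutions/GEMMs" explanation above.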
