This PR adds a new environment variable, `MXNET_CUDA_TENSOR_OP_MATH_ALLOW_CONVERSION`, that lets the cuDNN Convolution and RNN operators use [Tensor Cores with Float32 inputs](https://devblogs.nvidia.com/tensor-ops-made-easier-in-cudnn/), a capability cuDNN has supported since v7.2. It resolves issue #12463 and makes it much easier for users to do a form of mixed-precision training.
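A minimal sketch of how a user might opt in from a Python script. It assumes that the value `"1"` enables the behaviour, as with other boolean MXNet flags, and that setting the variable before importing MXNet is the safe ordering. The helper name `enable_tensor_op_conversion` is hypothetical and is not part of this PR.

```python
import os

# Name of the environment variable introduced by this PR.
ENV_VAR = "MXNET_CUDA_TENSOR_OP_MATH_ALLOW_CONVERSION"


def enable_tensor_op_conversion(env=os.environ):
    """Set the flag in the given environment mapping and return the mapping.

    Assumption: "1" turns the feature on. This helper is illustrative only.
    """
    env[ENV_VAR] = "1"
    return env


# Set the flag for the current process before MXNet is loaded.
enable_tensor_op_conversion()

# Import MXNet only after the variable is set (left commented out so the
# sketch runs without MXNet installed):
# import mxnet as mx
```

The same effect can be had by exporting the variable in the shell before launching the training script, which leaves the model code itself untouched.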
@szha [ Full content available at: https://github.com/apache/incubator-mxnet/pull/12722 ]
