@szha: I'm happy to make a PR. One question about the behaviour first: the flag 
`MXNET_CUDA_ALLOW_TENSOR_CORE` defaults to `true`, and currently works like this:
- TensorCores are used only if `DType` is `float16` and 
`MXNET_CUDA_ALLOW_TENSOR_CORE` is `true`. Presumably, TensorCores never make 
`float16` training worse, so defaulting to TensorCore use seems reasonable.
- If `DType` is `float32`, we can't assume the user wants to use TensorCores, 
but they should still have a way of opting in (which would set 
`CUDNN_TENSOR_OP_MATH_ALLOW_CONVERSION` when available).

So how about a new environment variable, 
`MXNET_CUDA_TENSOR_OP_MATH_ALLOW_CONVERSION`, which defaults to `false` and, 
when set to `true`, lets `float32` or `float64` nets use TensorCores by 
implicit downcasting?
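
To make the proposal concrete, here is a rough Python sketch of the decision logic I have in mind. It is not MXNet code: `env_flag` and `math_mode` are hypothetical helpers, the boolean parsing is simplified, and `MXNET_CUDA_TENSOR_OP_MATH_ALLOW_CONVERSION` is only the variable proposed above. I'm also assuming that `MXNET_CUDA_ALLOW_TENSOR_CORE=0` should still switch TensorCores off entirely, which is open for discussion.

```python
import os


def env_flag(name, default, env=None):
    """Read a boolean flag from the environment (simplified parsing, an assumption)."""
    env = os.environ if env is None else env
    raw = env.get(name)
    if raw is None:
        return default
    return raw.strip().lower() in ("1", "true")


def math_mode(dtype, env=None):
    """Return the name of the cuDNN math mode the proposal would select for `dtype`."""
    allow_tensor_core = env_flag("MXNET_CUDA_ALLOW_TENSOR_CORE", True, env)
    # Proposed variable: opt-in only, so it defaults to False.
    allow_conversion = env_flag(
        "MXNET_CUDA_TENSOR_OP_MATH_ALLOW_CONVERSION", False, env)

    if not allow_tensor_core:
        # Assumption: the existing flag remains a global off switch.
        return "CUDNN_DEFAULT_MATH"
    if dtype == "float16":
        # Existing behaviour: float16 uses TensorCores by default.
        return "CUDNN_TENSOR_OP_MATH"
    if dtype in ("float32", "float64") and allow_conversion:
        # Proposed behaviour: implicit downcasting only when the user opts in.
        return "CUDNN_TENSOR_OP_MATH_ALLOW_CONVERSION"
    return "CUDNN_DEFAULT_MATH"
```

With nothing set, `math_mode("float32")` stays on the default math mode; only an explicit `MXNET_CUDA_TENSOR_OP_MATH_ALLOW_CONVERSION=1` changes that, so existing `float32` users see no difference.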

[ Full content available at: 
https://github.com/apache/incubator-mxnet/issues/9543 ]