DickJC123 commented on issue #18800:
URL: 
https://github.com/apache/incubator-mxnet/issues/18800#issuecomment-668969957


   A major feature of CUDA 11 and cuDNN 8.0 is support for the new A100 GPU and 
its TensorFloat-32 (TF32) mode of computation.  I would like to include PR 
https://github.com/apache/incubator-mxnet/pull/18694, "Unittest tolerance 
handling improvements", which allows MXNet to use TF32 effectively.  The PR 
also adjusts the unittest tolerances according to device context and dtype, 
so that our unittest suite also passes on the A100.
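
   To illustrate why TF32 calls for looser tolerances, here is a minimal 
sketch, not the code from the PR.  TF32 keeps float32's 8-bit exponent but 
only 10 mantissa bits, so results carry roughly float16-level precision.  The 
names `to_tf32`, `pick_rtol` and `DEFAULT_RTOL`, the tolerance values, and the 
use of truncation rather than rounding are all assumptions made for this 
example.

```python
import struct

def to_tf32(x: float) -> float:
    """Approximate TF32 by zeroing the low 13 mantissa bits of a float32.

    Assumption: truncation is used here for simplicity; real hardware
    may round instead.
    """
    bits = struct.unpack("<I", struct.pack("<f", x))[0]
    bits &= 0xFFFFE000  # keep sign, 8 exponent bits, top 10 mantissa bits
    return struct.unpack("<f", struct.pack("<I", bits))[0]

# Hypothetical relative tolerances per dtype (illustrative values only).
DEFAULT_RTOL = {"float16": 1e-2, "float32": 1e-4, "float64": 1e-5}

def pick_rtol(dtype: str, on_gpu: bool, tf32_allowed: bool) -> float:
    """Choose a tolerance from dtype and device context.

    float32 on a GPU that may use TF32 falls back to the float16
    tolerance, because both formats have a 10-bit mantissa.
    """
    if dtype == "float32" and on_gpu and tf32_allowed:
        return DEFAULT_RTOL["float16"]
    return DEFAULT_RTOL[dtype]

if __name__ == "__main__":
    x = 0.1
    rel_err = abs(to_tf32(x) - x) / x
    print("relative error of 0.1 in simulated TF32:", rel_err)
    print("rtol on CPU:", pick_rtol("float32", on_gpu=False, tf32_allowed=True))
    print("rtol on GPU with TF32:", pick_rtol("float32", on_gpu=True, tf32_allowed=True))
```

   The error from a single simulated TF32 value already exceeds the float32 
tolerance assumed above, which is why a test that compares GPU results with a 
float32 reference needs a tolerance chosen from both the dtype and the device.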
   
   With cuDNN 8.0 also comes compatibility with CUDA Graph capture.  I would 
like to include a PR (nearly complete, but not yet submitted) that enables CUDA 
Graph use.  This will let MXNet bypass much of the CPU work needed to prepare 
and launch identical kernel sequences, which are common in deep learning 
training and inference workloads.

