CUDA / CUDNN support revisited

Dick Carter Mon, 03 Jun 2019 15:06:33 -0700

I'd like to revisit the discussion at
https://lists.apache.org/thread.html/27b84e4fc0e0728f2e4ad8b6827d7f996635021a5a4d47b5d3f4dbfb@%3Cdev.mxnet.apache.org%3E
now that a year has passed.


My motivation is:

1.  There's a lot of hard-to-read '#if CUDNN_MAJOR' code referencing cuDNN 
versions as far back as v4(!?).  We need to clean this out before it hampers 
our ability to move the codebase forward nimbly.

2.  There seems to be a difference of opinion on whether we should be 
supporting version 'N-1' (e.g. cuDNN v6).  Our current MXNet 1.5 candidate does 
not compile against cuDNN v6, so this should either be fixed or be stated up 
front to the user community.  The breaking PR was 
https://github.com/apache/incubator-mxnet/pull/14476.

Having read the prior discussion, my take on it is:

- Users should be given an ample time period (1 year?) to move to a new 
CUDA/cuDNN version once it becomes 'usable.'

- We should not claim to support a given version if it is no longer part of the 
MXNet CI.  Users should be warned of an impending drop of this 'testing 
support.'

So these statements do not necessarily promise 'N-1' support.  I could see the 
CI transitioning from CUDA9-only -> CUDA9&10 -> CUDA10-only.  Some period 
before CUDA9 is dropped from CI, the user community is warned.  After that 
time, CUDA10 might be the only version tested by CI, and hence the only 
version supported (until the next CUDA version comes around).

Let me propose as a 'strawman' that we claim to support CUDA versions 9 and 10, 
with cuDNN version 7 only.  Those versions have been out for over 1.5 years.  
So no CUDA 8 or cuDNN v6 support: those are over 1.5 years old with no coverage 
by our CI.
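
To make the strawman concrete, one up-front version check could stand in for 
many of the scattered '#if CUDNN_MAJOR' guards.  What follows is only a rough 
sketch, not existing MXNet code: the names kMinCudaVersion, kMinCudnnMajor and 
IsSupportedToolkit are hypothetical, and the two '#ifndef' stand-ins exist only 
so the snippet compiles without the real headers (in a real build CUDNN_MAJOR 
comes from cudnn.h and CUDA_VERSION from cuda.h).

```cpp
// Stand-ins so this sketch compiles on its own. In a real build these macros
// are provided by <cudnn.h> and <cuda.h>; the values here are assumptions.
#ifndef CUDNN_MAJOR
#define CUDNN_MAJOR 7
#endif
#ifndef CUDA_VERSION
#define CUDA_VERSION 10000  // CUDA encodes versions as major*1000 + minor*10
#endif

// Hypothetical policy constants for the strawman: CUDA 9 or 10, cuDNN v7.
constexpr int kMinCudaVersion = 9000;  // CUDA 9.0
constexpr int kMinCudnnMajor = 7;      // cuDNN v7

// Hypothetical helper: true if the toolkit meets the minimum versions.
constexpr bool IsSupportedToolkit(int cuda_version, int cudnn_major) {
  return cuda_version >= kMinCudaVersion && cudnn_major >= kMinCudnnMajor;
}

// Fail the build once, here, instead of guarding each cuDNN call site.
static_assert(IsSupportedToolkit(CUDA_VERSION, CUDNN_MAJOR),
              "This build requires CUDA >= 9.0 and cuDNN >= v7");
```

With a check like this in one common header, guards for anything older than 
cuDNN v7 become dead code and can be deleted rather than maintained.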

    -Dick
