Thanks for your input. How would you propose to proceed in terms of a timeline in case this vote succeeds? I don't really have time to work on a nightly setup right now. Would anybody in the community be able to help me out here or shall we wait with the migration until a nightly setup for CUDA 8 is up?
-Marco

On Fri, Mar 16, 2018 at 9:55 PM, Bhavin Thaker <[email protected]> wrote:

> +1 to the suggestion of testing CUDA8 in few nightly instances and using
> CUDA9 for most instances in CI.
>
> Bhavin Thaker.
>
> On Fri, Mar 16, 2018 at 12:37 PM Naveen Swamy <[email protected]> wrote:
>
> > I think its best to add support for CUDA 9.0 while retaining existing
> > support for CUDA 8, code might regress when you remove and create more
> > work to add CUDA 8 support back.
> >
> > On Fri, Mar 16, 2018 at 9:29 AM, Marco de Abreu <[email protected]> wrote:
> >
> > > Yeah, sorry Chris, mixed up the names.
> > >
> > > @Naveen: Would you be fine with doing the switch now and adding
> > > integration tests later or is this a hard constraint for you?
> > >
> > > On Wed, Mar 14, 2018 at 6:39 PM, Chris Olivier <[email protected]> wrote:
> > >
> > > > Isn't the TItan V the Volta and not the Tesla?
> > > >
> > > > On Wed, Mar 14, 2018 at 10:36 AM, Naveen Swamy <[email protected]> wrote:
> > > >
> > > > > Marco,
> > > > > My -1 vote is for dropping support to CUDA 8 and not for adding
> > > > > CUDA 9.
> > > > > CUDA 9.0 support for MXNet was added Oct'30-2017, I think that all
> > > > > users might not have switched to CUDA 9.0
> > > > >
> > > > > Look at the earlier discussion on the same topic
> > > > >
> > > > > https://lists.apache.org/thread.html/27b84e4fc0e0728f2e4ad8b6827d7f996635021a5a4d47b5d3f4dbfb@%3Cdev.mxnet.apache.org%3E
> > > > >
> > > > > On Wed, Mar 14, 2018 at 10:14 AM, Marco de Abreu <[email protected]> wrote:
> > > > >
> > > > > > Right, the code changes would not be validated against CUDA 8.0
> > > > > > as part of the PR process.
> > > > > >
> > > > > > I don't have any numbers, but it's pretty unlikely that anybody
> > > > > > is still using CUDA 8.0. According to
> > > > > > https://en.wikipedia.org/wiki/CUDA#GPUs_supported, the devices
> > > > > > which are not being supported by CUDA 9 are under the Fermi
> > > > > > architecture which has been released in April 2010. These GPUs
> > > > > > are way too old, so I think we're safe with not covering them
> > > > > > specifically - this does not mean we're entirely deprecating
> > > > > > them.
> > > > > >
> > > > > > One thing to note here is that we're not testing CUDA 9 as of
> > > > > > now. Considering that the Telsa architecture (Titan V, V100)
> > > > > > requires at least CUDA 9 and those are probably the most widely
> > > > > > used GPUs for Deep Learning, we'd probably be covering a wider
> > > > > > user-base in comparison to CUDA 8 if we make that switch.
> > > > > >
> > > > > > -Marco
> > > > > >
> > > > > > On Wed, Mar 14, 2018 at 5:59 PM, Naveen Swamy <[email protected]> wrote:
> > > > > >
> > > > > > > Does this mean that MXNet Users who use CUDA 8.0 will not be
> > > > > > > supported(since you are stopping to test CUDA 8.0) ? I suggest
> > > > > > > we at least have nightly tests for CUDA 8.0.
> > > > > > >
> > > > > > > Do you have a sense of how many users are using CUDA 8.0/9.0 ?
> > > > > > >
> > > > > > > -1
> > > > > > >
> > > > > > > On Wed, Mar 14, 2018 at 9:50 AM, Chris Olivier <[email protected]> wrote:
> > > > > > >
> > > > > > > > +0
> > > > > > > >
> > > > > > > > On Wed, Mar 14, 2018 at 9:45 AM, Jin, Hao <[email protected]> wrote:
> > > > > > > >
> > > > > > > > > +1
> > > > > > > > >
> > > > > > > > > On 3/14/18, 9:04 AM, "Anirudh" <[email protected]> wrote:
> > > > > > > > >
> > > > > > > > >     +1
> > > > > > > > >
> > > > > > > > >     On Mar 14, 2018 8:56 AM, "Wu, Jun" <[email protected]> wrote:
> > > > > > > > >
> > > > > > > > > >     +1
> > > > > > > > > >
> > > > > > > > > >     On 3/14/18, 8:52 AM, "Marco de Abreu" <[email protected]> wrote:
> > > > > > > > > >
> > > > > > > > > >         Hello,
> > > > > > > > > >
> > > > > > > > > >         this is a vote to upgrade our CI environment from the
> > > > > > > > > >         current CUDA 8.0 with CuDNN 5.0 to CUDA 9.1 with CuDNN
> > > > > > > > > >         7.0. Reason being that NVCC under CUDA 8 does not
> > > > > > > > > >         support the Volta GPUs used in AWS P3 instances and
> > > > > > > > > >         thus limiting our test capabilities. More details are
> > > > > > > > > >         available at [1].
> > > > > > > > > >
> > > > > > > > > >         In order to introduce support for Quantization [1],
> > > > > > > > > >         I'd like to perform a system-wide upgrade. This should
> > > > > > > > > >         have no negative impact in our users but rather makes
> > > > > > > > > >         sure that we're actually testing with the latest
> > > > > > > > > >         versions. The PR is available at [3].
> > > > > > > > > >
> > > > > > > > > >         This means that we would stop verifying CUDA 8 and
> > > > > > > > > >         CuDNN 5.0 as part of our PR process. At a later point
> > > > > > > > > >         in time, this could be picked up as a candidate for an
> > > > > > > > > >         integration test as part of the nightly suite.
> > > > > > > > > >
> > > > > > > > > >         This is a lazy vote, ending on 17th of March, 2018 at
> > > > > > > > > >         17:00 (UTC +1).
> > > > > > > > > >
> > > > > > > > > >         Best regards,
> > > > > > > > > >         Marco
> > > > > > > > > >
> > > > > > > > > >         [1]: https://issues.apache.org/jira/browse/MXNET-99
> > > > > > > > > >         [2]: https://github.com/apache/incubator-mxnet/pull/9552
> > > > > > > > > >         [3]: https://github.com/apache/incubator-mxnet/pull/10108
