Re: CUDA Support [DISCUSS]

Bhavin Thaker Sat, 06 Jan 2018 09:49:50 -0800

Hi Marco,

Here are the years in which the GPU architectures were introduced:


   - Tesla: 2008;
   - Fermi: 2010;
   - Kepler: 2012;
   - Maxwell: 2014;
   - Pascal: 2016;
   - Volta: 2017;

I see no need to support the 7+ year old Fermi architecture for fast-moving
Apache MXNet.

Bhavin Thaker.

On Sat, Jan 6, 2018 at 9:36 AM Marco de Abreu <[email protected]>
wrote:

> Just to provide some data. Dropping CUDA8 support would deprecate the
> Fermi-Architecture, effectively affecting the following devices:
>
> Compute capability 2.0, Fermi
> <https://en.wikipedia.org/wiki/Fermi_(microarchitecture)> (GF100, GF110):
>   GeForce GTX 590, GeForce GTX 580, GeForce GTX 570, GeForce GTX 480,
>   GeForce GTX 470, GeForce GTX 465, GeForce GTX 480M
>   Quadro 6000, Quadro 5000, Quadro 4000, Quadro 4000 for Mac,
>   Quadro Plex 7000, Quadro 5010M, Quadro 5000M
>   Tesla C2075, Tesla C2050/C2070, Tesla M2050/M2070/M2075/M2090
>
> Compute capability 2.1 (GF104, GF106, GF108, GF114, GF116, GF117, GF119):
>   GeForce GTX 560 Ti, GeForce GTX 550 Ti, GeForce GTX 460, GeForce GTS 450,
>   GeForce GTS 450*, GeForce GT 640 (GDDR3), GeForce GT 630, GeForce GT 620,
>   GeForce GT 610, GeForce GT 520, GeForce GT 440, GeForce GT 440*,
>   GeForce GT 430, GeForce GT 430*, GeForce GT 420*
>   GeForce GTX 675M, GeForce GTX 670M, GeForce GT 635M, GeForce GT 630M,
>   GeForce GT 625M, GeForce GT 720M, GeForce GT 620M, GeForce 710M,
>   GeForce 610M, GeForce 820M, GeForce GTX 580M, GeForce GTX 570M,
>   GeForce GTX 560M, GeForce GT 555M, GeForce GT 550M, GeForce GT 540M,
>   GeForce GT 525M, GeForce GT 520MX, GeForce GT 520M, GeForce GTX 485M,
>   GeForce GTX 470M, GeForce GTX 460M, GeForce GT 445M, GeForce GT 435M,
>   GeForce GT 420M, GeForce GT 415M, GeForce 710M, GeForce 410M
>   Quadro 2000, Quadro 2000D, Quadro 600, Quadro 4000M, Quadro 3000M,
>   Quadro 2000M, Quadro 1000M
>   NVS 310, NVS 315, NVS 5400M, NVS 5200M, NVS 4200M
>
> -Marco
>
> On Sat, Jan 6, 2018 at 6:31 PM, kellen sunderland <
> [email protected]> wrote:
>
> > I like that proposal Bhavin.  I'm also interested to see what the other
> > community members think.
> >
> > On Sat, Jan 6, 2018 at 6:27 PM, Bhavin Thaker <[email protected]>
> > wrote:
> >
> > > Hi Kellen,
> > >
> > > Here is my opinion and stand on this:
> > >
> > > I see no need to test on CUDA8 in Apache MXNet CI, especially when
> > > CUDA9 is backward compatible with earlier Nvidia hardware generations.
> > > There is a time and resource cost to maintaining the various
> > > combinations in the CI, so I am NOT in favor of running CUDA8 in CI
> > > unless there is a technical reason/requirement for it. This approach
> > > helps to encourage users to move to the latest CUDA version and thus
> > > keeps the open-source community’s maintenance cost low for the generic
> > > option of CUDA9.
> > >
> > > For example: if a user opens a GitHub issue about a problem with Apache
> > > MXNet and CUDA8, I would ask the user to test it with CUDA9. If the
> > > problem happens only on CUDA8, then a volunteer in the community may
> > > work on it. If the problem happens on CUDA9 as well, then, in my humble
> > > opinion, the problem must be fixed by the community. In short, I
> > > propose that the MXNet CI run tests only with the latest CUDA9 version
> > > and NOT CUDA8.
> > >
> > > I am eager to hear alternate viewpoints/corrections from folks other
> > > than Kellen and me.
> > >
> > > Bhavin Thaker.
> > >
> > > On Sat, Jan 6, 2018 at 8:24 AM kellen sunderland <
> > > [email protected]> wrote:
> > >
> > > > Thanks for the thoughts, Bhavin. Supporting the latest release would
> > > > also be an option, and it would be easier from a support point of view.
> > > >
> > > > "2) I think your question probably is what should be tested by the
> > > > Apache MXNet CI and NOT what is supported by Apache MXNet, correct?"
> > > >
> > > > I view these two things as being closely related, if not equivalent.
> > > > If we don't run at least basic tests of old versions of CUDA I think
> > > > there will be issues that slip through.  That being said we can rely
> > > > on users to report these issues, and chances are we'll be able to
> > > > provide backwards compatible patches.  At a minimum I'd recommend we
> > > > should run tests on all supported CUDA versions before a release.
> > > >
> > > > -Kellen
> > > >
> > > >
> > > > On Sat, Jan 6, 2018 at 5:05 PM, Bhavin Thaker <
> [email protected]>
> > > > wrote:
> > > >
> > > > > Hi Kellen,
> > > > >
> > > > > 1) Does Apache MXNet (Incubating) have a support matrix? I think
> > > > > the answer is no, because I don’t know where it is documented. One
> > > > > of the mentors told me earlier that the community uses and modifies
> > > > > the open-source project as per their individual requirements or
> > > > > those of the community. As far as I know, there is no single entity
> > > > > that is responsible for supporting something in MXNet; corrections
> > > > > to my understanding are welcome.
> > > > >
> > > > > 2) I think your question probably is what should be tested by the
> > > > > Apache MXNet CI and NOT what is supported by Apache MXNet, correct?
> > > > >
> > > > > If yes, I propose testing only the latest CUDA9 and the respective
> > > > > latest cuDNN version in the MXNet CI since CUDA9 is backward
> > > > > compatible with earlier Nvidia hardware generations.
> > > > >
> > > > > I would like to hear reasons why this would not work.
> > > > >
> > > > > I have commented on the github issue as well:
> > > > > https://github.com/apache/incubator-mxnet/issues/8805
> > > > >
> > > > > Bhavin Thaker.
> > > > >
> > > > > On Sat, Jan 6, 2018 at 3:30 AM kellen sunderland <
> > > > > [email protected]> wrote:
> > > > >
> > > > > > > Hello all, I'd like to propose that we nail down exactly which
> > > > > > > versions of CUDA we're supporting.  We can then ensure that
> > > > > > > we've got good test coverage for those specific versions in CI.
> > > > > > > At the moment it's ambiguous what our current policy is.  I.e.
> > > > > > > when do we drop support for old versions?  As a result we
> > > > > > > potentially cut a release promising to support a certain
> > > > > > > version of CUDA, then retroactively drop support after we find
> > > > > > > an issue.
> > > > > >
> > > > > > > I'd like to propose that we officially support N, and N-1
> > > > > > > versions of CUDA, where N is the most recent major version
> > > > > > > release.  In addition we can do our best to support libraries
> > > > > > > that are available for download for those versions.  Supporting
> > > > > > > these CUDA versions would also dictate which hardware we
> > > > > > > support in terms of compute capability (of course resource
> > > > > > > constraints would also play some role in our ability to support
> > > > > > > some hardware).
> > > > > >
> > > > > > > As an example this would mean that currently we'd officially
> > > > > > > support CUDA 9.* and 8.  This would imply we support cuDNN 5.1
> > > > > > > through 7, as those libraries are available for CUDA 8 and 9.
> > > > > > > It would also mean we support compute capability 3.0-7.x
> > > > > > > (Kepler, Maxwell, Pascal, Volta), taking the more restrictive
> > > > > > > hardware requirements of CUDA 9 into account.
> > > > > >
> > > > > > > What do you all think?  Would this be a reasonable support
> > > > > > > strategy?  Are these the versions you'd like to see covered in
> > > > > > > CI?
> > > > > >
> > > > > > -Kellen
> > > > > >
> > > > > > > A relevant issue:
> > > > > > > https://github.com/apache/incubator-mxnet/issues/8805
> > > > > >
> > > > >
> > > >
> > >
> >
>
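
The support window discussed in this thread (CUDA N and N-1, with hardware
coverage following the compute capability range of the newest CUDA version)
can be sketched in a few lines of Python. This is only an illustration: the
function names and the CUDA_CC_RANGE table are hypothetical, made up for this
example, and are not part of MXNet or its CI. The ranges assume that CUDA 8
can still target Fermi (2.x), while CUDA 9 targets 3.0 through 7.0.

```python
# Major compute capability of each architecture named in the thread.
ARCH_MAJOR = [("Tesla", 1), ("Fermi", 2), ("Kepler", 3),
              ("Maxwell", 5), ("Pascal", 6), ("Volta", 7)]

# Assumption: inclusive (lowest, highest) major compute capability that each
# CUDA toolkit can target. CUDA 9 removed Fermi (2.x) and added Volta (7.0).
CUDA_CC_RANGE = {8: (2, 6), 9: (3, 7)}


def supported_cuda_versions(latest_major):
    """Return [N, N-1] for the proposed support policy."""
    return [latest_major, latest_major - 1]


def architectures_for(cuda_major):
    """List the architectures a given CUDA major version can target."""
    low, high = CUDA_CC_RANGE[cuda_major]
    return [name for name, major in ARCH_MAJOR if low <= major <= high]


def dropped_architectures(old_major, new_major):
    """Architectures targetable by old_major but not by new_major."""
    still_supported = set(architectures_for(new_major))
    return [a for a in architectures_for(old_major) if a not in still_supported]


if __name__ == "__main__":
    print(supported_cuda_versions(9))    # CUDA versions under the N / N-1 policy
    print(architectures_for(9))          # hardware covered by CUDA 9
    print(dropped_architectures(8, 9))   # hardware lost when CUDA 8 is dropped
```

Under these assumptions, moving from CUDA 8 to CUDA 9 drops only Fermi, which
matches the device list Marco posted.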
