Hi Satish,
thanks for the pull request. I approve the changes, tweaked the appending of
-Wno-deprecated-gpu-targets so that it also works on my machine, and have
merged everything to next.
* My fixes should alleviate some of the CUSP installation issues. I
don't know enough about the CUSP interface wrt its useful features vs.
other burdens - and whether it's good to drop it or not. [If needed, we
can add more version dependencies in configure]
This should be fine for now. In the long term CUSP may be completely
superseded by NVIDIA's AMGX. Let's see how things develop...
* Wrt CUDA - currently my test is with CUDA-7.5. I can try migrating a
couple of tests to CUDA-9.1 [on frog]. But what about older
releases? Any reason we should drop them? I.e. any reason to bump the
following values?
self.CUDAMinVersion = '5000' # Minimal cuda version is 5.0
self.CUSPMinVersion = '400' # Minimal cusp version is 0.4
See the answer here for a list of CUDA capabilities and defaults:
https://stackoverflow.com/questions/28932864/cuda-compute-capability-requirements
We definitely don't need to support compute architecture 1.x (~10 years old):
it has no double precision support and is hence fairly useless for our
purposes. Thus, we should be absolutely fine with requiring CUDA 7.0 or
higher.
We do change it for complex builds [we don't have a test for this case]:
    if self.defaultScalarType.lower() == 'complex': self.CUDAMinVersion = '7050'
I don't remember the exact reason, but I remember that there is one for
requiring CUDA 7.5 here. Let's use CUDA 7.5 as the minimum for both real and
complex then?
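For reference, the four-digit encoding used in these checks can be sketched as
follows. This is only a minimal illustration, assuming the major*1000 +
minor*10 scheme implied by '5000' for CUDA 5.0 and '7050' for CUDA 7.5; the
helper names are hypothetical, not PETSc's actual configure code:

```python
def encode_cuda_version(major, minor):
    # Hypothetical helper: encode a CUDA release as the four-digit
    # string used in the configure checks quoted above
    # (e.g. 5.0 -> '5000', 7.5 -> '7050').
    return str(major * 1000 + minor * 10)

def meets_minimum(found, minimum):
    # Compare two encoded version strings numerically.
    return int(found) >= int(minimum)

# Raising the minimum to CUDA 7.5 for both real and complex builds
# would then simply mean:
CUDAMinVersion = encode_cuda_version(7, 5)   # '7050'
```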
* Our test GPU is an M2090 - with compute capability 2.0.
CUDA-7.5 works on it. CUDA-8 gives deprecated warnings. CUDA-9 does
not work. So what do we do for such old hardware? Do we keep
CUDA-7.5 as the minimum supported version for an extended time? [At
some point we could switch to CUDA-8 as the minimum - if we can get
rid of the warnings]
Your PR silences the deprecation warnings.
Compute capability 2.0 is fine for our tests for some time to come. We should
certainly upgrade at some point, yet my experience with GPUs is that older
GPUs are actually the better test environment, as they tend to reveal bugs
more quickly than newer hardware.
Best regards,
Karli
* BTW: Wrt --with-cuda-arch, I'm hoping we can get rid of it in favor
of CUDAFLAGS [with defaults similar to the CFLAGS defaults] - but it's
not clear if I can easily untangle the dependencies we have [wrt CPP
- and others].
Or can we get rid of this default altogether [currently
-arch=sm_20] - and expect nvcc to have sane defaults? Then we can
probably eliminate all this complicated code. [If CUDA-7.5 and higher
do this properly - we could use that as the minimum supported version?]
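The fallback being suggested could look roughly like this - a sketch only,
where cuda_arch_flags is a hypothetical helper, not existing configure code:

```python
def cuda_arch_flags(user_arch=None):
    # Hypothetical sketch: pass -arch only when the user explicitly
    # requested one via --with-cuda-arch; otherwise emit nothing and
    # rely on nvcc's own default architecture.
    if user_arch:
        return ['-arch={0}'.format(user_arch)]
    return []
```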
Satish
On Sat, 10 Mar 2018, Karl Rupp wrote:
Hi all,
a couple of notes here, particularly for Manuel:
* CUSP is repeatedly causing such installation problems, hence we will soon
drop it as a vector backend and instead only provide a native
CUBLAS/CUSPARSE-based backend.
* you can use this native CUDA backend already now. Just configure with only
--with-cuda=1 --with-cuda-arch=sm_60 (sm_30 should also work and is
compatible with Tesla K20 GPUs you may find on other clusters).
* The multigrid preconditioner from CUSP is selected via
-pc_type sa_cusp
Make sure you also use -vec_type cusp -mat_type aijcusp
If you don't need the multigrid preconditioner from CUSP, please
just reconfigure and use the native CUDA backend with
-vec_type cuda -mat_type aijcusparse
* Right now only one of {native CUDA, CUSP, ViennaCL} can be activated at
configure time. This will be fixed later this month.
If you're looking for a GPU-accelerated multigrid preconditioner: I just heard
yesterday that NVIDIA's AMGX is now open source. I'll provide a wrapper within
PETSc soon.
As Matt already said: Don't expect much more than a modest speedup over your
existing CPU-based code - provided that your setup is GPU-friendly and your
problem size is appropriate.
Best regards,
Karli
On 03/10/2018 03:38 AM, Satish Balay wrote:
I've updated configure so that --download-cusp gets the
correct/compatible CUSP version - for CUDA 7 and 8 vs. 9.
The changes are in branch balay/cuda-cusp-cleanup - and merged to next.
Satish
On Wed, 7 Mar 2018, Satish Balay wrote:
--download-cusp hardly ever gets used, so it's likely broken.
It needs to be updated to somehow use the correct CUSP version based
on the CUDA version that's being used.
[And since we can't easily check for CUSP compatibility - we should
probably remove the checkCUSPVersion() code]
When using Cuda-9 - you can try options:
--download-cusp=1 --download-cusp-commit=116b090
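The per-CUDA-version selection being asked for could be sketched like this.
This is hypothetical: only the 116b090 commit for CUDA 9 comes from this
thread, and 'COMPAT_COMMIT' is an explicit placeholder, not a real CUSP
revision:

```python
def cusp_commit_for(cuda_major):
    # Hypothetical sketch of configure picking a compatible CUSP
    # snapshot per CUDA release.  '116b090' is the commit suggested
    # in this thread for CUDA 9; 'COMPAT_COMMIT' is a placeholder
    # for whatever older CUSP release matches CUDA 7/8.
    if cuda_major >= 9:
        return '116b090'
    return 'COMPAT_COMMIT'
```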