Hi Satish,
thanks for the pull request. I approve the changes, tweaked the appending of
-Wno-deprecated-gpu-targets so that it also works on my machine, and have
merged everything to next.
* My fixes should alleviate some of the CUSP installation issues. I
don't know enough about the CUSP interface wrt its useful features vs.
other burdens - and whether it's good to drop it or not. [If needed, we
can add more version dependencies in configure]
This should be fine for now. In the long term CUSP may be completely
superseded by NVIDIA's AMGX. Let's see how things develop...
* Wrt CUDA - currently my test is with CUDA-7.5. I can try migrating a
couple of tests to CUDA-9.1 [on frog]. But what about older
releases? Any reason we should drop them? I.e. any reason to bump the
following values?
self.CUDAMinVersion = '5000' # Minimal cuda version is 5.0
self.CUSPMinVersion = '400' # Minimal cusp version is 0.4
See the answer here for a list of CUDA capabilities and defaults:
https://stackoverflow.com/questions/28932864/cuda-compute-capability-requirements
We definitely don't need to support compute architecture 1.x (~10 years old):
it has no double precision support and is hence fairly useless for our
purposes. Thus, we should be absolutely fine with requiring CUDA 7.0 or
higher.
We do change it for complex builds [we don't have a test for this case]:
    if self.defaultScalarType.lower() == 'complex': self.CUDAMinVersion = '7050'
I don't remember the exact reason, but I remember that there is one for
requiring CUDA 7.5 here. Let's use CUDA 7.5 as the minimum for both real and
complex then?
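For reference, the four-digit encoding used in these checks can be sketched as
follows. This is only a minimal illustration, assuming the major*1000 +
minor*10 scheme implied by '5000' for CUDA 5.0 and '7050' for CUDA 7.5; the
helper names are hypothetical, not PETSc's actual configure code:

```python
def encode_cuda_version(major, minor):
    # Hypothetical helper: encode a CUDA release as the four-digit
    # string used in the configure checks quoted above
    # (e.g. 5.0 -> '5000', 7.5 -> '7050').
    return str(major * 1000 + minor * 10)

def meets_minimum(found, minimum):
    # Compare two encoded version strings numerically.
    return int(found) >= int(minimum)

# Raising the minimum to CUDA 7.5 for both real and complex builds
# would then simply mean:
CUDAMinVersion = encode_cuda_version(7, 5)   # '7050'
```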
* Our test GPU is an M2090 - with compute capability 2.0.
CUDA-7.5 works on it. CUDA-8 gives deprecated warnings. CUDA-9 does
not work. So what do we do for such old hardware? Do we keep
CUDA-7.5 as the minimum supported version for an extended time? [At
some point we could switch to CUDA-8 as the minimum - if we can get
rid of the warnings]
Your PR silences the deprecation warnings.
Compute capability 2.0 is fine for our tests for some time to come. We should
certainly upgrade at some point, yet my experience with GPUs is that older
GPUs are actually the better test environment, as they tend to reveal bugs
more quickly than newer hardware.
Best regards,
Karli
* BTW: Wrt --with-cuda-arch, I'm hoping we can get rid of it in favor
of CUDAFLAGS [with defaults similar to the CFLAGS defaults] - but it's
not clear if I can easily untangle the dependencies we have [wrt CPP
- and others].
Or can we get rid of this default altogether [currently
-arch=sm_20] - and expect nvcc to have sane defaults? Then we can
probably eliminate all this complicated code. [If CUDA-7.5 and higher
do this properly - we could use that as the minimum supported version?]
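The fallback being suggested could look roughly like this - a sketch only,
where cuda_arch_flags is a hypothetical helper, not existing configure code:

```python
def cuda_arch_flags(user_arch=None):
    # Hypothetical sketch: pass -arch only when the user explicitly
    # requested one via --with-cuda-arch; otherwise emit nothing and
    # rely on nvcc's own default architecture.
    if user_arch:
        return ['-arch={0}'.format(user_arch)]
    return []
```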
Satish
On Sat, 10 Mar 2018, Karl Rupp wrote:
Hi all,
a couple of notes here, particularly for Manuel:
* CUSP is repeatedly causing such installation problems, hence we will soon
drop it as a vector backend and instead only provide a native
CUBLAS/CUSPARSE-based backend.
* you can use this native CUDA backend already now. Just configure with only
--with-cuda=1 --with-cuda-arch=sm_60 (sm_30 should also work and is
compatible with Tesla K20 GPUs you may find on other clusters).
* The multigrid preconditioner from CUSP is selected via
-pc_type sa_cusp
Make sure you also use -vec_type cusp -mat_type aijcusp
If you don't need the multigrid preconditioner from CUSP, please
just reconfigure and use the native CUDA backend with
-vec_type cuda -mat_type aijcusparse
* Right now only one of {native CUDA, CUSP, ViennaCL} can be activated at
configure time. This will be fixed later this month.
If you're looking for a GPU-accelerated multigrid preconditioner: I just heard
yesterday that NVIDIA's AMGX is now open source. I'll provide a wrapper within
PETSc soon.
As Matt already said: Don't expect much more than a modest speedup over your
existing CPU-based code - provided that your setup is GPU-friendly and your
problem size is appropriate.
Best regards,
Karli
On 03/10/2018 03:38 AM, Satish Balay wrote:
I've updated configure so that --download-cusp gets the
correct/compatible CUSP version - for CUDA 7 and 8 vs. 9.
The changes are in branch balay/cuda-cusp-cleanup - and merged to next.
Satish
On Wed, 7 Mar 2018, Satish Balay wrote:
--download-cusp hardly ever gets used, so it's likely broken.
It needs to be updated to somehow use the correct CUSP version based
on the CUDA version that's being used.
[And since we can't easily check for CUSP compatibility - we should
probably remove the checkCUSPVersion() code]
When using Cuda-9 - you can try options:
--download-cusp=1 --download-cusp-commit=116b090
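The per-CUDA-version selection being asked for could be sketched like this.
This is hypothetical: only the 116b090 commit for CUDA 9 comes from this
thread, and 'COMPAT_COMMIT' is an explicit placeholder, not a real CUSP
revision:

```python
def cusp_commit_for(cuda_major):
    # Hypothetical sketch of configure picking a compatible CUSP
    # snapshot per CUDA release.  '116b090' is the commit suggested
    # in this thread for CUDA 9; 'COMPAT_COMMIT' is a placeholder
    # for whatever older CUSP release matches CUDA 7/8.
    if cuda_major >= 9:
        return '116b090'
    return 'COMPAT_COMMIT'
```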