Re: [ViennaCL-support] Is it possible to apply ViennaCL to my problem?

Karl Rupp Sun, 12 Sep 2021 01:57:44 -0700

Hello,

thank you for describing your application in great detail! With a system size of 2000 to 50000 unknowns you are most likely better off staying on the CPU (assuming that your system is indeed rather sparse, with fewer than about 100 nonzeros per row on average). This is because each GPU kernel launch involves a couple of microseconds of latency; this doesn't sound like much, but it accumulates over many kernel launches.
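To put a rough number on the latency argument, here is a back-of-the-envelope sketch. Every figure in it (launches per iteration, iteration count, number of solves, latency per launch) is an illustrative assumption rather than a measurement, and `launch_overhead_seconds` is a hypothetical helper name:

```python
# Back-of-the-envelope estimate of accumulated kernel launch latency.
# All numbers below are illustrative assumptions, not measurements.

def launch_overhead_seconds(launches_per_iteration, iterations, solves, latency_us):
    """Total time spent purely on kernel launch latency, in seconds."""
    total_launches = launches_per_iteration * iterations * solves
    return total_launches * latency_us * 1e-6

overhead = launch_overhead_seconds(
    launches_per_iteration=6,  # assumed: SpMV, dot products, vector updates per solver step
    iterations=500,            # assumed iterative-solver iterations per solve
    solves=1000,               # assumed number of solves in the outer iteration loop
    latency_us=5.0,            # assumed latency per kernel launch, in microseconds
)
print(f"launch latency alone: {overhead:.1f} s")
```

With these assumed numbers the launches alone add up to about 15 seconds, before any useful arithmetic is counted. For small systems the kernels themselves are short, so this fixed cost can dominate.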

Also, with multiple right-hand sides I recommend computing a sparse LU factorization once (PARDISO, SuperLU, etc.) and then applying this factorization to each of the right-hand sides. This will be more efficient than calling iterative solvers (which is the standard approach for GPUs). Sparse factorizations on the GPU don't really work that well and, to the best of my knowledge, just match those on equally powerful CPUs (with similar energy consumption).

Regarding symmetry: You can use the symmetry to compute a sparse Cholesky factorization instead of an LU factorization. This, again, fits better onto a CPU than a GPU.
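The "factor once, solve for many right-hand sides" idea can be sketched with a tiny dense Cholesky in plain Python. This is only a stand-in for a sparse direct solver such as PARDISO: the function names `cholesky` and `solve_with_factor` and the 3x3 example matrix are made up for illustration, and a real code would of course keep the factor sparse.

```python
import math

def cholesky(a):
    """Return lower-triangular L with a = L * L^T (a must be symmetric positive definite)."""
    n = len(a)
    l = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = a[i][j] - sum(l[i][k] * l[j][k] for k in range(j))
            if i == j:
                if s <= 0.0:
                    raise ValueError("matrix is not positive definite")
                l[i][j] = math.sqrt(s)
            else:
                l[i][j] = s / l[j][j]
    return l

def solve_with_factor(l, b):
    """Solve (L * L^T) x = b by forward and backward substitution, reusing the factor L."""
    n = len(l)
    y = [0.0] * n
    for i in range(n):  # forward substitution: L y = b
        y[i] = (b[i] - sum(l[i][k] * y[k] for k in range(i))) / l[i][i]
    x = [0.0] * n
    for i in reversed(range(n)):  # backward substitution: L^T x = y
        x[i] = (y[i] - sum(l[k][i] * x[k] for k in range(i + 1, n))) / l[i][i]
    return x

# Small symmetric positive definite example matrix (illustrative only).
a = [[4.0, 1.0, 0.0],
     [1.0, 3.0, 1.0],
     [0.0, 1.0, 2.0]]
rhs_list = [[1.0, 2.0, 3.0], [0.0, 1.0, 0.0]]

factor = cholesky(a)  # expensive step, done once
solutions = [solve_with_factor(factor, b) for b in rhs_list]  # cheap step, once per rhs
```

The factorization cost is paid a single time, while each additional right-hand side only costs two triangular solves. For the symmetric indefinite case (mtype -2) a plain Cholesky factorization does not apply; the sketch covers the positive definite case only.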

Overall, I *think* that you can use the same parallelization approaches (esp. data structures) for the GPU to also speed up your CPU code (OpenMP, MPI, etc.). In terms of solving these systems, sparse direct solvers on the CPU will be hard to beat at the system sizes you mentioned. Productivity-wise, your best option is most likely to stay with the CPU and not worry about GPUs for this particular problem.


Best regards,
Karli

On 9/10/21 15:28, Arno Gehrer wrote:

Good afternoon!
Maybe you can help me find out whether it would make sense to apply ViennaCL to my problem?
Background:
· In the context of a reverse engineering problem I need to solve a linear system of equations. The number of unknowns is in the range of n=2000 … 50000 and the system needs to be solved many times within an iteration loop.
· The matrix is symmetric, hence only the upper triangle is stored in CSR (compressed sparse row) format.
· I need to solve this system with multiple right-hand side vectors.
· At present, I’m using Intel MKL / PARDISO to solve the linear system with mtype = 2 (real and symmetric positive definite) or -2 (in some cases, the matrix is real and symmetric indefinite), which works very well.
· Recently, I managed to speed up the whole algorithm by setting up the system on the GPU with CUDA, and I’m looking for a suitable library to solve the system on the GPU as well.
  o I have already tried to solve the system with cusparse (using cusolverSpDcsrlsvchol or cusolverSpDcsrlsvqr), which in principle worked. However, I did not find a way to solve for multiple right-hand sides simultaneously, and the symmetric property is not supported by cusolverSp. So I had to extend the matrix to a full matrix and solve the system for each rhs, which in total was much slower than solving the system on the CPU by means of PARDISO.
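For reference, the step of extending an upper-triangle CSR matrix to a full CSR matrix, as mentioned above, might look roughly like the following sketch. It assumes 0-based indices and that the diagonal is stored with the upper triangle; `expand_upper_csr` is a hypothetical helper name, not part of MKL, cuSOLVER or ViennaCL.

```python
def expand_upper_csr(n, row_ptr, col_idx, values):
    """Expand the upper triangle (incl. diagonal) of a symmetric matrix from CSR to full CSR.

    Assumes 0-based indexing; returns (row_ptr, col_idx, values) of the full matrix
    with column indices sorted within each row.
    """
    rows = [[] for _ in range(n)]
    for i in range(n):
        for p in range(row_ptr[i], row_ptr[i + 1]):
            j, v = col_idx[p], values[p]
            rows[i].append((j, v))
            if j != i:
                rows[j].append((i, v))  # mirror the off-diagonal entry into the lower triangle
    full_ptr, full_col, full_val = [0], [], []
    for entries in rows:
        entries.sort()  # sort by column index
        for j, v in entries:
            full_col.append(j)
            full_val.append(v)
        full_ptr.append(len(full_col))
    return full_ptr, full_col, full_val

# Upper triangle of the symmetric matrix [[4, 1, 0], [1, 3, 1], [0, 1, 2]] (illustrative only).
upper_ptr = [0, 2, 4, 5]
upper_col = [0, 1, 1, 2, 2]
upper_val = [4.0, 1.0, 3.0, 1.0, 2.0]

full_ptr, full_col, full_val = expand_upper_csr(3, upper_ptr, upper_col, upper_val)
```

The expansion nearly doubles the number of stored nonzeros, which is part of the extra cost compared to a solver that works on the upper triangle directly.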
So, after this lengthy introduction, my question is:
Is it possible to apply ViennaCL to such a problem, and can I expect a significant speedup compared to MKL?
· The perfect solution would be if I could directly pass the matrix in CSR format and the rhs vectors (which are all stored in GPU memory) to a suitable solver that replaces PARDISO, mtype 2,2 (I currently copy these data to the host and pass them to PARDISO).
My development environment is Win10 (x64) / Visual Studio 2019 / MKL 2017 / CUDA 11.2, and the code also compiles on Linux, where CUDA 7.5 is installed.
Thanks for your feedback,

Arno Gehrer



_______________________________________________
ViennaCL-support mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/viennacl-support


