Re: [petsc-dev] Why no SpGEMM support in AIJCUSPARSE and AIJVIENNACL?

Karl Rupp via petsc-dev Fri, 04 Oct 2019 02:55:16 -0700

Hi Richard,

Do you have any experience with nsparse?
https://github.com/EBD-CREST/nsparse

I've seen claims that it is much faster than cuSPARSE for sparse
matrix-matrix products.
I haven't tried nsparse, no.
But since the performance comes from a hardware feature (cache), I would be surprised if there is a big performance leap over ViennaCL. (There's certainly some potential for some tweaking of ViennaCL's kernels; but note that even ViennaCL is much faster than cuSPARSE's spGEMM on average).
With the libaxb-wrapper we can just add nsparse as an operations backend and then easily try it out and compare against the other packages. In the end it doesn't matter which package provides the best performance; we just want to leverage it :-)
I'd be happy to add support for this (though I suppose I should play with it first to verify that it is, in fact, worthwhile). Karl, is your branch with libaxb ready for people to start using it, or should we wait for you to do more with it? (Or, would you like any help with it?)

I still need to add the matrix class to the merge request. Should only take me a couple of hours, but I've got an extremely important deadline on October 15 that will prevent me from doing anything before then.

I'd like to try to add support for a few things like cuSPARSE SpGEMM before I go to the Summit hackathon, but I don't want to write a bunch of code that will be thrown away once your libaxb approach is in place.

I should be able to provide a good playground on time for the Summit hackathon. In the meantime you can try the matrix market reader of nsparse directly and see what you get, especially compared to cuSPARSE and MKL.


Best regards,
Karli

Karl Rupp via petsc-dev writes:
Hi Richard,

CPU spGEMM is about twice as fast even on the GPU-friendly case of a
single rank: http://viennacl.sourceforge.net/viennacl-benchmarks-spmm.html
I agree that it would be good to have a GPU-MatMatMult for the sake of
experiments. Under these performance constraints it's not top priority,
though.

Best regards,
Karli


On 10/3/19 12:00 AM, Mills, Richard Tran via petsc-dev wrote:
Fellow PETSc developers,

I am wondering why the AIJCUSPARSE and AIJVIENNACL matrix types do not
support the sparse matrix-matrix multiplication (SpGEMM, or MatMatMult()
in PETSc parlance) routines provided by cuSPARSE and ViennaCL,
respectively. Is there a good reason that I shouldn't add those? My
guess is that support was not added because SpGEMM is hard to do well on
a GPU compared to many CPUs (it is hard to compete with, say, Intel Xeon
CPUs with their huge caches) and it has been the case that one would
generally be better off doing these operations on the CPU. Since the
trend at the big supercomputing centers seems to be to put more and more
of the computational power into GPUs, I'm thinking that I should add the
option to use the GPU library routines for SpGEMM, though. Is there some
good reason to *not* do this that I am not aware of? (Maybe the CPUs are
better for this even on a machine like Summit, but I think we're at the
point that we should at least be able to experimentally verify this.)

--Richard
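
For readers who have not looked at SpGEMM closely, here is a minimal
sketch of a row-by-row (Gustavson-style) product of two CSR matrices in
plain Python. It is only an illustration of the operation discussed in
this thread, not how PETSc, cuSPARSE, ViennaCL, or nsparse implement it.
The function name spgemm_csr and its argument names are hypothetical.
The per-row accumulator is updated at scattered column indices, which is
one plausible reason the operation tends to reward large caches.

```python
def spgemm_csr(n_rows, a_ptr, a_idx, a_val, b_ptr, b_idx, b_val):
    """Compute C = A * B for CSR inputs; returns (c_ptr, c_idx, c_val).

    Hypothetical helper for illustration only. A has n_rows rows; the
    column indices of A must be valid row indices of B.
    """
    c_ptr, c_idx, c_val = [0], [], []
    for i in range(n_rows):
        acc = {}  # sparse accumulator for row i of C (column -> value)
        for p in range(a_ptr[i], a_ptr[i + 1]):
            k, a_ik = a_idx[p], a_val[p]
            # Scale row k of B by A[i, k] and scatter it into the accumulator.
            for q in range(b_ptr[k], b_ptr[k + 1]):
                j = b_idx[q]
                acc[j] = acc.get(j, 0.0) + a_ik * b_val[q]
        # Emit row i of C with column indices in ascending order.
        for j in sorted(acc):
            c_idx.append(j)
            c_val.append(acc[j])
        c_ptr.append(len(c_idx))
    return c_ptr, c_idx, c_val
```

Note that the number of nonzeros in each row of C is not known until the
row has been computed, so the output cannot simply be preallocated from
the sizes of A and B.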
