On Wed, Feb 26, 2025 at 11:27 AM Steven Dargaville <dargaville.ste...@gmail.com> wrote:
> Ok so just to double check the things I should do:
>
> 1. Create a new header for MatCreateSeqAIJKokkosWithCSRMatrix (and declare it PETSC_EXTERN) so users can call the existing method and build a seqaijkokkos matrix with no host involvement.

No, we already have a private MatCreateSeqAIJKokkosWithKokkosCsrMatrix(MPI_Comm comm, KokkosCsrMatrix csr, Mat *A); you just need to make it public in a new header, petscmat_kokkos.hpp. BTW, I am also thinking of MatCreateSeqAIJKokkosWithKokkosViews(MPI_Comm comm, PetscInt m, PetscInt n, Kokkos::View<PetscInt*> i, Kokkos::View<PetscInt*> j, Kokkos::View<PetscScalar*> a, Mat *mat), since we already have MatCreateSeqAIJWithArrays(MPI_Comm comm, PetscInt m, PetscInt n, PetscInt i[], PetscInt j[], PetscScalar a[], Mat *mat). The benefit is that we would not need to include <KokkosSparse_CrsMatrix.hpp> in petscmat_kokkos.hpp, keeping the coupling between PETSc and Kokkos to a minimum. But either is fine with me.

> 2. Modify MatCreateMPIAIJWithSeqAIJ (or equivalent) so it does the same thing as MatSetMPIAIJKokkosWithSplitSeqAIJKokkosMatrices in the case that A and B are seqaijkokkos matrices.

Keep the existing MatCreateMPIAIJWithSeqAIJ() but deprecate it in favor of a new MatCreateMPIXAIJWithSeqXAIJ(MPI_Comm comm, PetscInt M, PetscInt N, Mat A, Mat B, const PetscInt garray[], Mat *mat). The new function should handle the case where A and B are MATSEQAIJKOKKOS.

> 3. Potentially remove MatSetMPIAIJKokkosWithSplitSeqAIJKokkosMatrices given it would be redundant?

Yes, remove it and use the new API at the places currently calling it.
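To make the intended usage concrete, here is a rough sketch of the whole flow with these proposed APIs. None of this exists in PETSc yet: the two function declarations below are the names proposed in this thread, and the KokkosCsrMatrix typedef is assumed to match PETSc's internal one. Details may well change in the MR.

  #include <petscmat.h>
  #include <Kokkos_Core.hpp>
  #include <KokkosSparse_CrsMatrix.hpp>

  /* Assumed to match PETSc's internal typedef */
  using KokkosCsrMatrix = KokkosSparse::CrsMatrix<PetscScalar, PetscInt, Kokkos::DefaultExecutionSpace, void, PetscInt>;

  /* Proposed in this thread; not existing public PETSc API */
  PETSC_EXTERN PetscErrorCode MatCreateSeqAIJKokkosWithKokkosCsrMatrix(MPI_Comm, KokkosCsrMatrix, Mat *);
  PETSC_EXTERN PetscErrorCode MatCreateMPIXAIJWithSeqXAIJ(MPI_Comm, PetscInt, PetscInt, Mat, Mat, const PetscInt[], Mat *);

  static PetscErrorCode BuildMPIAIJKokkosOnDevice(MPI_Comm comm, PetscInt m, PetscInt n, PetscInt M, PetscInt N,
                                                  PetscInt nnzd, PetscInt nnzo, PetscInt nghost, Mat *mat)
  {
    Mat       Ad, Ao;
    PetscInt *garray;

    PetscFunctionBegin;
    /* Diagonal block: CSR arrays live on the device and are filled by user kernels */
    Kokkos::View<PetscInt *>    di("di", m + 1), dj("dj", nnzd);
    Kokkos::View<PetscScalar *> da("da", nnzd);
    /* ... fill di/dj/da on the device ... */
    PetscCall(MatCreateSeqAIJKokkosWithKokkosCsrMatrix(PETSC_COMM_SELF, KokkosCsrMatrix("Ad", m, n, nnzd, da, di, dj), &Ad));

    /* Off-diagonal block: local column indices 0..nghost-1, plus garray[] mapping local to global columns */
    Kokkos::View<PetscInt *>    oi("oi", m + 1), oj("oj", nnzo);
    Kokkos::View<PetscScalar *> oa("oa", nnzo);
    /* ... fill oi/oj/oa and the local-to-global map on the device ... */
    PetscCall(PetscMalloc1(nghost, &garray));
    /* ... copy the local-to-global map into garray on the host ... */
    PetscCall(MatCreateSeqAIJKokkosWithKokkosCsrMatrix(PETSC_COMM_SELF, KokkosCsrMatrix("Ao", m, nghost, nnzo, oa, oi, oj), &Ao));

    /* Assemble the parallel matrix; M, N could also be PETSC_DECIDE, and ownership of garray passes to *mat */
    PetscCall(MatCreateMPIXAIJWithSeqXAIJ(comm, M, N, Ad, Ao, garray, mat));
    PetscFunctionReturn(PETSC_SUCCESS);
  }

The only host-side work here is the PetscMalloc'd garray copy; the CSR arrays of both blocks stay on the device, which is the point of making these constructors public.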
> On Wed, 26 Feb 2025 at 17:15, Junchao Zhang <junchao.zh...@gmail.com> wrote:
>
>> That is a good idea. Perhaps a new MatCreateMPIXAIJWithSeqXAIJ(MPI_Comm comm, PetscInt M, PetscInt N, Mat A, Mat B, const PetscInt garray[], Mat *mat), since garray[] is only meaningful for MATAIJ and subclasses.
>>
>> --Junchao Zhang
>>
>> On Wed, Feb 26, 2025 at 11:02 AM Barry Smith <bsm...@petsc.dev> wrote:
>>
>>> The new function doesn't seem to have anything to do with Kokkos, so why have any new functions? Just have MatCreateMPIAIJWithSeqAIJ() work properly when the two matrices are Kokkos (or CUDA or HIP). Or, if you want to eliminate the global reduction, maybe make your new function MatCreateMPIWithSeq(), have it work for any type of submatrix, and eventually we could deprecate MatCreateMPIAIJWithSeqAIJ().
>>>
>>> Barry
>>>
>>> On Feb 26, 2025, at 11:54 AM, Steven Dargaville <dargaville.ste...@gmail.com> wrote:
>>>
>>> I think that sounds great. I'm happy to put together an MR (likely next week) for review.
>>>
>>> On Wed, 26 Feb 2025 at 16:11, Junchao Zhang <junchao.zh...@gmail.com> wrote:
>>>
>>>> This function MatCreateMPIAIJWithSeqAIJ(MPI_Comm comm, Mat A, Mat B, const PetscInt garray[], Mat *mat) is rarely used. To compute the global matrix's row/col sizes M, N, it has to do an MPI_Allreduce(). I think that is a waste, as the caller usually knows M, N already. So I think we can depart from it and have a new one:
>>>>
>>>> MatCreateMPIAIJKokkosWithSeqAIJKokkos(MPI_Comm comm, PetscInt M, PetscInt N, Mat A, Mat B, const PetscInt garray[], Mat *mat)
>>>> * M, N are the global row/col sizes of mat
>>>> * A, B are MATSEQAIJKOKKOS
>>>> * M, N could be PETSC_DECIDE; if so, PETSc will compute mat's M, N from A, i.e., M = sum of A's M, N = sum of A's N
>>>> * if garray is NULL, B uses global column indices (and B's N should be equal to the output mat's N)
>>>> * if garray is not NULL, B uses local column indices; garray[] was allocated by PetscMalloc() and after the call garray will be owned by mat (the user should not free garray afterwards)
>>>>
>>>> What do you think? If you agree, could you contribute an MR?
>>>>
>>>> BTW, I think we need to create a new header, petscmat_kokkos.hpp, to declare
>>>>   PetscErrorCode MatCreateSeqAIJKokkosWithKokkosCsrMatrix(MPI_Comm comm, KokkosCsrMatrix csr, Mat *A)
>>>> but
>>>>   PetscErrorCode MatCreateMPIAIJKokkosWithSeqAIJKokkos(MPI_Comm comm, PetscInt M, PetscInt N, Mat A, Mat B, const PetscInt garray[], Mat *mat)
>>>> can be in petscmat.h as it uses only C types.
>>>>
>>>> Barry, what do you think of the two new APIs?
>>>>
>>>> --Junchao Zhang
>>>>
>>>> On Wed, Feb 26, 2025 at 6:26 AM Steven Dargaville <dargaville.ste...@gmail.com> wrote:
>>>>
>>>>> Those two constructors would definitely meet my needs, thanks!
>>>>>
>>>>> Also, I should note that the comment about garray and B in MatSetMPIAIJKokkosWithSplitSeqAIJKokkosMatrices is correct if garray is passed in as NULL; it's just that if you pass in a completed garray it doesn't bother creating one or changing the column indices of B. So I would suggest the comment be: "if garray is NULL the offdiag matrix B should have global col ids; if garray is not NULL the offdiag matrix B should have local col ids".
>>>>>
>>>>> On Wed, 26 Feb 2025 at 03:35, Junchao Zhang <junchao.zh...@gmail.com> wrote:
>>>>>
>>>>>> Mat_SeqAIJKokkos is private because it is in a private header, src/mat/impls/aij/seq/kokkos/aijkok.hpp.
>>>>>>
>>>>>> Your observation about the garray in MatSetMPIAIJKokkosWithSplitSeqAIJKokkosMatrices() might be right. The comment
>>>>>>
>>>>>>   - B - the offdiag matrix using global col ids
>>>>>>
>>>>>> is out of date. Perhaps it should be "the offdiag matrix uses local column indices and garray contains the local to global mapping". But I need to double check it.
>>>>>>
>>>>>> Since you use Kokkos, I think we could provide these two constructors for MATSEQAIJKOKKOS and MATMPIAIJKOKKOS respectively:
>>>>>>
>>>>>> - MatCreateSeqAIJKokkosWithKokkosCsrMatrix(MPI_Comm comm, KokkosCsrMatrix csr, Mat *A)
>>>>>>
>>>>>> - MatCreateMPIAIJKokkosWithSeqAIJKokkos(MPI_Comm comm, Mat A, Mat B, PetscInt *garray, Mat *mat)
>>>>>>   // To mimic the existing MatCreateMPIAIJWithSeqAIJ(MPI_Comm comm, Mat A, Mat B, const PetscInt garray[], Mat *mat);
>>>>>>   // A and B are MATSEQAIJKOKKOS matrices and use local column indices
>>>>>>
>>>>>> Do they meet your needs?
>>>>>>
>>>>>> --Junchao Zhang
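To make the garray convention being discussed here concrete, a toy example (made-up numbers; one rank whose off-diagonal block B only touches global columns 5 and 9):

  /* B stored with global column ids (the garray == NULL case): */
  PetscInt j_B[]    = {5, 9, 5, 9};   /* column indices of B's nonzeros */
  /* The same B stored with local column ids plus a local-to-global map (the garray != NULL case): */
  PetscInt j_Bloc[] = {0, 1, 0, 1};
  PetscInt garray[] = {5, 9};         /* garray[local] = global */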
>>>>>> On Tue, Feb 25, 2025 at 5:35 PM Steven Dargaville <dargaville.ste...@gmail.com> wrote:
>>>>>>
>>>>>>> Thanks for the response!
>>>>>>>
>>>>>>> Although MatSetValuesCOO happens on the device if the input coo_v pointer is device memory, I believe MatSetPreallocationCOO requires host pointers for coo_i and coo_j, and the preallocation (and construction of the COO structures) happens on the host and is then copied onto the device? I need to be able to create a matrix object with minimal work on the host (like many of the routines in aijkok.kokkos.cxx do internally). I originally used the COO interface to build the matrices I need, but that was around 5x slower than constructing the aij structures myself on the device and then just directly using the MatSetSeqAIJKokkosWithCSRMatrix type methods.
>>>>>>>
>>>>>>> The reason I thought MatSetSeqAIJKokkosWithCSRMatrix could easily be made public is that the Mat_SeqAIJKokkos constructors are already publicly accessible? In particular, one of those constructors takes in pointers to the Kokkos dual views which store a, i, j, and hence one can build a sequential matrix with nothing (or very little) occurring on the host. The only change I can see that would be necessary for MatSetSeqAIJKokkosWithCSRMatrix (or MatCreateSeqAIJKokkosWithCSRMatrix) to be public is to change the PETSC_INTERN to PETSC_EXTERN?
>>>>>>>
>>>>>>> For MatSetMPIAIJKokkosWithSplitSeqAIJKokkosMatrices, I believe all that is required is declaring the method in the .hpp, as it's already defined as static in mpiaijkok.kokkos.cxx. In particular, the comments above MatSetMPIAIJKokkosWithSplitSeqAIJKokkosMatrices suggest that the off-diagonal block B needs to be built with global column ids, with mpiaij->garray constructed on the host along with the rewriting of the global column indices in B. This happens in MatSetUpMultiply_MPIAIJ, but checking the code there shows that if you pass in a non-null garray to MatSetMPIAIJKokkosWithSplitSeqAIJKokkosMatrices, the construction and compactification are skipped, meaning B can be built with local column ids as long as garray is provided on the host (which I also build on the device and then just copy to the host). Again, this is what some of the internal Kokkos routines rely on, like the matrix product.
>>>>>>>
>>>>>>> I am happy to try doing this and submitting a merge request to the PETSc GitLab if this seems sensible; I just wanted to double check that I wasn't missing something important?
>>>>>>>
>>>>>>> Thanks
>>>>>>> Steven
>>>>>>>
>>>>>>> On Tue, 25 Feb 2025 at 22:16, Junchao Zhang <junchao.zh...@gmail.com> wrote:
>>>>>>>
>>>>>>>> Hi, Steven,
>>>>>>>>
>>>>>>>> MatSetSeqAIJKokkosWithCSRMatrix(Mat A, Mat_SeqAIJKokkos *akok) uses a private data type, Mat_SeqAIJKokkos, so it cannot be directly made public. If you already use COO, then why not directly make the matrix of type MATAIJKOKKOS and call MatSetPreallocationCOO() and MatSetValuesCOO()? So I am confused by your needs.
>>>>>>>>
>>>>>>>> Thanks!
>>>>>>>> --Junchao Zhang
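For reference, the COO path Junchao refers to here is existing PETSc API and looks roughly like this (a minimal sketch, assuming the (i, j, v) triplets have already been computed; per the discussion above, coo_i/coo_j are host arrays, while coo_v may point to device memory for a MATAIJKOKKOS matrix):

  #include <petscmat.h>

  static PetscErrorCode BuildWithCOO(MPI_Comm comm, PetscInt m, PetscInt n, PetscInt M, PetscInt N,
                                     PetscCount ncoo, PetscInt coo_i[], PetscInt coo_j[],
                                     const PetscScalar *coo_v, Mat *A)
  {
    PetscFunctionBegin;
    PetscCall(MatCreate(comm, A));
    PetscCall(MatSetSizes(*A, m, n, M, N));
    PetscCall(MatSetType(*A, MATAIJKOKKOS));
    PetscCall(MatSetPreallocationCOO(*A, ncoo, coo_i, coo_j)); /* preallocation and COO-to-AIJ mapping */
    PetscCall(MatSetValuesCOO(*A, coo_v, INSERT_VALUES));      /* values, on the device if coo_v is device memory */
    PetscFunctionReturn(PETSC_SUCCESS);
  }

As Steven notes above, this path was around 5x slower for his use case than building the split CSR blocks directly on the device, which is the motivation for exposing the constructors discussed in this thread.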
>>>>>>>> On Tue, Feb 25, 2025 at 3:39 PM Steven Dargaville <dargaville.ste...@gmail.com> wrote:
>>>>>>>>
>>>>>>>>> Hi
>>>>>>>>>
>>>>>>>>> I'm just wondering if there is any possibility of making:
>>>>>>>>>   MatSetSeqAIJKokkosWithCSRMatrix or MatCreateSeqAIJKokkosWithCSRMatrix in src/mat/impls/aij/seq/kokkos/aijkok.kokkos.cxx
>>>>>>>>>   MatSetMPIAIJKokkosWithSplitSeqAIJKokkosMatrices in src/mat/impls/aij/mpi/kokkos/mpiaijkok.kokkos.cxx
>>>>>>>>> publicly accessible outside of PETSc, or if there is an interface I have missed for creating Kokkos matrices entirely on the device? MatCreateSeqAIJKokkosWithCSRMatrix, for example, is marked as PETSC_INTERN so I can't link to it.
>>>>>>>>>
>>>>>>>>> I've currently just copied the code inside of those methods so that I can build without any preallocation on the host (e.g., through the COO interface), and it works really well.
>>>>>>>>>
>>>>>>>>> Thanks for your help
>>>>>>>>> Steven