I think that sounds great, I'm happy to put together an MR (likely next week) for review.
On Wed, 26 Feb 2025 at 16:11, Junchao Zhang <junchao.zh...@gmail.com> wrote: > This fuction *MatCreateMPIAIJWithSeqAIJ(MPI_Comm comm, Mat A, Mat B, > const PetscInt garray[], Mat *mat) *is rarely used. To compute the global > matrix's row/col size M, N, it has to do an MPI_Allreduce(). I think it is > a waste, as the caller usually knows M, N already. So I think we can depart > from it and have a new one: > > MatCreateMPIAIJKokkosWithSeqAIJKokkos(MPI_Comm comm, PetscInt M, PetscInt > N, Mat A, Mat B, const PetscInt garray[], Mat *mat) > * M, N are global row/col size of mat > * A, B are MATSEQAIJKOKKOS > * M, N could be PETSC_DECIDE, if so, petsc will compute mat's M, N from A, > i.e., M = Sum of A's M, N= Sum of A's N > * if garray is NULL, B uses global column indices (and B's N should be > equal to the output mat's N) > * if garray is not NULL, B uses local column indices; garray[] was > allocated by PetscMalloc() and after the call, garray will be owned by mat > (user should not free garray afterwards). > > What do you think? If you agree, could you contribute an MR? > > BTW, I think we need to create a new header, petscmat_kokkos.hpp to declare > PetscErrorCode MatCreateSeqAIJKokkosWithKokkosCsrMatrix(MPI_Comm comm, > KokkosCsrMatrix csr, Mat *A) > but > PetscErrorCode MatCreateMPIAIJKokkosWithSeqAIJKokkos(MPI_Comm comm, > PetscInt M, PetscInt N, Mat A, Mat B, const PetscInt garray[], Mat *mat) > can be in petscmat.h as it uses only C types. > > Barry, what do you think of the two new APIs? > > --Junchao Zhang > > > On Wed, Feb 26, 2025 at 6:26 AM Steven Dargaville < > dargaville.ste...@gmail.com> wrote: > >> Those two constructors would definitely meet my needs, thanks! >> >> Also I should note that the comment about garray and B in >> MatSetMPIAIJKokkosWithSplitSeqAIJKokkosMatrices is correct if garray is >> passed in as NULL, it's just that if you pass in a completed garray it >> doesn't bother creating one or changing the column indices of B. So I would >> suggest the comment be: "if garray is NULL the offdiag matrix B should >> have global col ids; if garray is not NULL the offdiag matrix B should have >> local col ids" >> >> On Wed, 26 Feb 2025 at 03:35, Junchao Zhang <junchao.zh...@gmail.com> >> wrote: >> >>> Mat_SeqAIJKokkos is private because it is in a private header >>> src/mat/impls/aij/seq/kokkos/aijkok.hpp >>> >>> Your observation about the garray in >>> MatSetMPIAIJKokkosWithSplitSeqAIJKokkosMatrices() >>> might be right. The comment >>> >>> - B - the offdiag matrix using global col ids >>> >>> is out of date. Perhaps it should be "the offdiag matrix uses local >>> column indices and garray contains the local to global mapping". But I >>> need to double check it. >>> >>> Since you use Kokkos, I think we could provide these two constructors >>> for MATSEQAIJKOKKOS and MATMPIAIJKOKKOS respectively >>> >>> - MatCreateSeqAIJKokkosWithKokkosCsrMatrix(MPI_Comm comm, >>> KokkosCsrMatrix csr, Mat *A) >>> >>> >>> - MatCreateMPIAIJKokkosWithSeqAIJKokkos(MPI_Comm comm, Mat A, Mat B, >>> PetscInt *garray, Mat *mat) >>> >>> // To mimic the existing MatCreateMPIAIJWithSeqAIJ(MPI_Comm >>> comm, Mat A, Mat B, const PetscInt garray[], Mat *mat); >>> // A and B are MATSEQAIJKOKKOS matrices and use local column >>> indices >>> >>> Do they meet your needs? >>> >>> --Junchao Zhang >>> >>> >>> On Tue, Feb 25, 2025 at 5:35 PM Steven Dargaville < >>> dargaville.ste...@gmail.com> wrote: >>> >>>> Thanks for the response! >>>> >>>> Although MatSetValuesCOO happens on the device if the input coo_v >>>> pointer is device memory, I believe MatSetPreallocationCOO requires host >>>> pointers for coo_i and coo_j, and the preallocation (and construction of >>>> the COO structures) happens on the host and is then copied onto the device? >>>> I need to be able to create a matrix object with minimal work on the host >>>> (like many of the routines in aijkok.kokkos.cxx do internally). I >>>> originally used the COO interface to build the matrices I need, but that >>>> was around 5x slower than constructing the aij structures myself on the >>>> device and then just directly using the MatSetSeqAIJKokkosWithCSRMatrix >>>> type methods. >>>> >>>> The reason I thought MatSetSeqAIJKokkosWithCSRMatrix could easily be >>>> made public is that the Mat_SeqAIJKokkos constructors are already publicly >>>> accessible? In particular one of those constructors takes in pointers to >>>> the Kokkos dual views which store a,i,j, and hence one can build a >>>> sequential matrix with nothing (or very little) occuring on the host. The >>>> only change I can see that would be necessary is for >>>> MatSetSeqAIJKokkosWithCSRMatrix (or MatCreateSeqAIJKokkosWithCSRMatrix) to >>>> be public is to change the PETSC_INTERN to PETSC_EXTERN? >>>> >>>> For MatSetMPIAIJKokkosWithSplitSeqAIJKokkosMatrices, I believe all that >>>> is required is declaring the method in the .hpp, as it's already defined as >>>> static in mpiaijkok.kokkos.cxx. In particular, the comments >>>> above MatSetMPIAIJKokkosWithSplitSeqAIJKokkosMatrices suggest that the >>>> off-diagonal block B needs to be built with global column ids, with >>>> mpiaij->garray constructed on the host along with the rewriting of the >>>> global column indices in B. This happens in MatSetUpMultiply_MPIAIJ, but >>>> checking the code there shows that if you pass in a non-null garray to >>>> MatSetMPIAIJKokkosWithSplitSeqAIJKokkosMatrices, the construction and >>>> compatification is skipped, meaning B can be built with local column ids as >>>> long as garray is provided on the host (which I also build on the device >>>> and then just copy to the host). Again this is what some of the internal >>>> Kokkos routines rely on, like the matrix-product. >>>> >>>> I am happy to try doing this and submitting a request to the petsc >>>> gitlab if this seems sensible, I just wanted to double check that I wasn't >>>> missing something important? >>>> Thanks >>>> Steven >>>> >>>> On Tue, 25 Feb 2025 at 22:16, Junchao Zhang <junchao.zh...@gmail.com> >>>> wrote: >>>> >>>>> Hi, Steven, >>>>> MatSetSeqAIJKokkosWithCSRMatrix(Mat A, Mat_SeqAIJKokkos *akok) uses >>>>> a private data type Mat_SeqAIJKokkos, so it can not be directly made >>>>> public. >>>>> If you already use COO, then why not directly make the matrix of >>>>> type MATAIJKOKKOS and call MatSetPreallocationCOO() and MatSetValuesCOO()? >>>>> So I am confused by your needs. >>>>> >>>>> Thanks! >>>>> --Junchao Zhang >>>>> >>>>> >>>>> On Tue, Feb 25, 2025 at 3:39 PM Steven Dargaville < >>>>> dargaville.ste...@gmail.com> wrote: >>>>> >>>>>> Hi >>>>>> >>>>>> I'm just wondering if there is any possibility of making: >>>>>> MatSetSeqAIJKokkosWithCSRMatrix or MatCreateSeqAIJKokkosWithCSRMatrix >>>>>> in src/mat/impls/aij/seq/kokkos/aijkok.kokkos.cxx >>>>>> MatSetMPIAIJKokkosWithSplitSeqAIJKokkosMatrices in >>>>>> src/mat/impls/aij/mpi/kokkos/mpiaijkok.kokkos.cxx >>>>>> >>>>>> publicly accessible outside of petsc, or if there is an interface I >>>>>> have missed for creating Kokkos matrices entirely on the device? >>>>>> MatCreateSeqAIJKokkosWithCSRMatrix for example is marked as PETSC_INTERN >>>>>> so >>>>>> I can't link to it. >>>>>> >>>>>> I've currently just copied the code inside of those methods so that I >>>>>> can build without any preallocation on the host (e.g., through the COO >>>>>> interface) and it works really well. >>>>>> >>>>>> Thanks for your help >>>>>> Steven >>>>>> >>>>>