That is a good idea. Perhaps add a new

  MatCreateMPIXAIJWithSeqXAIJ(MPI_Comm comm, PetscInt M, PetscInt N, Mat A, Mat B, const PetscInt garray[], Mat *mat)

since garray[] is only meaningful for MATAIJ and its subclasses.
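
A rough sketch of how a caller might use it (hypothetical: MatCreateMPIXAIJWithSeqXAIJ() does not exist yet, and the wrapper below is only for illustration):

  #include <petscmat.h>

  /* Hypothetical sketch of the proposed API; MatCreateMPIXAIJWithSeqXAIJ() is not an
     existing PETSc function.  A is the sequential diagonal block, B the sequential
     off-diagonal block (any AIJ subclass, e.g. MATSEQAIJ or MATSEQAIJKOKKOS), and
     garray[] maps B's local column indices to global columns. */
  static PetscErrorCode BuildMPIMat(MPI_Comm comm, PetscInt M, PetscInt N, Mat A, Mat B, const PetscInt garray[], Mat *mat)
  {
    PetscFunctionBeginUser;
    /* Passing M and N explicitly avoids the MPI_Allreduce() that the existing
       MatCreateMPIAIJWithSeqAIJ() needs to discover the global sizes. */
    PetscCall(MatCreateMPIXAIJWithSeqXAIJ(comm, M, N, A, B, garray, mat));
    PetscFunctionReturn(PETSC_SUCCESS);
  }
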
--Junchao Zhang

On Wed, Feb 26, 2025 at 11:02 AM Barry Smith <bsm...@petsc.dev> wrote:

> The new function doesn't seem to have anything to do with Kokkos, so why have any new functions? Just have MatCreateMPIAIJWithSeqAIJ() work properly when the two matrices are Kokkos (or CUDA or HIP). Or, if you want to eliminate the global reduction, maybe make your new function MatCreateMPIWithSeq() and have it work for any type of submatrix, and eventually we could deprecate MatCreateMPIAIJWithSeqAIJ().
>
> Barry
>
> On Feb 26, 2025, at 11:54 AM, Steven Dargaville <dargaville.ste...@gmail.com> wrote:
>
> I think that sounds great, I'm happy to put together an MR (likely next week) for review.
>
> On Wed, 26 Feb 2025 at 16:11, Junchao Zhang <junchao.zh...@gmail.com> wrote:
>
>> This function MatCreateMPIAIJWithSeqAIJ(MPI_Comm comm, Mat A, Mat B, const PetscInt garray[], Mat *mat) is rarely used. To compute the global matrix's row/col sizes M, N, it has to do an MPI_Allreduce(). I think it is a waste, as the caller usually knows M and N already. So I think we can depart from it and have a new one:
>>
>>   MatCreateMPIAIJKokkosWithSeqAIJKokkos(MPI_Comm comm, PetscInt M, PetscInt N, Mat A, Mat B, const PetscInt garray[], Mat *mat)
>>
>> * M, N are the global row/col sizes of mat
>> * A, B are MATSEQAIJKOKKOS
>> * M, N could be PETSC_DECIDE; if so, PETSc will compute mat's M, N from A, i.e., M = sum of A's M, N = sum of A's N
>> * if garray is NULL, B uses global column indices (and B's N should be equal to the output mat's N)
>> * if garray is not NULL, B uses local column indices; garray[] was allocated by PetscMalloc() and, after the call, garray will be owned by mat (the user should not free garray afterwards)
>>
>> What do you think? If you agree, could you contribute an MR?
>>
>> BTW, I think we need to create a new header, petscmat_kokkos.hpp, to declare
>>
>>   PetscErrorCode MatCreateSeqAIJKokkosWithKokkosCsrMatrix(MPI_Comm comm, KokkosCsrMatrix csr, Mat *A)
>>
>> but
>>
>>   PetscErrorCode MatCreateMPIAIJKokkosWithSeqAIJKokkos(MPI_Comm comm, PetscInt M, PetscInt N, Mat A, Mat B, const PetscInt garray[], Mat *mat)
>>
>> can be in petscmat.h as it uses only C types.
>>
>> Barry, what do you think of the two new APIs?
>>
>> --Junchao Zhang
>>
>> On Wed, Feb 26, 2025 at 6:26 AM Steven Dargaville <dargaville.ste...@gmail.com> wrote:
>>
>>> Those two constructors would definitely meet my needs, thanks!
>>>
>>> Also, I should note that the comment about garray and B in MatSetMPIAIJKokkosWithSplitSeqAIJKokkosMatrices is correct if garray is passed in as NULL; it's just that if you pass in a completed garray it doesn't bother creating one or changing the column indices of B. So I would suggest the comment be: "if garray is NULL the offdiag matrix B should have global col ids; if garray is not NULL the offdiag matrix B should have local col ids".
>>>
>>> On Wed, 26 Feb 2025 at 03:35, Junchao Zhang <junchao.zh...@gmail.com> wrote:
>>>
>>>> Mat_SeqAIJKokkos is private because it is in a private header, src/mat/impls/aij/seq/kokkos/aijkok.hpp.
>>>>
>>>> Your observation about the garray in MatSetMPIAIJKokkosWithSplitSeqAIJKokkosMatrices() might be right. The comment
>>>>
>>>>   B - the offdiag matrix using global col ids
>>>>
>>>> is out of date. Perhaps it should be "the offdiag matrix uses local column indices and garray contains the local to global mapping". But I need to double check it.
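>>>>
>>>> To spell out the convention being discussed, a made-up illustration (not code taken from the PETSc source), where B references three distinct off-process columns and so stores local column indices 0..2:
>>>>
>>>>   /* Illustration only: sizes and values are invented. */
>>>>   const PetscInt garray[3] = {7, 42, 100};  /* local col 0 -> global col 7, 1 -> 42, 2 -> 100 */
>>>>   /* An entry stored at (i, 2) in B then contributes to global column garray[2] == 100 of the
>>>>      assembled matrix; with garray == NULL, B would instead store 7, 42, 100 directly as its
>>>>      column indices. */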
>>>>
>>>> Since you use Kokkos, I think we could provide these two constructors, for MATSEQAIJKOKKOS and MATMPIAIJKOKKOS respectively:
>>>>
>>>>   - MatCreateSeqAIJKokkosWithKokkosCsrMatrix(MPI_Comm comm, KokkosCsrMatrix csr, Mat *A)
>>>>
>>>>   - MatCreateMPIAIJKokkosWithSeqAIJKokkos(MPI_Comm comm, Mat A, Mat B, PetscInt *garray, Mat *mat)
>>>>       // To mimic the existing MatCreateMPIAIJWithSeqAIJ(MPI_Comm comm, Mat A, Mat B, const PetscInt garray[], Mat *mat);
>>>>       // A and B are MATSEQAIJKOKKOS matrices and use local column indices
>>>>
>>>> Do they meet your needs?
>>>>
>>>> --Junchao Zhang
>>>>
>>>> On Tue, Feb 25, 2025 at 5:35 PM Steven Dargaville <dargaville.ste...@gmail.com> wrote:
>>>>
>>>>> Thanks for the response!
>>>>>
>>>>> Although MatSetValuesCOO happens on the device if the input coo_v pointer is device memory, I believe MatSetPreallocationCOO requires host pointers for coo_i and coo_j, and the preallocation (and construction of the COO structures) happens on the host and is then copied onto the device? I need to be able to create a matrix object with minimal work on the host (like many of the routines in aijkok.kokkos.cxx do internally). I originally used the COO interface to build the matrices I need, but that was around 5x slower than constructing the AIJ structures myself on the device and then directly using the MatSetSeqAIJKokkosWithCSRMatrix-type methods.
>>>>>
>>>>> The reason I thought MatSetSeqAIJKokkosWithCSRMatrix could easily be made public is that the Mat_SeqAIJKokkos constructors are already publicly accessible? In particular, one of those constructors takes in pointers to the Kokkos dual views which store a, i, j, and hence one can build a sequential matrix with nothing (or very little) occurring on the host. The only change I can see that would be necessary for MatSetSeqAIJKokkosWithCSRMatrix (or MatCreateSeqAIJKokkosWithCSRMatrix) to be public is to change the PETSC_INTERN to PETSC_EXTERN?
>>>>>
>>>>> For MatSetMPIAIJKokkosWithSplitSeqAIJKokkosMatrices, I believe all that is required is declaring the method in the .hpp, as it's already defined as static in mpiaijkok.kokkos.cxx. In particular, the comments above MatSetMPIAIJKokkosWithSplitSeqAIJKokkosMatrices suggest that the off-diagonal block B needs to be built with global column ids, with mpiaij->garray constructed on the host along with the rewriting of the global column indices in B. This happens in MatSetUpMultiply_MPIAIJ, but checking the code there shows that if you pass in a non-null garray to MatSetMPIAIJKokkosWithSplitSeqAIJKokkosMatrices, the construction and compactification is skipped, meaning B can be built with local column ids as long as garray is provided on the host (which I also build on the device and then just copy to the host). Again, this is what some of the internal Kokkos routines rely on, like the matrix-product.
>>>>>
>>>>> I am happy to try doing this and submitting a merge request to the PETSc GitLab if this seems sensible, I just wanted to double check that I wasn't missing something important?
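>>>>>
>>>>> Concretely, the visibility change I have in mind is just something like this (a rough sketch; the actual declaration in aijkok.hpp may differ):
>>>>>
>>>>>   /* current declaration in the private header, roughly: */
>>>>>   PETSC_INTERN PetscErrorCode MatCreateSeqAIJKokkosWithCSRMatrix(MPI_Comm, Mat_SeqAIJKokkos *, Mat *);
>>>>>   /* proposed: keep it in the same header but export the symbol so user code can link to it */
>>>>>   PETSC_EXTERN PetscErrorCode MatCreateSeqAIJKokkosWithCSRMatrix(MPI_Comm, Mat_SeqAIJKokkos *, Mat *);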
>>>>>
>>>>> Thanks,
>>>>> Steven
>>>>>
>>>>> On Tue, 25 Feb 2025 at 22:16, Junchao Zhang <junchao.zh...@gmail.com> wrote:
>>>>>
>>>>>> Hi, Steven,
>>>>>>
>>>>>> MatSetSeqAIJKokkosWithCSRMatrix(Mat A, Mat_SeqAIJKokkos *akok) uses a private data type, Mat_SeqAIJKokkos, so it cannot be directly made public.
>>>>>>
>>>>>> If you already use COO, then why not directly make the matrix of type MATAIJKOKKOS and call MatSetPreallocationCOO() and MatSetValuesCOO()? So I am confused by your needs.
>>>>>>
>>>>>> Thanks!
>>>>>> --Junchao Zhang
>>>>>>
>>>>>> On Tue, Feb 25, 2025 at 3:39 PM Steven Dargaville <dargaville.ste...@gmail.com> wrote:
>>>>>>
>>>>>>> Hi
>>>>>>>
>>>>>>> I'm just wondering if there is any possibility of making:
>>>>>>>
>>>>>>>   MatSetSeqAIJKokkosWithCSRMatrix or MatCreateSeqAIJKokkosWithCSRMatrix in src/mat/impls/aij/seq/kokkos/aijkok.kokkos.cxx
>>>>>>>   MatSetMPIAIJKokkosWithSplitSeqAIJKokkosMatrices in src/mat/impls/aij/mpi/kokkos/mpiaijkok.kokkos.cxx
>>>>>>>
>>>>>>> publicly accessible outside of PETSc, or if there is an interface I have missed for creating Kokkos matrices entirely on the device? MatCreateSeqAIJKokkosWithCSRMatrix, for example, is marked as PETSC_INTERN so I can't link to it.
>>>>>>>
>>>>>>> I've currently just copied the code inside of those methods so that I can build without any preallocation on the host (e.g., through the COO interface) and it works really well.
>>>>>>>
>>>>>>> Thanks for your help,
>>>>>>> Steven
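>>>>>>>
>>>>>>> P.S. For concreteness, the COO-based construction I mentioned above looks roughly like the following (a sketch with made-up sizes, building a rank-local 2x2 diagonal block; coo_i/coo_j are host arrays, while coo_v may be a device pointer):
>>>>>>>
>>>>>>>   Mat         A;
>>>>>>>   PetscMPIInt rank;
>>>>>>>   PetscInt    coo_i[2], coo_j[2];
>>>>>>>   PetscScalar coo_v[2] = {1.0, 2.0};
>>>>>>>
>>>>>>>   PetscCallMPI(MPI_Comm_rank(PETSC_COMM_WORLD, &rank));
>>>>>>>   coo_i[0] = coo_j[0] = 2 * (PetscInt)rank;      /* global indices of this rank's two rows/cols */
>>>>>>>   coo_i[1] = coo_j[1] = 2 * (PetscInt)rank + 1;
>>>>>>>   PetscCall(MatCreate(PETSC_COMM_WORLD, &A));
>>>>>>>   PetscCall(MatSetSizes(A, 2, 2, PETSC_DECIDE, PETSC_DECIDE));
>>>>>>>   PetscCall(MatSetType(A, MATAIJKOKKOS));
>>>>>>>   PetscCall(MatSetPreallocationCOO(A, 2, coo_i, coo_j));  /* COO setup happens on the host */
>>>>>>>   PetscCall(MatSetValuesCOO(A, coo_v, INSERT_VALUES));    /* value insertion can run on the device */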