Hi Chang,

For the mumps solver, we usually transfer matrix and vector data within a compute node. For the idea you propose, it looks like we would need to gather data within MPI_COMM_WORLD, right?
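For reference, a minimal sketch (assuming a recent PETSc; the helper name SolveWithMumps and the objects A, b, x are placeholders) of how this MUMPS path is typically selected. The node-level gather and the threaded factorization are then requested at run time with -mat_mumps_use_omp_threads <n> together with OMP_NUM_THREADS:

  /* Hypothetical helper: select an LU direct solve through MUMPS.
     A, b, x are an existing parallel matrix and vectors; error handling
     is abbreviated and assumes a PETSc version that provides PetscCall(). */
  #include <petscksp.h>

  static PetscErrorCode SolveWithMumps(Mat A, Vec b, Vec x)
  {
    KSP ksp;
    PC  pc;

    PetscFunctionBeginUser;
    PetscCall(KSPCreate(PETSC_COMM_WORLD, &ksp));
    PetscCall(KSPSetOperators(ksp, A, A));
    PetscCall(KSPSetType(ksp, KSPPREONLY));                  /* direct solve only */
    PetscCall(KSPGetPC(ksp, &pc));
    PetscCall(PCSetType(pc, PCLU));
    PetscCall(PCFactorSetMatSolverType(pc, MATSOLVERMUMPS));
    PetscCall(KSPSetFromOptions(ksp));  /* picks up -mat_mumps_use_omp_threads */
    PetscCall(KSPSolve(ksp, b, x));
    PetscCall(KSPDestroy(&ksp));
    PetscFunctionReturn(PETSC_SUCCESS);
  }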
Mark, I remember you said the cusparse solve is slow and you would rather do it on the CPU. Is that right?

--Junchao Zhang

On Mon, Oct 11, 2021 at 10:25 PM Chang Liu via petsc-users <petsc-users@mcs.anl.gov> wrote:

> Hi,
>
> Currently, it is possible to use the mumps solver in PETSc with the
> -mat_mumps_use_omp_threads option, so that multiple MPI processes will
> transfer the matrix and rhs data to the master rank, and then the master
> rank will call mumps with OpenMP to solve the matrix.
>
> I wonder if someone can develop a similar option for the cusparse solver.
> Right now, this solver does not work with mpiaijcusparse. I think a
> possible workaround is to transfer all the matrix data to one MPI
> process, and then upload the data to the GPU to solve. In this way, one
> can use the cusparse solver for an MPI program.
>
> Chang
> --
> Chang Liu
> Staff Research Physicist
> +1 609 243 3438
> c...@pppl.gov
> Princeton Plasma Physics Laboratory
> 100 Stellarator Rd, Princeton NJ 08540, USA
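As an illustration of the workaround Chang describes (this is not an existing PETSc option), the sketch below gathers an MPIAIJ system onto rank 0 and solves it there with the sequential cusparse factorization. The function name SolveOnRankZero and the choice of MatCreateSubMatrices / VecScatterCreateToZero for the gather are my assumptions, error handling is minimal, and GPU-to-rank binding is ignored:

  /* Hypothetical sketch: gather an MPIAIJ system onto rank 0 and solve it
     there with the sequential cusparse LU.  Assumes a recent PETSc built
     with CUDA support. */
  #include <petscksp.h>

  PetscErrorCode SolveOnRankZero(Mat A, Vec b, Vec x)
  {
    MPI_Comm    comm;
    PetscMPIInt rank;
    PetscInt    M, N, nsub;
    IS          rowis, colis;
    Mat        *Aseq;
    VecScatter  scat;
    Vec         bseq, xseq;

    PetscFunctionBeginUser;
    PetscCall(PetscObjectGetComm((PetscObject)A, &comm));
    PetscCallMPI(MPI_Comm_rank(comm, &rank));
    PetscCall(MatGetSize(A, &M, &N));
    nsub = (rank == 0) ? 1 : 0;

    /* Rank 0 requests all rows/columns, the other ranks request nothing,
       so the whole matrix ends up as a SeqAIJ matrix on rank 0 */
    PetscCall(ISCreateStride(PETSC_COMM_SELF, rank ? 0 : M, 0, 1, &rowis));
    PetscCall(ISCreateStride(PETSC_COMM_SELF, rank ? 0 : N, 0, 1, &colis));
    PetscCall(MatCreateSubMatrices(A, nsub, &rowis, &colis, MAT_INITIAL_MATRIX, &Aseq));

    /* Gather the rhs onto rank 0; bseq/xseq have length 0 on the other ranks */
    PetscCall(VecScatterCreateToZero(b, &scat, &bseq));
    PetscCall(VecScatterBegin(scat, b, bseq, INSERT_VALUES, SCATTER_FORWARD));
    PetscCall(VecScatterEnd(scat, b, bseq, INSERT_VALUES, SCATTER_FORWARD));
    PetscCall(VecDuplicate(bseq, &xseq));

    if (rank == 0) {
      KSP ksp;
      PC  pc;
      /* Push the gathered matrix into the GPU format and factor with cusparse */
      PetscCall(MatConvert(Aseq[0], MATSEQAIJCUSPARSE, MAT_INPLACE_MATRIX, &Aseq[0]));
      PetscCall(KSPCreate(PETSC_COMM_SELF, &ksp));
      PetscCall(KSPSetOperators(ksp, Aseq[0], Aseq[0]));
      PetscCall(KSPSetType(ksp, KSPPREONLY));
      PetscCall(KSPGetPC(ksp, &pc));
      PetscCall(PCSetType(pc, PCLU));
      PetscCall(PCFactorSetMatSolverType(pc, MATSOLVERCUSPARSE));
      PetscCall(KSPSolve(ksp, bseq, xseq));
      PetscCall(KSPDestroy(&ksp));
    }

    /* Send the solution back to the original parallel layout */
    PetscCall(VecScatterBegin(scat, xseq, x, INSERT_VALUES, SCATTER_REVERSE));
    PetscCall(VecScatterEnd(scat, xseq, x, INSERT_VALUES, SCATTER_REVERSE));

    PetscCall(VecDestroy(&xseq));
    PetscCall(VecDestroy(&bseq));
    PetscCall(VecScatterDestroy(&scat));
    PetscCall(MatDestroySubMatrices(nsub, &Aseq));
    PetscCall(ISDestroy(&rowis));
    PetscCall(ISDestroy(&colis));
    PetscFunctionReturn(PETSC_SUCCESS);
  }

One could, of course, gather to one rank per compute node instead of a single global rank, which would be closer in spirit to what -mat_mumps_use_omp_threads does.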