Hi Pierre,

> I have a cluster with nodes of 2 sockets of 4 cores+1 GPU.

> Is there a way to run a calculation with 4*N MPI tasks where
> my matrix is first built outside PETSc, then to solve the
> linear system using PETSc Mat, Vec, KSP on only N MPI
> tasks to address efficiently the N GPUs?

As far as I can tell, this should be possible with a suitable subcommunicator. The tricky part, however, is selecting the right MPI ranks for it. Note that you generally have no guarantee on how the MPI ranks are distributed across the nodes, since the mapping depends on your MPI implementation and launcher settings, so be prepared for something fairly specific to your MPI installation.
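
A minimal sketch of the rank selection, written in plain C so that it does not depend on any particular MPI installation. The helper names is_solver_rank() and gpu_index() are made up for illustration, and the figure of 4 ranks per GPU is taken from your 4*N tasks / N GPUs description. The comments indicate where I would expect the MPI calls to go; treat this as an assumption-laden starting point, not a tested recipe for your cluster.

```c
/* Sketch: decide which ranks take part in the PETSc solve.
 *
 * local_rank is the rank of the process *within its node*. With MPI-3 it
 * can be obtained without assuming anything about rank placement:
 *   MPI_Comm_split_type(MPI_COMM_WORLD, MPI_COMM_TYPE_SHARED, 0,
 *                       MPI_INFO_NULL, &node_comm);
 *   MPI_Comm_rank(node_comm, &local_rank);
 */

/* Hypothetical helper: 1 if this rank should drive a GPU, 0 otherwise.
 * One rank out of every ranks_per_gpu consecutive local ranks is chosen. */
int is_solver_rank(int local_rank, int ranks_per_gpu)
{
    return (local_rank % ranks_per_gpu) == 0;
}

/* Hypothetical helper: index of the GPU on the node that this group of
 * ranks is associated with. */
int gpu_index(int local_rank, int ranks_per_gpu)
{
    return local_rank / ranks_per_gpu;
}

/* The result would then be used as the color for MPI_Comm_split():
 *   color = is_solver_rank(local_rank, 4) ? 0 : MPI_UNDEFINED;
 *   MPI_Comm_split(MPI_COMM_WORLD, color, world_rank, &solver_comm);
 * Ranks passing MPI_UNDEFINED receive MPI_COMM_NULL and stay out of
 * the solve. */
```

With 4 ranks per GPU, local ranks 0 and 4 on each node would end up in the solver communicator, and local ranks 1-3 and 5-7 would not.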


> I am playing with the communicators without success, but I
> am surely confusing things...

To keep matters simple, first try to get this scenario working with a purely CPU-based solve. Once that works, switching to GPUs should be just a matter of passing the right flags. Have a look at the PetscInitialize() manual page:
http://www.mcs.anl.gov/petsc/petsc-current/docs/manualpages/Sys/PetscInitialize.html
which mentions that you need to create the subcommunicator of MPI_COMM_WORLD first and assign it to PETSC_COMM_WORLD before calling PetscInitialize().

Best regards,
Karli
