On 08/20/14 12:11, Karl Rupp wrote:
> Hi Pierre,
>> I have a cluster with nodes of 2 sockets of 4 cores + 1 GPU.
>> Is there a way to run a calculation with 4*N MPI tasks where
>> my matrix is first built outside PETSc, and then to solve the
>> linear system using PETSc Mat, Vec, KSP on only N MPI
>> tasks, to address the N GPUs efficiently?
> As far as I can tell, this should be possible with a suitable
> subcommunicator. The tricky piece, however, is to select the right MPI
> ranks for this. Note that you generally have no guarantee on how the
> MPI ranks are distributed across the nodes, so be prepared for
> something fairly specific to your MPI installation.
Yes, I am ready to face this point too.
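For reference, this is the kind of node-aware rank selection I have in
mind (a minimal sketch, not tested: it assumes an MPI-3 library so that
MPI_Comm_split_type() is available, and 1 GPU per node as on our
cluster; the selection rule "first rank on each node" is just an
illustration):

/* Minimal sketch (untested): group the ranks by shared-memory node
 * with MPI-3's MPI_Comm_split_type(), then keep the first
 * GPUS_PER_NODE ranks of each node for the GPU subcommunicator.
 * Error checking omitted for brevity. */
#include <mpi.h>

#define GPUS_PER_NODE 1

int main(int argc, char **argv)
{
  MPI_Comm node_comm, gpu_comm;
  int      world_rank, node_rank, color;

  MPI_Init(&argc, &argv);
  MPI_Comm_rank(MPI_COMM_WORLD, &world_rank);

  /* All ranks sharing a node end up in the same node_comm. */
  MPI_Comm_split_type(MPI_COMM_WORLD, MPI_COMM_TYPE_SHARED, 0,
                      MPI_INFO_NULL, &node_comm);
  MPI_Comm_rank(node_comm, &node_rank);

  /* Keep one rank per node; the other ranks get MPI_COMM_NULL. */
  color = (node_rank < GPUS_PER_NODE) ? 0 : MPI_UNDEFINED;
  MPI_Comm_split(MPI_COMM_WORLD, color, world_rank, &gpu_comm);

  /* ... gpu_comm now contains N ranks, one per GPU ... */

  if (gpu_comm != MPI_COMM_NULL) MPI_Comm_free(&gpu_comm);
  MPI_Comm_free(&node_comm);
  MPI_Finalize();
  return 0;
}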
>> I am playing with the communicators without success, but I
>> am surely confusing things...
> To keep matters simple, try to get this scenario working with a purely
> CPU-based solve. Once this works, the switch to GPUs should be just a
> matter of passing the right flags. Have a look at PetscInitialize() here:
> http://www.mcs.anl.gov/petsc/petsc-current/docs/manualpages/Sys/PetscInitialize.html
> which mentions that you need to create the subcommunicator of
> MPI_COMM_WORLD first.
I also started with a purely CPU-based solve, just as a test, but
without success. When I read this:
"If you wish PETSc code to run ONLY on a subcommunicator of
MPI_COMM_WORLD, create that communicator first and assign it to
PETSC_COMM_WORLD
<http://www.mcs.anl.gov/petsc/petsc-current/docs/manualpages/Sys/PETSC_COMM_WORLD.html#PETSC_COMM_WORLD>
BEFORE calling PetscInitialize
<http://www.mcs.anl.gov/petsc/petsc-current/docs/manualpages/Sys/PetscInitialize.html#PetscInitialize>().
Thus if you are running a four process job and two processes will run
PETSc and have PetscInitialize
<http://www.mcs.anl.gov/petsc/petsc-current/docs/manualpages/Sys/PetscInitialize.html#PetscInitialize>()
and PetscFinalize
<http://www.mcs.anl.gov/petsc/petsc-current/docs/manualpages/Sys/PetscFinalize.html#PetscFinalize>()
and two process will not, then do this. If ALL processes in
the job are using PetscInitialize
<http://www.mcs.anl.gov/petsc/petsc-current/docs/manualpages/Sys/PetscInitialize.html#PetscInitialize>()
and PetscFinalize
<http://www.mcs.anl.gov/petsc/petsc-current/docs/manualpages/Sys/PetscFinalize.html#PetscFinalize>()
then you don't need to do this, even if different subcommunicators of
the job are doing different things with PETSc."
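In code form, I understand that pattern as follows (a minimal sketch,
not tested; "even ranks run PETSc, odd ranks do not" is just an
arbitrary selection rule for illustration):

/* Sketch of the pattern described above: build the subcommunicator
 * first, assign it to PETSC_COMM_WORLD, then call PetscInitialize()
 * only on the ranks that belong to it. Error checking omitted. */
#include <petscsys.h>

int main(int argc, char **argv)
{
  MPI_Comm subcomm;
  int      rank, color;

  MPI_Init(&argc, &argv);
  MPI_Comm_rank(MPI_COMM_WORLD, &rank);

  /* Arbitrary selection rule: even ranks run PETSc. */
  color = (rank % 2 == 0) ? 0 : MPI_UNDEFINED;
  MPI_Comm_split(MPI_COMM_WORLD, color, rank, &subcomm);

  if (subcomm != MPI_COMM_NULL) {
    PETSC_COMM_WORLD = subcomm;  /* must happen BEFORE PetscInitialize() */
    PetscInitialize(&argc, &argv, NULL, NULL);
    /* ... Mat/Vec/KSP work on PETSC_COMM_WORLD ... */
    PetscFinalize();
    MPI_Comm_free(&subcomm);
  }

  MPI_Finalize();
  return 0;
}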
I think I am not in this special scenario: because my matrix is
initially partitioned across 4 processes, I need to call
PetscInitialize() on all 4 processes in order to build the PETSc matrix
with MatSetValues(), and my goal is then to solve the linear system on
only 2 processes... So will building a sub-communicator really do the
trick? Or am I missing something?
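If it helps to make the question concrete, this is the variant I have
in mind (a rough sketch, not tested: all 4 ranks call PetscInitialize(),
ranks 0 and 1 form the solver subcommunicator, and ranks 2 and 3 would
ship their entries over with plain MPI before assembly; the
entry-shipping part is only indicated by comments, and N is a
placeholder size):

/* Rough sketch (untested): all ranks initialize PETSc, but the Mat
 * and the solve live on a 2-rank subcommunicator. Ranks outside the
 * subcommunicator cannot call MatSetValues() on that Mat, so they
 * would have to send their (row, col, value) triplets to a partner
 * rank first. Error checking omitted for brevity. */
#include <petscmat.h>

int main(int argc, char **argv)
{
  const PetscInt N = 1000;       /* placeholder global size */
  PetscMPIInt    rank;
  MPI_Comm       subcomm;
  int            in_solver;

  PetscInitialize(&argc, &argv, NULL, NULL);  /* all 4 ranks */
  MPI_Comm_rank(PETSC_COMM_WORLD, &rank);

  /* Ranks 0 and 1 form the solver subcommunicator. */
  in_solver = (rank < 2);
  MPI_Comm_split(PETSC_COMM_WORLD, in_solver ? 0 : MPI_UNDEFINED,
                 rank, &subcomm);

  if (!in_solver) {
    /* Send locally built (row, col, value) triplets to rank - 2,
       e.g. with MPI_Send() on PETSC_COMM_WORLD. */
  } else {
    Mat A;
    MatCreate(subcomm, &A);
    MatSetSizes(A, PETSC_DECIDE, PETSC_DECIDE, N, N);
    MatSetFromOptions(A);
    MatSetUp(A);
    /* Insert own entries, then receive the partner's triplets
       and insert them too, all via MatSetValues(). */
    MatAssemblyBegin(A, MAT_FINAL_ASSEMBLY);
    MatAssemblyEnd(A, MAT_FINAL_ASSEMBLY);
    /* ... create the KSP on subcomm and solve ... */
    MatDestroy(&A);
    MPI_Comm_free(&subcomm);
  }

  PetscFinalize();
  return 0;
}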
Thanks Karli for your answer,
Pierre
> Best regards,
> Karli
--
*Trio_U support team*
Marthe ROUX (01 69 08 00 02) Saclay
Pierre LEDAC (04 38 78 91 49) Grenoble