PETSc uses Elemental to perform such operations.
PetscErrorCode MatMatMultNumeric_Elemental(Mat A,Mat B,Mat C)
{
  Mat_Elemental   *a = (Mat_Elemental*)A->data;
  Mat_Elemental   *b = (Mat_Elemental*)B->data;
  Mat_Elemental   *c = (Mat_Elemental*)C->data;
  PetscElemScalar  one = 1,zero = 0;

  PetscFunctionBegin;
  { /* Scoping so that constructor is called before pointer is returned */
    El::Gemm(El::NORMAL,El::NORMAL,one,*a->emat,*b->emat,zero,*c->emat);
  }
  C->assembled = PETSC_TRUE;
  PetscFunctionReturn(0);
}

You can consult Elemental's documentation and papers for how it manages the communication.

  Barry

> On Dec 1, 2021, at 8:33 AM, Hannes Brandt <hannes_bra...@gmx.de> wrote:
>
> Hello,
>
> I am interested in the communication scheme Petsc uses for the
> multiplication of dense, parallel distributed matrices in MatMatMult.
> Is it based on collective communication or on single calls to
> MPI_Send/Recv, and is it done in a blocking or a non-blocking way?
> How do you make sure that the processes do not receive/buffer too
> much data at the same time?
>
> Best Regards,
>
> Hannes Brandt
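P.S. To give a flavor of what El::Gemm does internally: distributed dense GEMM in Elemental is, broadly speaking, built from SUMMA-style algorithms that form the product as a sequence of outer-product updates, broadcasting the needed panel of A along process rows and the needed panel of B along process columns at each step (collective communication, not point-to-point sends). Below is a serial sketch of that outer-product loop structure only — plain C for illustration, not PETSc or Elemental code; the function name is made up:

  #include <assert.h>

  /* Row-major C += A*B, organized as rank-1 (outer-product) updates.
     A is m x k, B is k x n, C is m x n.  In a SUMMA-style parallel
     GEMM, each iteration of the p loop is where the p-th column panel
     of A is broadcast along process rows and the p-th row panel of B
     along process columns; the innermost loops are the local update. */
  static void gemm_outer_product(int m, int n, int k,
                                 const double *A, const double *B, double *C)
  {
    for (int p = 0; p < k; ++p)
      for (int i = 0; i < m; ++i)
        for (int j = 0; j < n; ++j)
          C[i*n + j] += A[i*k + p] * B[p*n + j];
  }

Real implementations block the p loop into panels of width > 1 to amortize communication and use BLAS for the local update; the communication pattern, though, is the collective broadcast shown in the comment.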