I added a workaround to avoid collective calls in VecDuplicate. With 216 cores, it improved the code's performance by 5%. I can imagine more improvement with more cores. But more importantly, the "Reduction count" of MatSOR in -log_view is zero now instead of some misleading number.
BTW, it does not solve the imbalance problem of MatSOR which I am debugging. --Junchao Zhang On Tue, Jun 19, 2018 at 10:58 PM, Smith, Barry F. <[email protected]> wrote: > > Ahh yes. Since the layout can get large we wanted to avoid each vector > having its own copy. > > But it seems sloppy since DMCreateGlobalVector_DA() does keep a new > copy for each vector. > > We could revisit the entire VecDuplicate/DMCreateGlobalVector() > infrastructure for all the vector types including the DM ones to > > 1) prevent collective calls > 2) reference the layout instead of making new ones. > > > Barry > > It looks like VecDuplicate_MPI() does the right thing. > > > > On Jun 19, 2018, at 10:35 PM, Jed Brown <[email protected]> wrote: > > > > "Smith, Barry F." <[email protected]> writes: > > > >> Jed, > >> > >> You added these two lines. Any idea why they are needed? > > > > That was just replacing the old interface > > > > https://bitbucket.org/petsc/petsc/commits/077aedaf8cb5b060ad5dab7a12035d > af9d64dd75#Lsrc/dm/impls/da/dadist.cF19 > > > > that you added here > > > > https://bitbucket.org/petsc/petsc/commits/5ead1ec26b293130cbc6f8fa84f0a9 > ffce38a0bf > > > >> Thanks > >> > >> Barry > >> > >> 3e08d2bebf99 src/dm/impls/da/dadist.c (Barry Smith 2012-09-18 > 14) PetscFunctionBegin; > >> c688c0463f6c src/dm/impls/da/dadist.c (Matthew Knepley 2012-11-11 > 15) ierr = VecGetDM(g, &da);CHKERRQ(ierr); > >> 2dcb2ebcaafe src/dm/impls/da/dadist.c (Barry Smith 2010-10-31 > 16) ierr = DMCreateGlobalVector(da,gg);CHKERRQ(ierr); > >> 077aedaf8cb5 src/dm/impls/da/dadist.c (Jed Brown 2013-02-11 > 17) ierr = VecGetLayout(g,&map);CHKERRQ(ierr); > >> 077aedaf8cb5 src/dm/impls/da/dadist.c (Jed Brown 2013-02-11 > 18) ierr = VecSetLayout(*gg,map);CHKERRQ(ierr); > >> 2dcb2ebcaafe src/dm/impls/da/dadist.c (Barry Smith 2010-10-31 > 19) PetscFunctionReturn(0) > >> > >> > >> If we were a bit more clever we could probably avoid all > communication in VecDuplicate() but is it worth bothering unless we know it > is problematic > >> > >>> On Jun 19, 2018, at 4:30 PM, Junchao Zhang <[email protected]> > wrote: > >>> > >>> I met an expensive VecDuplicate. See the attached call stack. > VecDuplicate_MPI_DA indirectly calls MPI_Allreduce/Allgather to build the > vector's layout, but it then immediately destroys it with VecGetLayout & > VecSetLayout. Is it wrong? > >>> src/dm/impls/da/dadist.c > >>> PetscErrorCode VecDuplicate_MPI_DA(Vec g,Vec *gg) > >>> { > >>> PetscErrorCode ierr; > >>> DM da; > >>> PetscLayout map; > >>> > >>> PetscFunctionBegin; > >>> ierr = VecGetDM(g, &da);CHKERRQ(ierr); > >>> ierr = DMCreateGlobalVector(da,gg);CHKERRQ(ierr); > >>> ierr = VecGetLayout(g,&map);CHKERRQ(ierr); > >>> ierr = VecSetLayout(*gg,map);CHKERRQ(ierr); > >>> PetscFunctionReturn(0); > >>> } > >>> > >>> --Junchao Zhang > >>> <VecDup.png> > >
