Thanks Hong, that sounds great. I am weary of silent optimizations like you suggest but 2x is big! and failure is very bad. So I would vote for your suggestion.
ex56 is an elasticity problem. It would be nice, now that you have this experimental setup in place, to compare with hypre on a 3D scalar problem. Hypre might not spend much effort optimizing for block matrices. 3x better than hypre seems large to me. I have to suspect that hypre does not exploit blocked matrices as much as we do. Thanks again, On Wed, Feb 8, 2017 at 12:07 PM, Hong <[email protected]> wrote: > I conducted tests on MatMatMult() and MatPtAP() using > petsc/src/ksp/ksp/examples/tutorials/ex56.c > (gamg) on a 8-core machine (petsc machine). The output file is attached. > > Summary: > 1) non-scalable MatMatMult() for mpiaij format is 2x faster than scalable > version. The major difference between the two is dense-axpy vs. sparse-axpy. > > Currently, we set non-scalable as default, which leads to problem when > running large problems. > How about setting default as > - non-scalable for small to medium size matrices > - scalable for larger ones, e.g. > > + ierr = PetscOptionsEList("-matmatmult_via","Algorithmic > approach","MatMatMult",algTypes,nalg,algTypes[1],&alg,&flg); > > + if (!flg) { /* set default algorithm based on B->cmap->N */ > + PetscMPIInt size; > + ierr = MPI_Comm_size(comm,&size);CHKERRQ(ierr); > + if ((PetscReal)(B->cmap->N)/size > 100000.0) alg = 0; /* scalable > algorithm */ > + } > > i.e., if user does NOT pick an algorithm, when ave cols per process > > 100k, use scalable implementation; otherwise, non-scalable version. > > 2) We do NOT have scalable implementation for MatPtAP() yet. > We have non-scalable PtAP and interface to Hypre's PtAP. Comparing the > two, > Petsc MatPtAP() is approx 3x faster than Hypre's. > > I'm writing a scalable MatPtAP() now. > > Hong > > On Thu, Feb 2, 2017 at 2:54 PM, Stefano Zampini <[email protected] > > wrote: > >> >> >> Il 02 Feb 2017 23:43, "Mark Adams" <[email protected]> ha scritto: >> >> >> >> On Thu, Feb 2, 2017 at 12:02 PM, Stefano Zampini < >> [email protected]> wrote: >> >>> Mark, >>> >>> I saw your configuration has hypre. If you could run with master, you >>> may try -matptap_via hypre. >>> >> >> This is worth trying. Does this even work with GAMG? >> >> >> Yes, it should work, except that the block sizes, if any, are not >> propagated to the resulting matrix. I can add it if you need it. >> >> >> >> Treb: try hypre anyway. It has its own RAP code. >> >> >> >> With that option, you will use hypre's RAP with MATAIJ >> >> >> It uses BoomerAMGBuildCoarseOperator directly with the AIJ matrices. >>> >>> Stefano >>> >>> On Feb 2, 2017, at 7:28 PM, Mark Adams <[email protected]> wrote: >>> >>> >>> >>> On Thu, Feb 2, 2017 at 11:13 AM, Hong <[email protected]> wrote: >>> >>>> Mark: >>>> Try '-matmatmult_via scalable' first. If this works, should we set it >>>> as default? >>>> >>> >>> If it is robust I would say yes unless it is noticeably slower (say >>> >20%) small scale problems. >>> >>> >>> >> >> >
