Jed:
Impressive! Do you ever sleep?

Hong

> I added proper preallocation for MatTranspose_MPIAIJ(), which speeds it up
> greatly.
>
> https://bitbucket.org/petsc/petsc-dev/changeset/486d000050ec62fbd732c0049cb5f09b2b5709b8
> https://bitbucket.org/petsc/petsc-dev/changeset/75fca7ed1efa754ca010596a8ba69319501baf52 (oops)
>
> Testing on cg
> $ mpirun -n 64 ./ex56 -pc_type gamg -ksp_monitor -ksp_rtol 1e-1 -log_summary -mattransposematmult_viamatmatmult 1
>
> *Before:*
> -ne 99
> MatTranspose   3 1.0 *1.3230e+00* 1.0 0.00e+00 0.0 1.0e+04 2.7e+03 5.1e+01 17 0 3 2 4 33 0 6 7 4 0
> MatTrnMatMult  3 1.0 1.8360e+00 1.0 2.26e+07 1.1 2.3e+04 6.0e+03 1.2e+02 24 2 6 12 9 46 10 13 35 10 765
> -ne 119
> MatTranspose   3 1.0 *2.3402e+00* 1.0 0.00e+00 0.0 1.3e+04 3.1e+03 5.1e+01 16 0 3 2 4 34 0 6 7 4 0
> MatTrnMatMult  3 1.0 3.2240e+00 1.0 3.91e+07 1.1 2.8e+04 6.9e+03 1.2e+02 23 2 6 12 9 46 10 13 35 10 759
>
> *After:*
> -ne 99
> MatTranspose   3 1.0 *9.5813e-02* 1.0 0.00e+00 0.0 1.0e+04 2.7e+03 4.8e+01 1 0 3 2 4 3 0 6 7 4 0
> MatTrnMatMult  3 1.0 6.0673e-01 1.0 2.26e+07 1.1 2.3e+04 6.0e+03 1.2e+02 8 2 6 12 9 21 10 13 35 10 2316
> -ne 119
> MatTranspose   3 1.0 *1.8572e-01* 1.0 0.00e+00 0.0 1.3e+04 3.1e+03 4.8e+01 2 0 3 2 4 4 0 6 7 4 0
> MatTrnMatMult  3 1.0 1.0656e+00 1.0 3.91e+07 1.1 2.8e+04 6.9e+03 1.2e+02 10 2 6 12 9 23 10 13 35 10 2297
>
> *Reference* (-mattransposematmult_viamatmatmult 0):
> -ne 99
> MatTrnMatMult  3 1.0 8.0196e-01 1.0 1.02e+08 1.1 1.3e+04 1.3e+04 8.7e+01 13 10 4 15 7 28 33 8 40 7 7831
> -ne 119
> MatTrnMatMult  3 1.0 1.3759e+00 1.0 1.78e+08 1.1 1.6e+04 1.6e+04 8.7e+01 12 10 4 15 7 27 33 8 40 8 7999
>
> I don't know why the reference implementation claims to have done so many
> more flops.
>
> This indicates that perhaps it makes sense for MatPtAP to do an explicit
> transpose and then RAP. Unless we can find a fast data structure for A^T *
> B.
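
For a quick sense of scale, the timings quoted above can be turned into speedup factors. The following is only a minimal Python sketch: the dictionary layout and the `speedup` helper are illustrative names, not part of PETSc, and the numbers are the max-time column copied from the -log_summary lines in the quoted message.

```python
# Max time in seconds per event, copied from the quoted -log_summary output.
# Keys are (event name, -ne value); values are (before, after) the
# preallocation change, both with -mattransposematmult_viamatmatmult 1.
timings = {
    ("MatTranspose", 99): (1.3230e+00, 9.5813e-02),
    ("MatTranspose", 119): (2.3402e+00, 1.8572e-01),
    ("MatTrnMatMult", 99): (1.8360e+00, 6.0673e-01),
    ("MatTrnMatMult", 119): (3.2240e+00, 1.0656e+00),
}

# MatTrnMatMult time of the reference path
# (-mattransposematmult_viamatmatmult 0), same runs.
reference = {99: 8.0196e-01, 119: 1.3759e+00}


def speedup(before, after):
    """Return how many times faster `after` is than `before`."""
    return before / after


speedups = {key: speedup(*pair) for key, pair in timings.items()}

for (event, ne), factor in speedups.items():
    print(f"{event:13s} -ne {ne:3d}: {factor:5.1f}x faster than before")

# Compare the new transpose-then-multiply path against the reference path.
vs_reference = {
    ne: speedup(ref_time, timings[("MatTrnMatMult", ne)][1])
    for ne, ref_time in reference.items()
}

for ne, factor in vs_reference.items():
    print(f"MatTrnMatMult -ne {ne:3d}: {factor:5.2f}x faster than reference")
```

By this arithmetic, MatTranspose itself gets more than ten times faster in both runs, MatTrnMatMult roughly three times faster, and the explicit-transpose path ends up ahead of the reference implementation in both cases.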
