Hong <hzh...@mcs.anl.gov> writes:
> I plan to implement MatTransposeMatMult_MPIDense_MPIDense via
>
> 1. add MatTransposeMatMult_elemental_elemental()
> 2. C_dense = P_dense^T * B_dense
>    via MatConvert_dense_elemental() and MatConvert_elemental_dense()

The above involves a ton of data movement, and MPIDense is a logical
distribution for matrices with a modest number of columns. I think I
would just do the local GEMM and then MPI_Reduce_scatter it.
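
To make the local-GEMM-plus-reduce-scatter idea concrete, here is a rough serial sketch in C. It is not PETSc code: the function names local_transpose_gemm and reduce_scatter_sum are hypothetical, the "ranks" are just consecutive slices of one array, and I assume row-major storage with P and B sharing the same row distribution so that each rank's block of C is contiguous. In a real run the second function would be replaced by a single MPI_Reduce_scatter call with MPI_SUM, and the triple loop by a BLAS GEMM.

```c
#include <assert.h>
#include <stddef.h>

/* Partial product on one rank: W (n x k) = Ploc^T * Bloc.
   Ploc is mloc x n and Bloc is mloc x k, both row-major (an assumption
   of this sketch). W is full-size: every rank holds an n x k partial sum. */
void local_transpose_gemm(const double *Ploc, const double *Bloc, double *W,
                          size_t mloc, size_t n, size_t k)
{
    for (size_t i = 0; i < n * k; i++) W[i] = 0.0;
    for (size_t r = 0; r < mloc; r++)
        for (size_t i = 0; i < n; i++)
            for (size_t j = 0; j < k; j++)
                W[i * k + j] += Ploc[r * n + i] * Bloc[r * k + j];
}

/* Serial stand-in for MPI_Reduce_scatter with MPI_SUM.
   Wall holds the nranks partial products back to back, len entries each.
   The caller's block of C is the `count` entries starting at `offset`,
   summed over all ranks. */
void reduce_scatter_sum(const double *Wall, size_t nranks, size_t len,
                        size_t offset, size_t count, double *Cloc)
{
    assert(offset + count <= len);
    for (size_t i = 0; i < count; i++) {
        Cloc[i] = 0.0;
        for (size_t p = 0; p < nranks; p++)
            Cloc[i] += Wall[p * len + offset + i];
    }
}
```

Each rank only ever touches its own rows of P and B, and the one communication step moves n*k values per rank, which is the reason this looks cheaper than converting both operands to another distribution and back when the number of columns is modest.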