Gilbert Grosdidier wrote:
Any other suggestions?
Can any more information be extracted from profiling?  Here is where I think things left off:

Eugene Loh wrote:
Gilbert Grosdidier wrote:
#                             [time]       [calls]        <%mpi>      <%wall>
# MPI_Waitall                 741683   7.91081e+07         77.96        21.58
# MPI_Allreduce               114057   2.53665e+07         11.99         3.32
# MPI_Isend                  27420.6   6.53513e+08          2.88         0.80
# MPI_Irecv                  464.616   6.53513e+08          0.05         0.01
###############################################################################


It seems to my non-expert eye that MPI_Waitall is dominant among MPI calls,
but not for the overall application.
Looks like on average each MPI_Waitall call is completing 8+ MPI_Isend calls and 8+ MPI_Irecv calls (6.53513e+08 / 7.91081e+07 is roughly 8.3).  I think IPM gives some point-to-point messaging information.  Maybe you can tell what the distribution of message sizes is, or maybe you already know the characteristic pattern.  Does a stand-alone message-passing test (without the computational portion) capture the performance problem you're looking for?
Do you know message lengths and patterns?  Can you confirm whether non-MPI time is the same between good and bad runs?
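
As a rough illustration of the stand-alone test idea, here is a minimal sketch in C, assuming a nearest-neighbour exchange in which each rank posts about 8 MPI_Irecv/MPI_Isend pairs and then one MPI_Waitall (which is what the call counts above suggest).  The neighbour selection, message size (MSG_BYTES), and iteration count (NITER) are placeholders, not values taken from the actual application.

/* Minimal non-blocking exchange microbenchmark: Irecv/Isend + Waitall.
 * Assumptions: ~8 partners per rank, fixed message size, no computation.
 * Run with at least NNEIGH+1 ranks so the partners are distinct. */
#include <mpi.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#define NNEIGH    8        /* assumed number of neighbours per rank */
#define MSG_BYTES 8192     /* placeholder message size              */
#define NITER     1000     /* number of timed exchange iterations   */

int main(int argc, char **argv)
{
    int rank, size;
    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);

    char *sbuf = malloc((size_t)NNEIGH * MSG_BYTES);
    char *rbuf = malloc((size_t)NNEIGH * MSG_BYTES);
    memset(sbuf, 0, (size_t)NNEIGH * MSG_BYTES);
    MPI_Request req[2 * NNEIGH];

    /* Pick partners at offsets +/-1, +/-2, ... (mod size).  This is only a
       stand-in for the application's real communication graph. */
    int neigh[NNEIGH];
    for (int i = 0; i < NNEIGH; i++) {
        int off = (i / 2 + 1) * ((i % 2) ? -1 : 1);
        neigh[i] = ((rank + off) % size + size) % size;
    }

    MPI_Barrier(MPI_COMM_WORLD);
    double t0 = MPI_Wtime();

    for (int it = 0; it < NITER; it++) {
        /* Post all receives and sends, then complete them in one Waitall,
           mirroring the 2*NNEIGH requests per Waitall seen in the profile. */
        for (int i = 0; i < NNEIGH; i++) {
            MPI_Irecv(rbuf + (size_t)i * MSG_BYTES, MSG_BYTES, MPI_BYTE,
                      neigh[i], 0, MPI_COMM_WORLD, &req[i]);
            MPI_Isend(sbuf + (size_t)i * MSG_BYTES, MSG_BYTES, MPI_BYTE,
                      neigh[i], 0, MPI_COMM_WORLD, &req[NNEIGH + i]);
        }
        MPI_Waitall(2 * NNEIGH, req, MPI_STATUSES_IGNORE);
    }

    double t1 = MPI_Wtime();
    if (rank == 0)
        printf("%d iterations of %d-neighbour exchange, %d bytes each: %.3f s\n",
               NITER, NNEIGH, MSG_BYTES, t1 - t0);

    free(sbuf);
    free(rbuf);
    MPI_Finalize();
    return 0;
}

Sweeping MSG_BYTES over the application's actual message sizes (if IPM reports them) would show whether the MPI_Waitall time is reproduced by communication alone or only appears when the computation is present.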
