Ashley Pittman wrote:
> Do you have a stack trace of your hung application to hand? In
> particular, when you say "All processes have made the same call to
> MPI_Allreduce. The processes are all in opal_progress, called (with
> intervening calls) by MPI_Allreduce.", do the intervening calls include
> mca_coll_sync_bcast, ompi_coll_tuned_barrier_intra_dec_fixed and
> ompi_coll_tuned_barrier_intra_recursivedoubling?

I don't have a stack trace handy, and today is pretty full. I'll try
to make some time to document what I've got in the next few days. I
was also able to hang a C translation of Ralph's reproducer.
- Bryan
--
Bryan Lally, la...@lanl.gov
505.667.9954
CCS-2
Los Alamos National Laboratory
Los Alamos, New Mexico