On Fri, Oct 05, 2007 at 09:43:44AM +0200, Jeff Squyres wrote:
> David --
>
> Gleb and I just actively re-looked at this problem yesterday; we
> think it's related to https://svn.open-mpi.org/trac/ompi/ticket/1015.
> We previously thought this ticket was a different problem, but
> our analysis yesterday shows that it could be a real problem in the
> openib BTL or ob1 PML (kinda think it's the openib btl because it
> doesn't seem to happen on other networks, but who knows...).
>
> Gleb is investigating.

Here is the result of the investigation. The problem is different from
ticket #1015. What we have here is one rank calling isend() of a small
message followed by wait_all() in a loop, while another rank calls
irecv(). The problem is that isend() usually doesn't call
opal_progress() anywhere, and wait_all() doesn't call progress if all
requests are already completed, so messages are never progressed.

We can force opal_progress() to be called by setting
btl_openib_free_list_max to 1000; then wait_all() will call progress
because not every request will be completed immediately by OB1.
Alternatively, we can limit the number of incomplete requests that OB1
can allocate by setting pml_ob1_free_list_max to 1000; then
opal_progress() will be called from free_list_wait() when the maximum
is reached. The second option works much faster for me.
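For illustration, here is a minimal sketch of the pattern described
above (a hypothetical example, not David's attached bcast-hang.c
reproducer): rank 0 isend()s a small message and wait_all()s on it in
a loop while rank 1 posts the matching irecv()s. If every send request
is completed immediately by OB1, wait_all() never enters
opal_progress().

  /* isend-loop.c -- hypothetical sketch of the problematic pattern */
  #include <mpi.h>
  #include <stdio.h>

  #define NITER 100000

  int main(int argc, char **argv)
  {
      int rank, i, buf = 0;
      MPI_Request req;

      MPI_Init(&argc, &argv);
      MPI_Comm_rank(MPI_COMM_WORLD, &rank);

      for (i = 0; i < NITER; i++) {
          if (rank == 0) {
              buf = i;
              /* small (eager) message; request may complete at once */
              MPI_Isend(&buf, 1, MPI_INT, 1, 0, MPI_COMM_WORLD, &req);
              /* if the request is already complete, wait_all() does
                 not need to call opal_progress() */
              MPI_Waitall(1, &req, MPI_STATUSES_IGNORE);
          } else if (rank == 1) {
              MPI_Irecv(&buf, 1, MPI_INT, 0, 0, MPI_COMM_WORLD, &req);
              MPI_Wait(&req, MPI_STATUS_IGNORE);
          }
      }

      if (rank == 0) printf("done\n");
      MPI_Finalize();
      return 0;
  }

With the second workaround it would be run as something like (binary
name made up):

  mpirun --mca btl self,openib --mca pml_ob1_free_list_max 1000 \
      --npernode 1 --np 2 isend-loop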
>
>
> On Oct 5, 2007, at 12:59 AM, David Daniel wrote:
>
> > Hi Folks,
> >
> > I have been seeing some nasty behaviour in collectives,
> > particularly bcast and reduce. Attached is a reproducer (for bcast).
> >
> > The code will rapidly slow to a crawl (usually interpreted as a
> > hang in real applications) and sometimes gets killed with sigbus or
> > sigterm.
> >
> > I see this with
> >
> > openmpi-1.2.3 or openmpi-1.2.4
> > ofed 1.2
> > linux 2.6.19 + patches
> > gcc (GCC) 3.4.5 20051201 (Red Hat 3.4.5-2)
> > 4 socket, dual core opterons
> >
> > run as
> >
> > mpirun --mca btl self,openib --npernode 1 --np 4 bcast-hang
> >
> > To my now uneducated eye it looks as if the root process is rushing
> > ahead and not progressing earlier bcasts.
> >
> > Anyone else seeing similar? Any ideas for workarounds?
> >
> > As a point of reference, mvapich2 0.9.8 works fine.
> >
> > Thanks, David
> >
> >
> > <bcast-hang.c>
> > _______________________________________________
> > devel mailing list
> > de...@open-mpi.org
> > http://www.open-mpi.org/mailman/listinfo.cgi/devel
>
>
> --
> Jeff Squyres
> Cisco Systems
>
> _______________________________________________
> devel mailing list
> de...@open-mpi.org
> http://www.open-mpi.org/mailman/listinfo.cgi/devel

--
Gleb.