I would be happy to help troubleshoot, but I am not enough of a programmer to know how. The hang is reproducible, and -mca btl ^sm is about 15% faster.
If you want to shoot me some instructions off list, I can give it a go. The application that I am working with, primarily, is ABySS: http://www.bcgsc.ca/platform/bioinfo/software/abyss

Matt

On Dec 15, 2009, at 11:55 AM, Eugene Loh wrote:

> Matthew MacManes wrote:
>
>> On my system, mpirun -np 8 -mca btl_sm_num_fifos 7 is much slower (and
>> appeared to hang after several thousand iterations) than -mca btl ^sm
>>
> If the hang is reproducible, we should perhaps have a look. Also, the fact
> that it's much slower is interesting. Can you characterize the message
> pattern? Increasing the number of FIFOs means that there are more places to
> look to find messages, but this should make a difference mainly only for very
> large on-node process counts (more than 8 I would have thought) and very
> latency-sensitive applications (but perhaps that's what you have).
>
>> Is there another better way I should be modifying fifos to get better
>> performance?
>>
> Actually, there have been some promising developments on the trac-2043
> front. So, maybe 1-3 days of patience could pay off here. But, I'm not in a
> position to promise anything.
>
>> On Dec 11, 2009, at 4:04 AM, Terry Dontje wrote:
>>
>>>> Date: Thu, 10 Dec 2009 17:57:27 -0500
>>>> From: Jeff Squyres <jsquy...@cisco.com>
>>>>
>>>> On Dec 10, 2009, at 5:53 PM, Gus Correa wrote:
>>>>
>>>>>> How does the efficiency of loopback
>>>>>> (let's say, over TCP and over IB) compare with "sm"?
>>>> Definitely not as good; that's why we have sm. :-) I don't have any
>>>> quantification of that assertion, though (i.e., no numbers to back that
>>>> up).
>>>>
>>> However, as Eugene wrote earlier you can actually increase the number of
>>> fifos used by the SM and avoid the hang that way. Unless you are really
>>> strapped for memory I think that would be the best way to go.
_________________________________
Matthew MacManes
PhD Candidate
University of California- Berkeley
Museum of Vertebrate Zoology
Phone: 510-495-5833
Lab Website: http://ib.berkeley.edu/labs/lacey
Personal Website: http://macmanes.com/
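
A minimal sketch of how the two settings compared in this thread could be timed side by side, with a wall-clock limit so that a hang shows up as a timeout instead of blocking forever. The application command (`./my_mpi_app`), the one-hour limit, and the helper names `build_cmd` and `run_timed` are assumptions for illustration, not part of Open MPI or ABySS; `timeout` is the GNU coreutils utility.

```shell
#!/bin/sh
# Sketch only. APP is a placeholder for the real application command line.
APP=${APP:-./my_mpi_app}
# Seconds to wait before a run is treated as hung (assumed value).
LIMIT=${LIMIT:-3600}

# Print the mpirun command line for a given set of MCA options.
build_cmd() {
    printf '%s\n' "mpirun -np 8 $* $APP"
}

# Run one configuration under the wall-clock limit and report elapsed time.
# GNU timeout exits with status 124 when the limit is reached.
run_timed() {
    start=$(date +%s)
    # APP is left unquoted on purpose so it may carry its own arguments.
    timeout "$LIMIT" mpirun -np 8 "$@" $APP
    status=$?
    end=$(date +%s)
    echo "options: $*  exit: $status  seconds: $((end - start))"
}

# The two settings from this thread (uncomment to run):
#   run_timed -mca btl_sm_num_fifos 7
#   run_timed -mca btl '^sm'
```

An exit status of 124 from `run_timed` would indicate that the run hit the limit, which is one way to record the apparent hang; the reported seconds for the two settings give a rough basis for a comparison such as the 15% figure above.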