On 2 Jun 2010, at 16:49, Jeff Squyres wrote:
> On Jun 2, 2010, at 11:29 AM, Sylvain Jeaugey wrote:
>
>> But it made me progress on why I'm crashing : in my case, only a subset of
>> processes have their create_cq fail.
>
> Ah, this is the key. If I have one process (out of many) fail the
> create_cq() function, I get a segv during finalize. I'll dig.

Is there an assumption that if process A claims to be able to communicate
with process B, then process B can also communicate with process A? It
almost sounds like the code needs to do an allreduce on the bitmask
returned by the btls.

Ashley,

--
Ashley Pittman, Bath, UK.

Padb - A parallel job inspection tool for cluster computing
http://padb.pittman.org.uk
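
To make the allreduce idea concrete, here is a rough sketch in plain C. It is
not Open MPI code and makes no MPI calls: allreduce_bor() only simulates what
an MPI_Allreduce with MPI_BOR would leave on every rank if each rank
contributed its own reachability bitmask in its own slot and zeros elsewhere.
All names (MAX_PROCS, allreduce_bor, symmetric_mask, demo_symmetric_mask) and
the 3-process scenario are made up for illustration; how the btls really
represent reachability is not assumed here.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_PROCS 32  /* hypothetical limit: one bit per peer in a uint32_t */

/* Simulated allreduce (bitwise OR). contrib[r][i] is what rank r passes in
 * for slot i; afterwards every rank would hold the same out[] table. */
static void allreduce_bor(uint32_t contrib[MAX_PROCS][MAX_PROCS], size_t n,
                          uint32_t out[MAX_PROCS])
{
    assert(n <= MAX_PROCS);
    for (size_t i = 0; i < n; i++) {
        out[i] = 0;
        for (size_t r = 0; r < n; r++)
            out[i] |= contrib[r][i];
    }
}

/* Keep peer p only if "me" claims to reach p AND p claims to reach "me". */
static uint32_t symmetric_mask(const uint32_t table[MAX_PROCS], size_t n,
                               size_t me)
{
    uint32_t mask = 0;
    for (size_t peer = 0; peer < n; peer++) {
        uint32_t me_to_peer = (table[me] >> peer) & 1u;
        uint32_t peer_to_me = (table[peer] >> me) & 1u;
        if (me_to_peer && peer_to_me)
            mask |= (uint32_t)1 << peer;
    }
    return mask;
}

/* Illustrative scenario: ranks 0 and 1 claim they can reach everyone, while
 * rank 2 (imagine its create_cq failed) claims it can reach only itself. */
static uint32_t demo_symmetric_mask(size_t me)
{
    uint32_t contrib[MAX_PROCS][MAX_PROCS] = {{0}};
    uint32_t table[MAX_PROCS];

    contrib[0][0] = 0x7u;  /* rank 0: reaches 0, 1, 2 */
    contrib[1][1] = 0x7u;  /* rank 1: reaches 0, 1, 2 */
    contrib[2][2] = 0x4u;  /* rank 2: reaches only itself */

    allreduce_bor(contrib, 3, table);
    return symmetric_mask(table, 3, me);
}
```

With this, ranks 0 and 1 drop rank 2 from their masks even though each of
them locally believed it was reachable, so all three end up with a consistent
view before any traffic is attempted.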