Re: [OMPI devel] Change in hostfile behavior

2008-07-29 Thread Lenny Verkhovsky
For two separate runs, we can use the slot_list parameter (opal_paffinity_base_slot_list) to have paffinity. 1: mpirun -mca opal_paffinity_base_slot_list "0-1" 2: mpirun -mca opal_paffinity_base_slot_list "2-3" On 7/28/08, Ralph Castain wrote: > > Actually, this is true today regardless of this cha

Re: [OMPI devel] trunk hangs since r19010

2008-07-29 Thread Pavel Shamis (Pasha)
Jeff Squyres wrote: This used to be true, but I think we changed it a while ago (Pasha: do you remember?) because Mellanox HCAs are capable of send-to-self (process) and there were no code changes necessary to enable it. So it allowed a slightly simpler command line. This was quite a while

Re: [OMPI devel] Change in hostfile behavior

2008-07-29 Thread Ralph Castain
Lenny's point is true - except for the danger of setting that mca param and its possible impact on ORTE daemons+mpirun - see other note in that regard. However, it would only be useful if the same user was doing it. I believe Tim was concerned about the case where two users are sharing no

Re: [OMPI devel] Change in hostfile behavior

2008-07-29 Thread Jeff Squyres
On Jul 29, 2008, at 8:43 AM, Ralph Castain wrote: Lenny's point is true - except for the danger of setting that mca param and its possible impact on ORTE daemons+mpirun - see other note in that regard. However, it would only be useful if the same user was doing it. I believe Tim was conce

Re: [OMPI devel] trunk hangs since r19010

2008-07-29 Thread George Bosilca
I ran a few tests, and the only combination leading to a deadlock is openib and self. As openib is the only BTL supporting self communications (except self, of course), I guess it interferes with self in some more or less strange ways. I didn't have the time to dig deeper yet to see what exactly

Re: [OMPI devel] trunk hangs since r19010

2008-07-29 Thread Jeff Squyres
Ok. FWIW, Pasha and I think that openib has supported "send-to-self" for a while (we don't know exactly when; but Pasha thinks it is very old code that we don't check for self in add_procs). But it only broke recently. On Jul 29, 2008, at 9:31 AM, George Bosilca wrote: I ran few tests a

Re: [OMPI devel] trunk hangs since r19010

2008-07-29 Thread Jeff Squyres
On Jul 29, 2008, at 9:47 AM, Jeff Squyres wrote: Ok. FWIW, Pasha and I think that openib has supported "send-to-self" for a while (we don't know exactly when; but Pasha thinks it is very old code that we don't check for self in add_procs). But it only broke recently. More in the FWIW c

[OMPI devel] ticket #972

2008-07-29 Thread Terry Dontje
So, we've pinged ticket #972 several times to see if the issue it covers has been fixed and have not really gotten a response in the last few months. While talking with Jeff about a recent thread on the users list about this issue, he has found the code in btl_tcp_proc.c that determines whether

[OMPI devel] TCP BTL routability (was: ticket #972)

2008-07-29 Thread Jeff Squyres
On Jul 29, 2008, at 3:20 PM, Terry Dontje wrote: So, we've pinged ticket #972 several times to see if the issue it covers has been fixed and have not really gotten a response in the last few months. While talking with Jeff about a recent thread on the users list about this issue he has fou

Re: [OMPI devel] TCP BTL routability (was: ticket #972)

2008-07-29 Thread Adrian Knoth
On Tue, Jul 29, 2008 at 03:25:00PM -0400, Jeff Squyres wrote: > For reference, the FAQ entry is here: > > http://www.open-mpi.org/faq/?category=tcp#tcp-routability > > It looks like we now *always* assume that two TCP peers are routable. As long as they share the same address family (IPv