[OMPI devel] Retrying a MPI_SEND

2011-11-17 Thread Hugo Daniel Meyer
Hello all. I'm making some changes in the communication framework. Right now I'm working on a "secure" MPI_Send; this send needs to know when an endpoint goes down and then retry the communication by constructing a new endpoint, or at least by overwriting the data of the old endpoint with the new

[OMPI devel] Rename "vader" BTL to "xpmem"

2011-11-17 Thread Jeff Squyres
After having to explain to someone at SC for the umpteenth time this week that the "vader" BTL uses the XPMEM transport under the covers, I'd like to put forth an appeal to rename the "vader" BTL to be "xpmem." Here's my rationale for why: 1. Although we have a history of Star Wars-related

Re: [OMPI devel] Rename "vader" BTL to "xpmem"

2011-11-17 Thread TERRY DONTJE
+1 Isn't there precedent with the other BTLs to name them based on the messaging protocol they are supporting instead of some movie character (tcp, openib, shmem, portals, ...)? --td On 11/17/2011 8:11 AM, Jeff Squyres wrote: After having to explain to someone at SC for the umpteenth time

Re: [OMPI devel] Rename "vader" BTL to "xpmem"

2011-11-17 Thread Ralph Castain
Frankly, the only vote that counts is Nathan's - it's his btl, and we have never forcibly made someone rename their component. I would suggest we not set that precedent. I'm comfortable with whatever he decides to call it. On Nov 17, 2011, at 7:00 AM, TERRY DONTJE wrote: > +1 > > Isn't there

Re: [OMPI devel] Rename "vader" BTL to "xpmem"

2011-11-17 Thread TERRY DONTJE
I could possibly buy your argument, Ralph, if this were a one-off BTL that only Nathan (and his employer) is going to use. I am assuming, though, that this is a more general protocol for a vendor-specific protocol. Thus it seems that a sane naming of the BTL is within the realm of the community. That

Re: [OMPI devel] Rename "vader" BTL to "xpmem"

2011-11-17 Thread TERRY DONTJE
On 11/17/2011 9:54 AM, Ralph Castain wrote: On Nov 17, 2011, at 7:45 AM, TERRY DONTJE wrote: I could possibly buy your argument, Ralph, if this were a one-off BTL that only Nathan (and his employer) is going to use. I am assuming, though, that this is a more general protocol for a vendor-specific

Re: [OMPI devel] [EXTERNAL] Re: Rename "vader" BTL to "xpmem"

2011-11-17 Thread Graham, Richard L.
I have got to say I like the name ... On Nov 17, 2011, at 11:34 AM, Barrett, Brian W wrote: > On 11/17/11 6:29 AM, "Ralph Castain" wrote: > >> Frankly, the only vote that counts is Nathan's - it's his btl, and we >> have never forcibly made someone rename their component. I

[OMPI devel] Fwd: [OMPI svn] svn:open-mpi r25476

2011-11-17 Thread George Bosilca
I guess I reached one of those corner cases that didn't get tested. I can't start any apps (not even a hostname) after this commit using the rsh PLM (as soon as I add a hostfile). The mpirun is blocked in an infinite loop (after it spawned the daemons) in orte_rmaps_base_compute_vpids. Attaching

Re: [OMPI devel] Fwd: [OMPI svn] svn:open-mpi r25476

2011-11-17 Thread Ralph Castain
I'll take a look - I tested that case, and the trunk appears to be working on all the MTT runs. I'll have to see if I can replicate it. On Nov 17, 2011, at 7:42 PM, George Bosilca wrote: > I guess I reached one of those corner cases that didn't get tested. I can't > start any apps (not even a

Re: [OMPI devel] Fwd: [OMPI svn] svn:open-mpi r25476

2011-11-17 Thread Ralph Castain
Hmmm...well, things seem to work just fine for me: [rhc@odin ~/ompi-hwloc]$ mpirun -np 2 -bynode -mca plm rsh hostname odin090.cs.indiana.edu odin091.cs.indiana.edu [rhc@odin mpi]$ mpirun -np 2 -bynode -mca plm rsh ./hello_nodename Hello, World, I am 1 of 2 on host odin091.cs.indiana.edu from

Re: [OMPI devel] [OMPI svn] svn:open-mpi r25476

2011-11-17 Thread George Bosilca
I have a fresh checkout. In your example, where are your hosts coming from? How do you specify the hostfile? george. On Nov 17, 2011, at 19:06 , Ralph Castain wrote: > Hmmm...well, things seem to work just fine for me: > > [rhc@odin ~/ompi-hwloc]$ mpirun -np 2 -bynode -mca plm rsh hostname >

Re: [OMPI devel] [OMPI svn] svn:open-mpi r25476

2011-11-17 Thread Ralph Castain
On Nov 17, 2011, at 8:13 PM, George Bosilca wrote: > I have a fresh checkout. In your example, where are your hosts coming from? > How do you specify the hostfile? The hosts are coming from the slurm allocation, though I also tried adding -host arguments. The error you describe comes well after

Re: [OMPI devel] [OMPI svn] svn:open-mpi r25476

2011-11-17 Thread George Bosilca
Maybe the issue is caused by how the hostfile is specified. I used orte_default_hostfile= in my mca-params.conf. george. On Nov 17, 2011, at 19:17 , Ralph Castain wrote: > I'm still building on odin, but will check there again to see if I can > replicate - perhaps something didn't get

Re: [OMPI devel] [OMPI svn] svn:open-mpi r25476

2011-11-17 Thread Ralph Castain
I can't get it to fail, even with hostfile arguments. I'll try again in the morning. On Nov 17, 2011, at 8:49 PM, George Bosilca wrote: > Maybe the issue is caused by how the hostfile is specified. I used > orte_default_hostfile= in my mca-params.conf. > > george. > > On Nov 17, 2011, at