I have a fresh checkout. In your example, where are your hosts coming from? How do you specify the hostfile?

  george.
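For context, the failing case involves specifying hosts explicitly; a minimal sketch of the two usual ways of doing so (the file name "myhosts" and the node names are placeholders):

    $ cat myhosts                     # a hostfile: one node per line, with a slot count
    node01 slots=8
    node02 slots=8
    $ mpirun -np 2 -bynode -hostfile myhosts -mca plm rsh hostname
    $ mpirun -np 2 -bynode -host node01,node02 -mca plm rsh hostname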
On Nov 17, 2011, at 19:06, Ralph Castain wrote:

> Hmmm...well, things seem to work just fine for me:
>
> [rhc@odin ~/ompi-hwloc]$ mpirun -np 2 -bynode -mca plm rsh hostname
> odin090.cs.indiana.edu
> odin091.cs.indiana.edu
>
> [rhc@odin mpi]$ mpirun -np 2 -bynode -mca plm rsh ./hello_nodename
> Hello, World, I am 1 of 2 on host odin091.cs.indiana.edu from app number 0 universe size 8
> Hello, World, I am 0 of 2 on host odin090.cs.indiana.edu from app number 0 universe size 8
>
> I'll get a fresh checkout and see if I can replicate from that...
>
> On Nov 17, 2011, at 7:42 PM, George Bosilca wrote:
>
>> I guess I reached one of those corner cases that didn't get tested. I can't start any apps (not even a hostname) after this commit using the rsh PLM, as soon as I add a hostfile. The mpirun is blocked in an infinite loop (after it has spawned the daemons) in orte_rmaps_base_compute_vpids. Attaching with gdb indicates that cnt is never incremented, so mpirun is stuck forever in the while loop at line 397.
>>
>> I used "mpirun -np 2 --bynode ./tp_lb_ub_ng" to start my application, and I have a machine file containing two nodes:
>>
>> node01 slots=8
>> node02 slots=8
>>
>> In addition, CTRL+C seems to be broken …
>>
>>   george.
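To see where it spins, one can attach a debugger to the stuck mpirun; a rough sketch, assuming the same two-node machine file (the pgrep lookup and the frame number are illustrative):

    $ mpirun -np 2 --bynode ./tp_lb_ub_ng &    # hangs after the daemons are spawned
    $ gdb -p $(pgrep -n mpirun)                # attach to the blocked mpirun
    (gdb) bt                                   # backtrace ends in orte_rmaps_base_compute_vpids
    (gdb) frame 0                              # or whichever frame is compute_vpids
    (gdb) print cnt                            # never incremented, so the while loop never exits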
>> Begin forwarded message:
>>
>>> Author: rhc
>>> Date: 2011-11-14 22:40:11 EST (Mon, 14 Nov 2011)
>>> New Revision: 25476
>>> URL: https://svn.open-mpi.org/trac/ompi/changeset/25476
>>>
>>> Log:
>>> At long last, the fabled revision to the affinity system has arrived. A more detailed explanation of how this all works will be presented here:
>>>
>>> https://svn.open-mpi.org/trac/ompi/wiki/ProcessPlacement
>>>
>>> The wiki page is incomplete at the moment, but I hope to complete it over the next few days. I will provide updates on the devel list. As the wiki page states, the default and most commonly used options remain unchanged (except as noted below). New, esoteric, and complex options have been added, but unless you are a true masochist, you are unlikely to use many of them beyond perhaps an initial curiosity-motivated experimentation.
>>>
>>> In a nutshell, this commit revamps the map/rank/bind procedure to take into account topology info on the compute nodes. I have, for the most part, preserved the default behaviors, with three notable exceptions:
>>>
>>> 1. I have at long last bowed my head in submission to the system admins of managed clusters. For years, they have complained about our default of allowing users to oversubscribe nodes - i.e., to run more processes on a node than allocated slots. Accordingly, I have modified the default behavior: if you are running off of hostfile/dash-host allocated nodes, then the default is to allow oversubscription. If you are running off of RM-allocated nodes, then the default is to NOT allow oversubscription. Flags to override these behaviors are provided (see the sketch below this message), so this only affects the default behavior.
>>>
>>> 2. Both cpus/rank and stride have been removed. The latter was demanded by those who didn't understand the purpose behind it - and I agreed, as the users who requested it are no longer using it. The former was removed temporarily pending implementation.
>>>
>>> 3. vm launch is now the sole method for starting OMPI. It was just too darned hard to maintain multiple launch procedures - maybe someday, provided someone can demonstrate a reason to do so.
>>>
>>> As Jeff stated, it is impossible to fully test a change of this size. I have tested it on Linux and Mac, covering all the default and simple options, singletons, and comm_spawn. That said, I'm sure others will find problems, so I'll be watching MTT results until this stabilizes.
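On the default change in point 1 of the commit log: the override flags are presumably the oversubscribe/nooversubscribe pair; a sketch, assuming the flag names match the mpirun man page on this revision (myhosts and ./a.out are placeholders, with 16 total slots in the hostfile):

    $ mpirun -np 32 -hostfile myhosts --nooversubscribe ./a.out   # refuse to exceed the allocated slots
    $ mpirun -np 32 --oversubscribe ./a.out                       # allow oversubscription on RM-allocated nodes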