On Jul 30, 2012, at 2:37 AM, George Bosilca wrote:

> I think that as long as there is a single home area per cluster the
> difference between the different approaches might seem irrelevant to most of
> the people.

Yeah, I agree - after thinking about it, it probably didn't accomplish much.

>
> My problem is twofold. First, I have a common home area across several
> different development clusters. Thus I have direct access through ssh to any
> machine. If I create a single large machinefile, it turns out that every
> mpirun will spawn a daemon on every single node, even if I only run a
> ping-pong test.

That shouldn't happen if you specify the hosts you want to use, either via -host or -hostfile. I assume you are specifying nothing and so you get that behavior?

> Second, while I usually run my apps on the same set of resources, I need on a
> regular basis to switch my nodes for a few tests.
>
> What I was hoping to achieve is a machinefile containing the "default"
> development cluster (aka. the cluster where I'm almost alone so my daemons
> have minimal chances to disturb other people's experience), and then use a
> machinefile to sporadically change the cluster where I run for smaller tests.
> Unfortunately, this doesn't work due to the filtering behavior described in
> my original email.

Why not just set the default hostfile to point to the new machinefile via the "--default-hostfile foo" option to mpirun, or you can use the corresponding MCA param?

I'm not trying to re-open the hostfile discussion, but I would be interested to hear how you feel -hostfile should work. I kinda gather you feel it should override the default hostfile instead of filter it, yes? My point being that I don't particularly know if anyone would disagree with that behavior, so we might decide to modify things if you want to propose it.

Ralph

>
> george.
>
>
> On Jul 28, 2012, at 19:24 , Ralph Castain wrote:
>
>> It's been awhile, but I vaguely remember the discussion. IIRC, the rationale
>> was that the default hostfile was equivalent to an RM allocation and should
>> be treated the same. So hostfile and -host become filters in that case.
>>
>> FWIW, I believe the discussion was split on that question.
>> I added a "none" option to the default hostfile MCA param so it would be
>> ignored in the case where (a) the sys admin has given a default hostfile,
>> but (b) someone wants to use hosts outside of it.
>>
>>   MCA orte: parameter "orte_default_hostfile" (current value:
>>             <none>, data source: default value)
>>             Name of the default hostfile (relative or absolute
>>             path, "none" to ignore environmental or default MCA param setting)
>>
>> That said, I can see a use-case argument for behaving somewhat differently.
>> We've even had cases where users have gotten an allocation from an RM, but
>> want to add hosts that are external to the cluster to the job.
>>
>> It would be rather trivial to modify the logic:
>>
>> 1. read the default hostfile or RM allocation for our baseline
>>
>> 2. remove any hosts on that list that are *not* in the given hostfile
>>
>> 3. add any hosts that are in the given hostfile, but weren't in the default
>>    hostfile
>>
>> And subsequently do the same for -host. I think that would retain the spirit
>> of the discussion, but provide more flexibility and provide a tad more
>> "expected" behavior.
>>
>> I don't have an iron in this fire as I don't use hostfiles, so I'm happy to
>> implement whatever the community would like to see.
>> Ralph
>>
>> On Jul 27, 2012, at 6:30 PM, George Bosilca wrote:
>>
>>> I'm somewhat puzzled by the behavior of the -hostfile in Open MPI. Based on
>>> the FAQ it is supposed to provide a list of resources to be used by the
>>> launcher (in my case ssh) to start the processes. Makes sense so far.
>>>
>>> However, if the configuration file contains a value for
>>> orte_default_hostfile, then the behavior of the hostfile option changes
>>> drastically, and the option becomes a filter (the machines must be on the
>>> original list or a cryptic error message is displayed).
>>>
>>> Overall, we have a well defined [mostly] consistent behavior for parameters
>>> in Open MPI.
>>> We have an order of precedence of sources of MCA parameters,
>>> clearly defined, which makes understanding where a value comes from
>>> straightforward. I'm absolutely certain there was a group discussion about
>>> this unique "eccentricity" regarding the hostfile option, but I fail to
>>> remember what was the reason we decided to go this way. Can I have a quick
>>> refresh please?
>>>
>>> Thanks,
>>> george.
>>>
>>>
>>> _______________________________________________
>>> devel mailing list
>>> [email protected]
>>> http://www.open-mpi.org/mailman/listinfo.cgi/devel
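
For reference, the two behaviors discussed in this thread can be sketched roughly as follows. This is only an illustration in Python, not Open MPI code: the function names filter_hosts and merge_hosts are hypothetical, hosts are treated as plain name strings, and slot counts and the -host pass are ignored. The first function mimics the filtering behavior George describes (a requested host outside the default list is an error); the second follows the three steps Ralph proposes.

```python
def filter_hosts(baseline, requested):
    """Filtering behavior: every requested host must already be in the baseline.

    'baseline' stands for the default hostfile (or RM allocation) and
    'requested' for the hosts given via -hostfile. Both are lists of names.
    """
    baseline_set = set(baseline)
    unknown = [h for h in requested if h not in baseline_set]
    if unknown:
        # Stand-in for the error the thread mentions; the real message differs.
        raise ValueError("hosts not in default hostfile: " + ", ".join(unknown))
    requested_set = set(requested)
    return [h for h in baseline if h in requested_set]


def merge_hosts(baseline, requested):
    """Proposed behavior, following the three steps in the quoted message."""
    # Step 1: start from the default hostfile or RM allocation.
    baseline_set = set(baseline)
    requested_set = set(requested)
    # Step 2: drop baseline hosts that are not in the given hostfile.
    kept = [h for h in baseline if h in requested_set]
    # Step 3: add given hosts that were not in the baseline (no duplicates).
    added = list(dict.fromkeys(h for h in requested if h not in baseline_set))
    return kept + added


if __name__ == "__main__":
    default_hosts = ["node01", "node02", "node03"]
    print(merge_hosts(default_hosts, ["node03", "other01"]))
```

Note that, viewed purely as a set of host names, the proposed logic ends up with exactly the hosts in the given hostfile, which is close to the "override" behavior asked about above; the baseline only influences ordering in this simplified sketch.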
