On Jul 30, 2012, at 15:29, Ralph Castain wrote:

>
> On Jul 30, 2012, at 2:37 AM, George Bosilca wrote:
>
>> I think that as long as there is a single home area per cluster, the
>> difference between the approaches might seem irrelevant to most people.
>
> Yeah, I agree - after thinking about it, it probably didn't accomplish much.
>
>>
>> My problem is twofold. First, I have a common home area across several
>> different development clusters. Thus I have direct access through ssh to any
>> machine. If I create a single large machinefile, it turns out that every
>> mpirun will spawn a daemon on every single node, even if I only run a
>> ping-pong test.
>
> That shouldn't happen if you specify the hosts you want to use, either via
> -host or -hostfile. I assume you are specifying nothing and so you get that
> behavior?
>
>> Second, while I usually run my apps on the same set of resources, I
>> regularly need to switch nodes for a few tests.
>>
>> What I was hoping to achieve is a machinefile containing the "default"
>> development cluster (aka the cluster where I'm almost alone, so my daemons
>> have minimal chances to disturb other people's experience), and then use a
>> machinefile to sporadically change the cluster where I run for smaller tests.
>> Unfortunately, this doesn't work due to the filtering behavior described in
>> my original email.
>
> Why not just set the default hostfile to point to the new machinefile via the
> "--default-hostfile foo" option to mpirun, or you can use the corresponding
> MCA param?

I confirm: if I use --default-hostfile instead of -machinefile, I get the
behavior I expected (it overrides the default).

> I'm not trying to re-open the hostfile discussion, but I would be interested
> to hear how you feel -hostfile should work. I kinda gather you feel it should
> override the default hostfile instead of filter it, yes? My point being that
> I don't particularly know if anyone would disagree with that behavior, so we
> might decide to modify things if you want to propose it.

Right, I would have expected it to work the same way as almost all the other
MCA parameters, with the higher-priority source overriding the lower-priority
ones. But I don't mind typing "--default-hostfile" instead of "-machinefile"
to get the behavior I like.

  george.

>
> Ralph
>
>
>>
>> george.
>>
>>
>> On Jul 28, 2012, at 19:24, Ralph Castain wrote:
>>
>>> It's been a while, but I vaguely remember the discussion. IIRC, the
>>> rationale was that the default hostfile was equivalent to an RM allocation
>>> and should be treated the same. So hostfile and -host become filters in
>>> that case.
>>>
>>> FWIW, I believe the discussion was split on that question. I added a "none"
>>> option to the default hostfile MCA param so it would be ignored in the case
>>> where (a) the sys admin has given a default hostfile, but (b) someone wants
>>> to use hosts outside of it.
>>>
>>> MCA orte: parameter "orte_default_hostfile" (current value:
>>> <none>, data source: default value)
>>> Name of the default hostfile (relative or absolute
>>> path, "none" to ignore environmental or default MCA param setting)
>>>
>>> That said, I can see a use-case argument for behaving somewhat differently.
>>> We've even had cases where users have gotten an allocation from an RM, but
>>> want to add hosts that are external to the cluster to the job.
>>>
>>> It would be rather trivial to modify the logic:
>>>
>>> 1. read the default hostfile or RM allocation for our baseline
>>>
>>> 2. remove any hosts on that list that are *not* in the given hostfile
>>>
>>> 3. add any hosts that are in the given hostfile, but weren't in the default
>>> hostfile
>>>
>>> And subsequently do the same for -host. I think that would retain the
>>> spirit of the discussion, but provide more flexibility and a tad more
>>> "expected" behavior.
>>>
>>> I don't have an iron in this fire as I don't use hostfiles, so I'm happy to
>>> implement whatever the community would like to see.
>>> Ralph
>>>
>>> On Jul 27, 2012, at 6:30 PM, George Bosilca wrote:
>>>
>>>> I'm somewhat puzzled by the behavior of -hostfile in Open MPI. Based
>>>> on the FAQ, it is supposed to provide a list of resources to be used by the
>>>> launcher (in my case ssh) to start the processes. Makes sense so far.
>>>>
>>>> However, if the configuration file contains a value for
>>>> orte_default_hostfile, then the behavior of the hostfile option changes
>>>> drastically, and the option becomes a filter (the machines must be on the
>>>> original list or a cryptic error message is displayed).
>>>>
>>>> Overall, we have a well-defined, [mostly] consistent behavior for
>>>> parameters in Open MPI. We have a clearly defined order of precedence for
>>>> the sources of MCA parameters, which makes it straightforward to understand
>>>> where a value comes from. I'm absolutely certain there was a group
>>>> discussion about this unique "eccentricity" regarding the hostfile option,
>>>> but I fail to remember why we decided to go this way. Can I have a quick
>>>> refresher please?
>>>>
>>>> Thanks,
>>>> george.
>>>>
>>>> _______________________________________________
>>>> devel mailing list
>>>> de...@open-mpi.org
>>>> http://www.open-mpi.org/mailman/listinfo.cgi/devel
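
For readers following the thread, here is a minimal sketch contrasting the two behaviors discussed above: the "filter" semantics George describes (every requested host must already be in the default list) and the three-step logic Ralph outlines (baseline, remove, add). This is not Open MPI code. The function names `filter_hosts` and `resolve_hosts` and the host names are hypothetical, and hostfiles are reduced to plain ordered lists of host names with no slot counts.

```python
def filter_hosts(baseline, requested):
    """Filter semantics as described in the thread (hypothetical helper).

    Every requested host must already be in the baseline (default hostfile
    or RM allocation); otherwise an error is raised.
    """
    baseline_set = set(baseline)
    unknown = [h for h in requested if h not in baseline_set]
    if unknown:
        raise ValueError(f"hosts not in the default hostfile: {unknown}")
    requested_set = set(requested)
    # Keep baseline order, restricted to the requested hosts.
    return [h for h in baseline if h in requested_set]


def resolve_hosts(baseline, requested):
    """Three-step logic proposed in the thread (hypothetical helper).

    1. start from the baseline list
    2. drop baseline hosts that are not in the requested list
    3. append requested hosts that were not in the baseline
    """
    requested_set = set(requested)
    result = []
    seen = set()
    # Steps 1 and 2: walk the baseline, keeping only requested hosts.
    for host in baseline:
        if host in requested_set and host not in seen:
            result.append(host)
            seen.add(host)
    # Step 3: add requested hosts that the baseline did not contain.
    for host in requested:
        if host not in seen:
            result.append(host)
            seen.add(host)
    return result


if __name__ == "__main__":
    default_cluster = ["a1", "a2", "a3"]   # made-up host names
    print(resolve_hosts(default_cluster, ["b1", "a2"]))
    print(filter_hosts(default_cluster, ["a2"]))
```

In this simplified model the three-step result always contains exactly the hosts of the given hostfile, so it acts like an override. What the baseline still contributes is ordering and, in a real implementation, whatever per-host information the default hostfile or RM allocation carries.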