On Jul 30, 2012, at 15:29 , Ralph Castain wrote:

> 
> On Jul 30, 2012, at 2:37 AM, George Bosilca wrote:
> 
>> I think that as long as there is a single home area per cluster, the 
>> difference between the approaches might seem irrelevant to most 
>> people.
> 
> Yeah, I agree - after thinking about it, it probably didn't accomplish much.
> 
>> 
>> My problem is twofold. First, I have a common home area across several 
>> different development clusters. Thus I have direct access through ssh to any 
>> machine. If I create a single large machinefile, it turns out that every 
>> mpirun will spawn a daemon on every single node, even if I only run a 
>> ping-pong test.
> 
> That shouldn't happen if you specify the hosts you want to use, either via 
> -host or -hostfile. I assume you are specifying nothing and so you get that 
> behavior?
> 
>> Second, while I usually run my apps on the same set of resources, I 
>> regularly need to switch nodes for a few tests.
>> 
>> What I was hoping to achieve is a machinefile containing the "default" 
>> development cluster (aka the cluster where I'm almost alone, so my daemons 
>> have minimal chance of disturbing other people's experience), and then to use 
>> a machinefile to sporadically change the cluster where I run for smaller 
>> tests. Unfortunately, this doesn't work due to the filtering behavior 
>> described in my original email.
> 
> Why not just set the default hostfile to point to the new machinefile via the 
> "--default-hostfile foo" option to mpirun, or you can use the corresponding 
> MCA param?

I confirm: if I use --default-hostfile instead of -machinefile, I get the 
behavior I expected (it overrides the default).

> I'm not trying to re-open the hostfile discussion, but I would be interested 
> to hear how you feel -hostfile should work. I kinda gather you feel it should 
> override the default hostfile instead of filter it, yes? My point being that 
> I don't particularly know if anyone would disagree with that behavior, so we 
> might decide to modify things if you want to propose it.

Right, I would have expected it to work the same way as almost all the other 
MCA parameters, by overriding the variants with lower priority. But I 
don't mind typing "--default-hostfile" instead of "-machinefile" to get the 
behavior I like.
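
For what it's worth, here is a minimal Python sketch of the three semantics discussed in this thread: the current filtering, the override I expected, and the three-step merge Ralph outlines in the quoted message below. It is an illustration only, not Open MPI code. The function names are hypothetical, and hosts are modeled as plain names, ignoring slot counts and anything else a real hostfile carries.

```python
def filter_hosts(default_hosts, given_hosts):
    """Filtering semantics: the given list may only select hosts that
    already appear in the default list; anything else is an error."""
    unknown = [h for h in given_hosts if h not in default_hosts]
    if unknown:
        raise ValueError(f"hosts not in the default hostfile: {unknown}")
    return [h for h in default_hosts if h in given_hosts]


def override_hosts(default_hosts, given_hosts):
    """Override semantics: a given list replaces the default entirely;
    the default is used only when nothing is given."""
    return list(given_hosts) if given_hosts else list(default_hosts)


def merge_hosts(default_hosts, given_hosts):
    """The three-step proposal from the quoted message."""
    baseline = list(default_hosts)                          # step 1: baseline
    kept = [h for h in baseline if h in given_hosts]        # step 2: drop hosts not given
    added = [h for h in given_hosts if h not in baseline]   # step 3: add new hosts
    return kept + added


if __name__ == "__main__":
    default = ["a1", "a2", "a3"]        # hypothetical default cluster
    print(override_hosts(default, ["a2", "b1"]))
    print(merge_hosts(default, ["b1", "a2"]))
```

In this simplified model the merge ends up with the same set of hosts as the override, just ordered with the baseline hosts first; whether the baseline would contribute anything more in a real implementation is not something this sketch tries to answer.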

  george.

> 
> Ralph
> 
> 
>> 
>> george.
>> 
>> 
>> On Jul 28, 2012, at 19:24 , Ralph Castain wrote:
>> 
>>> It's been a while, but I vaguely remember the discussion. IIRC, the 
>>> rationale was that the default hostfile was equivalent to an RM allocation 
>>> and should be treated the same. So hostfile and -host become filters in 
>>> that case.
>>> 
>>> FWIW, I believe the discussion was split on that question. I added a "none" 
>>> option to the default hostfile MCA param so it would be ignored in the case 
>>> where (a) the sys admin has given a default hostfile, but (b) someone wants 
>>> to use hosts outside of it.
>>> 
>>>              MCA orte: parameter "orte_default_hostfile" (current value: 
>>> <none>, data source: default value)
>>>                        Name of the default hostfile (relative or absolute 
>>> path, "none" to ignore environmental or default MCA param setting)
>>> 
>>> That said, I can see a use-case argument for behaving somewhat differently. 
>>> We've even had cases where users have gotten an allocation from an RM, but 
>>> want to add hosts that are external to the cluster to the job.
>>> 
>>> It would be rather trivial to modify the logic:
>>> 
>>> 1. read the default hostfile or RM allocation for our baseline
>>> 
>>> 2. remove any hosts on that list that are *not* in the given hostfile
>>> 
>>> 3. add any hosts that are in the given hostfile, but weren't in the default 
>>> hostfile
>>> 
>>> And subsequently do the same for -host. I think that would retain the 
>>> spirit of the discussion, but provide more flexibility and provide a tad 
>>> more "expected" behavior.
>>> 
>>> I don't have an iron in this fire as I don't use hostfiles, so I'm happy to 
>>> implement whatever the community would like to see.
>>> Ralph
>>> 
>>> On Jul 27, 2012, at 6:30 PM, George Bosilca wrote:
>>> 
>>>> I'm somewhat puzzled by the behavior of the -hostfile option in Open MPI. 
>>>> Based on the FAQ, it is supposed to provide a list of resources to be used 
>>>> by the launcher (in my case ssh) to start the processes. Makes sense so far.
>>>> 
>>>> However, if the configuration file contains a value for 
>>>> orte_default_hostfile, then the behavior of the hostfile option changes 
>>>> drastically, and the option becomes a filter (the machines must be on the 
>>>> original list or a cryptic error message is displayed).
>>>> 
>>>> Overall, we have a well-defined, [mostly] consistent behavior for 
>>>> parameters in Open MPI. We have a clearly defined order of precedence for 
>>>> the sources of MCA parameters, which makes understanding where a value 
>>>> comes from straightforward. I'm absolutely certain there was a group 
>>>> discussion about this unique "eccentricity" regarding the hostfile option, 
>>>> but I fail to remember the reason we decided to go this way. Can I have a 
>>>> quick refresher, please?
>>>> 
>>>> Thanks,
>>>> george.
>>>> 
>>>> 
>>>> _______________________________________________
>>>> devel mailing list
>>>> de...@open-mpi.org
>>>> http://www.open-mpi.org/mailman/listinfo.cgi/devel