I personally prefer the way it is now.
This way guarantees me total control over mapping and slot allocation.
When I use the rankfile mapper, I know exactly what I am putting where.
Without that control, the OS can easily oversubscribe my CPU with processes
that the rankfile does not map. I am also not sure how it will affect users
that have schedulers.
I am also not sure that users who are used to working with a hostfile would
change their scripts to suit the mapper.
Lenny.
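
To illustrate the oversubscription concern, here is a minimal sketch (not
Open MPI code) of the kind of check that is only possible when slot counts
come from somewhere other than the rankfile. It assumes rankfile lines of the
form "rank N=host slot=S", as in the example quoted further down this thread.
The names parse_rankfile, check_allocation and slots_per_host are hypothetical;
slots_per_host stands in for whatever a hostfile or scheduler allocation would
provide.

```python
import re
from collections import Counter

# Matches lines like "rank 0=node0 slot=0" (the form used in this thread).
RANK_LINE = re.compile(r"^rank\s+(\d+)\s*=\s*(\S+)\s+slot\s*=\s*(\S+)$")


def parse_rankfile(text):
    """Return a list of (rank, host, slot) tuples from rankfile text."""
    entries = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        m = RANK_LINE.match(line)
        if m is None:
            raise ValueError(f"unrecognized rankfile line: {line!r}")
        entries.append((int(m.group(1)), m.group(2), m.group(3)))
    return entries


def check_allocation(entries, slots_per_host):
    """Report hosts that are unallocated or given more ranks than slots.

    slots_per_host is a hypothetical dict {host: slot_count}; the rankfile
    itself carries no such counts, which is the point being illustrated.
    """
    ranks_per_host = Counter(host for _, host, _ in entries)
    problems = {}
    for host, count in ranks_per_host.items():
        if host not in slots_per_host:
            problems[host] = "not allocated"
        elif count > slots_per_host[host]:
            problems[host] = "oversubscribed"
    return problems


rankfile_text = """rank 0=node0 slot=0
rank 1=node1 slot=0
"""
entries = parse_rankfile(rankfile_text)
print(check_allocation(entries, {"node0": 1}))              # {'node1': 'not allocated'}
print(check_allocation(entries, {"node0": 1, "node1": 1}))  # {}
```

If the rankfile were the only input, slots_per_host would be empty and neither
condition could be detected, so any warning would have to be dropped.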

On Mon, Jun 22, 2009 at 1:23 AM, Ralph Castain <r...@open-mpi.org> wrote:

> Had a chance to think about how this might be done, and looked at it for
> awhile after getting home. I -think- I found a way to do it, but there are a
> couple of caveats:
> 1. Len's point about oversubscribing without warning would definitely hold
> true - this would positively be a "user beware" option
>
> 2. there could be no RM-provided allocation, hostfile, or -host options
> specified. Basically, I would be adding the "read rankfile" option to the
> end of the current allocation determination procedure
>
> I would still allow more procs than shown in the rankfile (mapping the rest
> bynode on the nodes specified in the rankfile - can't do byslot because I
> don't know how many slots are on each node), which means the only change in
> behavior would be the forced bynode mapping of unspecified procs.
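
The forced bynode mapping described above might look roughly like the sketch
below. This is an illustration only, not the rmaps implementation: the function
name map_remaining_bynode is hypothetical, and starting the round-robin at the
first node named in the rankfile is an assumption.

```python
def map_remaining_bynode(rankfile_map, nprocs):
    """Place ranks missing from the rankfile round-robin across its nodes.

    rankfile_map: dict {rank: host} taken from the rankfile.
    nprocs: total number of processes requested (e.g. via -np).
    Returns a dict {rank: host} covering ranks 0..nprocs-1.
    """
    # Nodes in order of first appearance in the rankfile, without duplicates.
    nodes = list(dict.fromkeys(rankfile_map.values()))
    if not nodes:
        raise ValueError("rankfile names no nodes")

    full = dict(rankfile_map)
    unassigned = [r for r in range(nprocs) if r not in full]
    # bynode: one process per node in turn, since slot counts are unknown.
    for i, rank in enumerate(unassigned):
        full[rank] = nodes[i % len(nodes)]
    return full


print(map_remaining_bynode({0: "node0", 1: "node1"}, 5))
# {0: 'node0', 1: 'node1', 2: 'node0', 3: 'node1', 4: 'node0'}
```

A byslot variant would need a slot count per node to know when to move on to
the next node, which is exactly the information a rankfile does not carry.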
>
> So use of this option will entail some risks and a slight difference in
> behavior, but would relieve you from the burden of having to provide a
> hostfile. I'm not personally convinced it is worth the risk and probable
> user complaints of "it didn't work", but since we don't use this option, I
> don't have a strong opinion on the matter.
>
> Let's just avoid going back-and-forth over wanting it, or how it should be
> implemented - let's get it all ironed out, and then implement it once, like
> we finally did at the end with the whole hostfile thing.
>
> Let me know if you want me to do this - it obviously isn't at the top of my
> priority list, but still could be done in the next few weeks.
>
> Ralph
>
>
> On Jun 21, 2009, at 9:00 AM, Lenny Verkhovsky wrote:
>
> Sorry for the delay in responding.
> I totally agree with Ralph that it's not as easy as it seems:
> 1. The rankfile mapper uses machines that are already allocated (by a
> scheduler or hostfile). By using the rankfile as a hostfile we can run into
> a problem where we try to use unallocated nodes, which can hang the run.
> 2. We can't define the number of slots on each machine in the rankfile,
> which means oversubscription can take place without any warning.
> 3. I personally don't see any problem with using a hostfile, even if it has
> redundant info. The hostfile and rankfile belong to different layers in the
> system and solve different problems. The original hostfile (if I recall
> correctly) could bind a rank to a node, but the syntax wasn't very flexible
> or clear.
> Lenny.
>
> On Sun, Jun 21, 2009 at 5:15 PM, Ralph Castain <r...@open-mpi.org> wrote:
>
>> Let me suggest a two-step process, then:
>> 1. let's change the error message as this is easily done and thus can be
>> done now
>>
>> 2. I can look at how to eat the rankfile as a hostfile. This may not even
>> be possible - the problem is that the entire system is predicated on certain
>> ordering due to our framework architecture. So we get an allocation, and
>> then do a mapping against that allocation, filtering the allocation through
>> hostfiles, -host, and other options.
>>
>> By the time we reach the rankfile mapper, we have already determined that
>> we don't have an allocation and have to abort. It is the rankfile mapper
>> itself that looks for the -rankfile option, so the system can have no
>> knowledge that someone has specified that option before that point - and
>> thus, even if I could parse the rankfile, I don't know it was given!
>>
>> What will take time is to figure out a way to either:
>>
>> (a) allow us to run the mapper even though we don't have any nodes we know
>> about, and allow the mapper to insert the nodes itself - without causing
>> non-rankfile uses to break (which could be a major feat); or
>>
>> (b) have the overall system check for the rankfile option and pass it as a
>> hostfile as well, assuming that a hostfile wasn't also given, no RM-based
>> allocation exists, etc. - which breaks our abstraction rules and also opens
>> a possible can of worms.
>>
>> Either way, I also then have to teach the hostfile parser how to realize
>> it is a rankfile format and convert the info in it into what we expected to
>> receive from a hostfile - another non-trivial problem.
>>
>> I'm willing to give it a try - just trying to make clear why my response
>> was negative. It isn't as simple as it sounds...which is why Len and I
>> didn't pursue it when this was originally developed.
>>
>> Ralph
>>
>>
>> On Sun, Jun 21, 2009 at 5:28 AM, Terry Dontje <terry.don...@sun.com> wrote:
>>
>>> Having been part of these discussions, I can understand your reticence to
>>> reopen this one.  However, I think this is a major usability issue with a
>>> feature that is fairly important for getting things to run with good
>>> performance, which IMO matters.
>>>
>>> That being said, I think one of two things could be done to mitigate the
>>> issue:
>>>
>>> 1.  Eliminate the element of surprise by changing mpirun to eat the
>>> rankfile without the hostfile.
>>> 2.  Change the error message to something understandable by users, so
>>> that they know they might be missing the hostfile option.
>>>
>>> Again, I understand this topic is frustrating and there are some
>>> boundaries in the design that make these two options orthogonal to each
>>> other, but I really believe we need to make the rankfile option something
>>> that is easily usable by our users.
>>>
>>>
>>> --td
>>>
>>> Ralph Castain wrote:
>>>
>>>> Having gone around in circles on hostfile-related issues for over five
>>>> years now, I honestly have little motivation to re-open the entire
>>>> discussion again. It doesn't seem to be that daunting a requirement for
>>>> those who are using it, so I'm inclined to just leave well enough alone.
>>>>
>>>> :-)
>>>>
>>>>
>>>> On Fri, Jun 19, 2009 at 2:21 PM, Eugene Loh <eugene....@sun.com> wrote:
>>>>
>>>>    Ralph Castain wrote:
>>>>
>>>>>    The two files have a slightly different format
>>>>>
>>>>    Agreed.
>>>>
>>>>>    and completely different meaning.
>>>>>
>>>>    Somewhat agreed.  They're both related to mapping processes onto a
>>>>    cluster.
>>>>
>>>>>    The hostfile specifies how many slots are on a node. The rankfile
>>>>>    specifies a rank and what node/slot it is to be mapped onto.
>>>>>
>>>>    Agreed.
>>>>
>>>>>    Rankfiles can use relative node indexing and refer to nodes
>>>>>    received from a resource manager - i.e., without any hostfile.
>>>>>
>>>>    This is the main part I'm concerned about.  E.g.,
>>>>
>>>>    % cat rankfile
>>>>    rank 0=node0 slot=0
>>>>    rank 1=node1 slot=0
>>>>    % mpirun -np 2 -rf rankfile ./a.out
>>>>
>>>>  --------------------------------------------------------------------------
>>>>    Rankfile claimed host node1 that was not allocated or
>>>>    oversubscribed it's slots:
>>>>
>>>>
>>>>  --------------------------------------------------------------------------
>>>>    [node0:14611] [[61560,0],0] ORTE_ERROR_LOG: Bad parameter in file
>>>>    rmaps_rank_file.c at line 107
>>>>    [node0:14611] [[61560,0],0] ORTE_ERROR_LOG: Bad parameter in file
>>>>    base/rmaps_base_map_job.c at line 86
>>>>    [node0:14611] [[61560,0],0] ORTE_ERROR_LOG: Bad parameter in file
>>>>    base/plm_base_launch_support.c at line 86
>>>>    [node0:14611] [[61560,0],0] ORTE_ERROR_LOG: Bad parameter in file
>>>>    plm_rsh_module.c at line 1016
>>>>    % mpirun -np 2 -host node0,node1 -rf rankfile ./a.out
>>>>    0 on node0
>>>>    1 on node1
>>>>    done
>>>>
>>>>    It seems to me that the rankfile has sufficient information to
>>>>    express what I want it to do.  But mpirun won't accept this.  To
>>>>    fix this, I have to, e.g., supply/maintain/specify redundant
>>>>    information in a hostfile or host list.
>>>>
>>>>>    So the files are intentionally quite different. Trying to combine
>>>>>    them would be rather ugly.
>>>>>
>>>>    Right.  And my issue is that I'm forced to use both when I only
>>>>    want rankfile functionality.
>>>>
>>>>>    On Thu, Jun 18, 2009 at 1:52 PM, Eugene Loh <eugene....@sun.com> wrote:
>>>>>
>>>>>        In order to use "mpirun --rankfile", I also need to specify
>>>>>        hosts/hostlist.  But that information is redundant with what
>>>>>        I provide in the rankfile.  So, from a user's point of view,
>>>>>        this strikes me as broken.  Yes?  Should I file a ticket, or
>>>>>        am I missing something here about this functionality?
>>>>>
>>>>>
>>>>    _______________________________________________
>>>>    devel mailing list
>>>>    de...@open-mpi.org <mailto:de...@open-mpi.org>
>>>>    http://www.open-mpi.org/mailman/listinfo.cgi/devel
