I personally prefer the way it is now. It guarantees me total control over mapping and allocating slots. When I use the rankfile mapper, I know exactly what I am putting where, whereas the OS can easily oversubscribe my CPUs with processes that are not mapped by the rankfile. I am also not sure how it would affect users who have schedulers, or whether users who are used to working with a hostfile would change their scripts to suit the mapper.
Lenny.
On Mon, Jun 22, 2009 at 1:23 AM, Ralph Castain <r...@open-mpi.org> wrote:

> Had a chance to think about how this might be done, and looked at it for
> awhile after getting home. I -think- I found a way to do it, but there are a
> couple of caveats:
>
> 1. Len's point about oversubscribing without warning would definitely hold
> true - this would positively be a "user beware" option
>
> 2. there could be no RM-provided allocation, hostfile, or -host options
> specified. Basically, I would be adding the "read rankfile" option to the
> end of the current allocation determination procedure
>
> I would still allow more procs than shown in the rankfile (mapping the rest
> bynode on the nodes specified in the rankfile - can't do byslot because I
> don't know how many slots are on each node), which means the only change in
> behavior would be the forced bynode mapping of unspecified procs.
>
> So use of this option will entail some risks and a slight difference in
> behavior, but would relieve you from the burden of having to provide a
> hostfile. I'm not personally convinced it is worth the risk and probable
> user complaints of "it didn't work", but since we don't use this option, I
> don't have a strong opinion on the matter.
>
> Let's just avoid going back-and-forth over wanting it, or how it should be
> implemented - let's get it all ironed out, and then implement it once, like
> we finally did at the end with the whole hostfile thing.
>
> Let me know if you want me to do this - it obviously isn't at the top of my
> priority list, but still could be done in the next few weeks.
>
> Ralph
>
>
> On Jun 21, 2009, at 9:00 AM, Lenny Verkhovsky wrote:
>
> Sorry for the delay in response,
> I totally agree with Ralph that it's not as easy as it seems,
> 1. rankfile mapper uses already allocated machines ( by scheduler or
> hostfile ), by using rankfile as a hostfile we can run into problem where
> trying to use unallocated nodes, what can hang the run.
> 2. we can't define in rankfile number of slots on each machine, which means
> oversubscribing can take place without any warning.
> 3. I personally dont see any problem using hostfile, even if it has
> redundant info, hostfile and rankfile belong to different layers in the
> system and solve different problems. The original hostfile ( if I recall
> correctly ) could bind rank to the node, but the syntax wasn't very flexible
> and clear.
> Lenny.
>
> On Sun, Jun 21, 2009 at 5:15 PM, Ralph Castain <r...@open-mpi.org> wrote:
>
>> Let me suggest a two-step process, then:
>> 1. let's change the error message as this is easily done and thus can be
>> done now
>>
>> 2. I can look at how to eat the rankfile as a hostfile. This may not even
>> be possible - the problem is that the entire system is predicated on certain
>> ordering due to our framework architecture. So we get an allocation, and
>> then do a mapping against that allocation, filtering the allocation through
>> hostfiles, -host, and other options.
>>
>> By the time we reach the rankfile mapper, we have already determined that
>> we don't have an allocation and have to abort. It is the rankfile mapper
>> itself that looks for the -rankfile option, so the system can have no
>> knowledge that someone has specified that option before that point - and
>> thus, even if I could parse the rankfile, I don't know it was given!
>>
>> What will take time is to figure out a way to either:
>>
>> (a) allow us to run the mapper even though we don't have any nodes we know
>> about, and allow the mapper to insert the nodes itself - without causing
>> non-rankfile uses to break (which could be a major feat); or
>>
>> (b) have the overall system check for the rankfile option and pass it as a
>> hostfile as well, assuming that a hostfile wasn't also given, no RM-based
>> allocation exists, etc. - which breaks our abstraction rules and also opens
>> a possible can of worms.
>>
>> Either way, I also then have to teach the hostfile parser how to realize
>> it is a rankfile format and convert the info in it into what we expected to
>> receive from a hostfile - another non-trivial problem.
>>
>> I'm willing to give it a try - just trying to make clear why my response
>> was negative. It isn't as simple as it sounds...which is why Len and I
>> didn't pursue it when this was originally developed.
>>
>> Ralph
>>
>>
>> On Sun, Jun 21, 2009 at 5:28 AM, Terry Dontje <terry.don...@sun.com> wrote:
>>
>>> Being a part of these discussions I can understand your reticence to
>>> reopen this discussion. However, I think this is a major usability issue
>>> with this feature which actually is fairly important in order to get things
>>> to run performant. Which IMO is important.
>>>
>>> That being said I think there are one of two things that could be done to
>>> mitigate the issue.
>>>
>>> 1. To eliminate the element of surprise by changing mpirun to eat
>>> rankfile without the hostfile.
>>> 2. To change the error message to something understandable by the user
>>> such that they know they might be missing the hostfile option.
>>>
>>> Again I understand this topic is frustrating and there are some
>>> boundaries with the design that make these two option orthogonal to each
>>> other but I really believe we need to make the rankfile option something
>>> that is easily usable by our users.
>>>
>>>
>>> --td
>>>
>>> Ralph Castain wrote:
>>>
>>>> Having gone around in circles on hostfile-related issues for over five
>>>> years now, I honestly have little motivation to re-open the entire
>>>> discussion again. It doesn't seem to be that daunting a requirement for
>>>> those who are using it, so I'm inclined to just leave well enough alone.
>>>>
>>>> :-)
>>>>
>>>>
>>>> On Fri, Jun 19, 2009 at 2:21 PM, Eugene Loh <eugene....@sun.com> wrote:
>>>>
>>>> Ralph Castain wrote:
>>>>
>>>>> The two files have a slightly different format
>>>>>
>>>> Agreed.
>>>>
>>>>> and completely different meaning.
>>>>>
>>>> Somewhat agreed. They're both related to mapping processes onto a
>>>> cluster.
>>>>
>>>>> The hostfile specifies how many slots are on a node. The rankfile
>>>>> specifies a rank and what node/slot it is to be mapped onto.
>>>>>
>>>> Agreed.
>>>>
>>>>> Rankfiles can use relative node indexing and refer to nodes
>>>>> received from a resource manager - i.e., without any hostfile.
>>>>>
>>>> This is the main part I'm concerned about. E.g.,
>>>>
>>>> % cat rankfile
>>>> rank 0=node0 slot=0
>>>> rank 1=node1 slot=0
>>>> % mpirun -np 2 -rf rankfile ./a.out
>>>>
>>>> --------------------------------------------------------------------------
>>>> Rankfile claimed host node1 that was not allocated or
>>>> oversubscribed it's slots:
>>>>
>>>>
>>>> --------------------------------------------------------------------------
>>>> [node0:14611] [[61560,0],0] ORTE_ERROR_LOG: Bad parameter in file
>>>> rmaps_rank_file.c at line 107
>>>> [node0:14611] [[61560,0],0] ORTE_ERROR_LOG: Bad parameter in file
>>>> base/rmaps_base_map_job.c at line 86
>>>> [node0:14611] [[61560,0],0] ORTE_ERROR_LOG: Bad parameter in file
>>>> base/plm_base_launch_support.c at line 86
>>>> [node0:14611] [[61560,0],0] ORTE_ERROR_LOG: Bad parameter in file
>>>> plm_rsh_module.c at line 1016
>>>> % mpirun -np 2 -host node0,node1 -rf rankfile ./a.out
>>>> 0 on node0
>>>> 1 on node1
>>>> done
>>>>
>>>> It seems to me that the rankfile has sufficient information to
>>>> express what I want it to do. But mpirun won't accept this. To
>>>> fix this, I have to, e.g., supply/maintain/specify redundant
>>>> information in a hostfile or host list.
>>>>
>>>>> So the files are intentionally quite different. Trying to combine
>>>>> them would be rather ugly.
>>>>>
>>>> Right. And my issue is that I'm forced to use both when I only
>>>> want rankfile functionality.
>>>>
>>>>> On Thu, Jun 18, 2009 at 1:52 PM, Eugene Loh <eugene....@sun.com> wrote:
>>>>>
>>>>> In order to use "mpirun --rankfile", I also need to specify
>>>>> hosts/hostlist. But that information is redundant with what
>>>>> I provide in the rankfile. So, from a user's point of view,
>>>>> this strikes me as broken. Yes? Should I file a ticket, or
>>>>> am I missing something here about this functionality?
>>>>>
>>>>>
>>>> _______________________________________________
>>>> devel mailing list
>>>> de...@open-mpi.org
>>>> http://www.open-mpi.org/mailman/listinfo.cgi/devel
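
For anyone who wants to avoid maintaining the host list by hand in the meantime, the workaround Eugene shows in the quoted thread (passing -host alongside -rf) can be generated from the rankfile itself. What follows is only a minimal sketch, not anything that exists in Open MPI: the function names hosts_from_rankfile and mpirun_command are made up for illustration, and it assumes every rankfile line has the literal "rank N=host slot=S" form shown in the thread. Relative node indexing is not handled, and, as point 2 of Lenny's earlier mail warns, nothing here knows how many slots each node has, so it cannot protect against oversubscription.

```python
import re

# Assumption: every mapping line looks like "rank 0=node0 slot=0",
# the form shown in the thread. Other rankfile syntax is not handled.
RANK_LINE = re.compile(r"^\s*rank\s+(\d+)\s*=\s*(\S+)\s+slot=(\S+)\s*$")


def hosts_from_rankfile(text):
    """Return the distinct host names in a rankfile, in first-seen order."""
    hosts = []
    for line in text.splitlines():
        match = RANK_LINE.match(line)
        if match and match.group(2) not in hosts:
            hosts.append(match.group(2))
    return hosts


def mpirun_command(rankfile_path, rankfile_text, program):
    """Build an mpirun argument list (hypothetical helper, illustration only).

    Uses only the options that appear in the thread: -np, -host and -rf.
    One process is started per rank line found in the rankfile.
    """
    rank_lines = [l for l in rankfile_text.splitlines() if RANK_LINE.match(l)]
    hosts = hosts_from_rankfile(rankfile_text)
    return ["mpirun", "-np", str(len(rank_lines)),
            "-host", ",".join(hosts),
            "-rf", rankfile_path, program]


if __name__ == "__main__":
    example = "rank 0=node0 slot=0\nrank 1=node1 slot=0\n"
    # Prints: mpirun -np 2 -host node0,node1 -rf rankfile ./a.out
    print(" ".join(mpirun_command("rankfile", example, "./a.out")))
```

With the two-line rankfile from the thread this reproduces the command line that Eugene reports as working. It only removes the duplicated typing; the allocation still comes from -host, so the ordering constraints Ralph describes are untouched.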