Since my goal is to eliminate the modex completely for managed installations, could you give me a brief understanding of this eventual PML selection logic? It would help to hear an example of how and why different procs could get different answers - and why we would want to allow them to do so.
Thanks
Ralph

On 6/23/08 11:59 AM, "Aurélien Bouteiller" <boute...@eecs.utk.edu> wrote:

> The first approach sounds fair enough to me. We should avoid 2 and 3
> as the pml selection mechanism used to be more complex before we
> reduced it to accommodate a major design bug in the BTL selection
> process. When using the complete PML selection, BTL would be
> initialized several times, leading to a variety of bugs. Eventually
> the PML selection should return to its old self, when the BTL bug
> gets fixed.
>
> Aurelien
>
> On 23 June 08 at 12:36, Ralph H Castain wrote:
>
>> Yo all
>>
>> I've been doing further research into the modex and came across
>> something I don't fully understand. It seems we have each process
>> insert into the modex the name of the PML module that it selected.
>> Once the modex has exchanged that info, it then loops across all
>> procs in the job to check their selection, and aborts if any proc
>> picked a different PML module.
>>
>> All well and good...assuming that procs actually -can- choose
>> different PML modules and hence create an "abort" scenario. However,
>> if I look inside the PML's at their selection logic, I find that a
>> proc can ONLY pick a module other than ob1 if:
>>
>> 1. the user specifies the module to use via -mca pml xyz or by using
>> a module specific mca param to adjust its priority. In this case,
>> since the mca param is propagated, ALL procs have no choice but to
>> pick that same module, so that can't cause us to abort (we will have
>> already returned an error and aborted if the specified module can't
>> run).
>>
>> 2. the pml/cm module detects that an MTL module was selected, and
>> that it is other than "psm". In this case, the CM module will be
>> selected because its default priority is higher than that of OB1.
>>
>> In looking deeper into the MTL selection logic, it appears to me
>> that you either have the required capability or you don't. I can see
>> that in some environments (e.g., rsh across unmanaged collections of
>> machines), it might be possible for someone to launch across a set
>> of machines where some do and some don't have the required support.
>> However, in all other cases, this will be homogeneous across the
>> system.
>>
>> Given this analysis (and someone more familiar with the PML should
>> feel free to confirm or correct it), it seems to me that this could
>> be streamlined via one or more means:
>>
>> 1. at the most, we could have rank=0 add the PML module name to the
>> modex, and other procs simply check it against their own and return
>> an error if they differ. This accomplishes the identical
>> functionality to what we have today, but with much less info in the
>> modex.
>>
>> 2. we could eliminate this info from the modex altogether by
>> requiring the user to specify the PML module if they want something
>> other than the default OB1. In this case, there can be no confusion
>> over what each proc is to use. The CM module will attempt to init
>> the MTL - if it cannot do so, then the job will return the correct
>> error and tell the user that CM/MTL support is unavailable.
>>
>> 3. we could again eliminate the info by not inserting it into the
>> modex if (a) the default PML module is selected, or (b) the user
>> specified the PML module to be used. In the first case, each proc
>> can simply check to see if they picked the default - if not, then we
>> can insert the info to indicate the difference. Thus, in the
>> "standard" case, no info will be inserted.
>>
>> In the second case, we will already get an error if the specified
>> PML module could not be used. Hence, the modex check provides no
>> additional info or value.
>>
>> I understand the motivation to support automation. However, in this
>> case, the automation actually doesn't seem to buy us very much, and
>> it isn't coming "free". So perhaps some change in how this is done
>> would be in order?
>>
>> Ralph
>>
>> _______________________________________________
>> devel mailing list
>> de...@open-mpi.org
>> http://www.open-mpi.org/mailman/listinfo.cgi/devel
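
To make the selection rule and option 1 from the quoted message easier to follow, here is a minimal sketch in Python. It is not Open MPI code: the function names (select_pml, check_all_procs, check_against_rank0) are hypothetical, the selection rule is only the simplified version described in the thread (a user-specified module wins, otherwise cm if a non-psm MTL was selected, otherwise ob1), and "mx" is used purely as an illustrative MTL name.

```python
from typing import Optional, Sequence

DEFAULT_PML = "ob1"


def select_pml(user_choice: Optional[str], mtl: Optional[str]) -> str:
    """Simplified PML choice for one proc, as described in the thread."""
    if user_choice is not None:
        # A user-specified module (e.g. via -mca pml xyz) is propagated,
        # so every proc ends up with the same answer.
        return user_choice
    if mtl is not None and mtl != "psm":
        # cm is picked over ob1 when an MTL other than psm was selected.
        return "cm"
    return DEFAULT_PML


def check_all_procs(selections: Sequence[str]) -> bool:
    """Current scheme: every proc publishes its PML name and all are compared."""
    return all(name == selections[0] for name in selections)


def check_against_rank0(rank0_pml: str, my_pml: str) -> bool:
    """Option 1: only rank 0 publishes; each proc compares its own choice."""
    return my_pml == rank0_pml


# Hypothetical unmanaged (rsh) launch where the proc at index 2 lacks MTL support.
mtl_per_proc = ["mx", "mx", None, "mx"]
selections = [select_pml(None, mtl) for mtl in mtl_per_proc]

# Both schemes flag the same mismatch, but option 1 only needs rank 0's entry.
mismatch_full = not check_all_procs(selections)
mismatch_rank0 = not all(check_against_rank0(selections[0], s) for s in selections)
```

Under these assumptions the only way procs disagree is the heterogeneous case (some nodes with the MTL, some without), and a comparison against rank 0's single entry catches it just as the full exchange does.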