I would really like to help, but I am not sure how much time I will have in the very near future (we are expecting a baby girl).
On 8/6/08, Open MPI <b...@open-mpi.org> wrote:
> #1435: Crash on PPC (with SMT off) when using mpi_paffinity alone
> -------------------+--------------------------------------------------------
>  Reporter:  jnysal  |  Owner:       rhc
>  Type:      defect  |  Status:      new
>  Priority:  major   |  Milestone:   Open MPI 1.3
>  Version:           |  Resolution:
>  Keywords:          |
> -------------------+--------------------------------------------------------
> Changes (by rhc):
>
>  * owner: jnysal => rhc
>  * status: assigned => new
>
> Comment:
>
> Several of us have had a telecon on this subject, and have a proposed
> solution:
>
> The real root of the problem here is that we never clearly delineated
> between physical and logical processors in OMPI. Instead, there was an
> implicit assumption that the two were one-and-the-same. Thus, if a user
> specified a slot_list, we just directly dumped that into the paffinity
> subsystem.
>
> Unfortunately, when we use paffinity_alone and automatically map the ranks
> to processors, we again just passed the info the paffinity subsystem -
> without clearly indicating whether this was a physical processor or
> logical processor.
>
> Our feeling is that we need to cleanly handle both physical and logical
> processor specifications. Accordingly, we propose to do the following:
>
> 1. modify the opal_paffinity_base_get API to add a boolean flag indicating
> we want logical (true) or physical (false) processor id's in the returned
> cpumask
>
> 2. modify the opal_paffinity_base_set API to add a boolean flag indicating
> we provided logical (true) or physical (false) processor id's in the
> cpumask
>
> 3. modify the opal_paffinity linux and solaris components to do the
> necessary mapping to handle the two cases so that we bind or return data
> according to the new flag
>
> 4. modify ompi_mpi_init so that mpi_paffinity_alone indicates the
> automatic binding is to be done on the basis of logical processor id's
>
> 5. modify the syntax of the slot_list mca param so that it defaults to
> logical processor ids, but allows the user to prepend the specification
> with a "P" or "p" to indicate these are physical processor id's. This will
> also be applied to the parsing of the rank_file mapping file.
>
> 6. modify the places that utilize that param to handle the new syntax,
> including the opal_paffinity_base_slot_list_set and its companion
> functions
>
> 7. update the documentation to reflect the changed syntax
>
> Terry has volunteered to modify the paffinity components. Ralph will do
> the ORTE-level stuff and mpi_init, and likely the slot_list stuff too
> (unless Lenny has time and is willing to help there?). This will be done
> on a new Hg branch that Ralph will create - will post the access info here
> later today.
>
> Any comments? Please post soon so we don't go too far down path before we
> hear them!
>
> --
> Ticket URL: <https://svn.open-mpi.org/trac/ompi/ticket/1435#comment:18>
> Open MPI <http://www.open-mpi.org/>
>
> _______________________________________________
> bugs mailing list
> b...@open-mpi.org
> http://www.open-mpi.org/mailman/listinfo.cgi/bugs
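
To make items 3 and 5 of the proposal concrete, here is a rough sketch of how the "P"/"p" prefix and the logical-to-physical mapping might behave. It is only an illustration in Python, not the OMPI implementation: the function names parse_slot_list and resolve_physical are hypothetical, the slot list is simplified to a comma-separated list of ids and ranges (the real slot_list and rank_file syntax is richer), and the table of online physical ids is made-up example data.

```python
def parse_slot_list(spec):
    """Parse a simplified slot list such as "0,2-3" or "P0,2-3".

    Returns (is_physical, ids). Ids are treated as logical unless the
    specification starts with "P" or "p", as in item 5 of the proposal.
    Hypothetical helper: this is not an OMPI function.
    """
    spec = spec.strip()
    is_physical = spec[:1] in ("P", "p")
    if is_physical:
        spec = spec[1:]
    ids = []
    for item in spec.split(","):
        item = item.strip()
        if not item:
            raise ValueError("empty entry in slot list")
        if "-" in item:
            # Inclusive range, e.g. "2-3" means ids 2 and 3.
            lo, hi = (int(x) for x in item.split("-", 1))
            if lo > hi:
                raise ValueError("descending range in slot list: " + item)
            ids.extend(range(lo, hi + 1))
        else:
            ids.append(int(item))
    return is_physical, ids


def resolve_physical(is_physical, ids, online_physical_ids):
    """Translate ids to physical processor ids.

    online_physical_ids[i] is assumed to hold the physical id of logical
    processor i. Physical ids are passed through unchanged.
    """
    if is_physical:
        return list(ids)
    return [online_physical_ids[i] for i in ids]


if __name__ == "__main__":
    # Example data only: a machine whose online physical ids are not contiguous.
    online = [0, 2, 4, 6]
    for spec in ("0,1", "P0,2", "1-2"):
        is_physical, ids = parse_slot_list(spec)
        print(spec, "->", resolve_physical(is_physical, ids, online))
```

With the example table above, the logical list "0,1" and the physical list "P0,2" end up bound to the same two processors, which is the distinction the proposal is trying to make explicit.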