No problem Lenny, I am looking at this now.

--td

Lenny Verkhovsky wrote:

I would really like to help, but I am not sure how much time I will have in the very near future (we are expecting the delivery of a baby girl).

On 8/6/08, *Open MPI* <b...@open-mpi.org <mailto:b...@open-mpi.org>> wrote:

    #1435: Crash on PPC (with SMT off) when using mpi_paffinity alone
    -------------------+--------------------------------------------------------
    Reporter:  jnysal  |        Owner:  rhc
        Type:  defect  |       Status:  new
    Priority:  major   |    Milestone:  Open MPI 1.3
     Version:          |   Resolution:
    Keywords:          |
    -------------------+--------------------------------------------------------
    Changes (by rhc):

      * owner:  jnysal => rhc
      * status:  assigned => new


    Comment:

      Several of us have had a telecon on this subject, and have a proposed
      solution:

      The real root of the problem here is that we never clearly delineated
      between physical and logical processors in OMPI. Instead, there was an
      implicit assumption that the two were one-and-the-same. Thus, if a user
      specified a slot_list, we just directly dumped that into the paffinity
      subsystem.

      Unfortunately, when we use paffinity_alone and automatically map the
      ranks to processors, we again just passed the info to the paffinity
      subsystem, without clearly indicating whether this was a physical
      processor or a logical processor.
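
To illustrate the distinction, here is a minimal C sketch. It is not OMPI
code: the `cpu_map_t` type and both helper names are hypothetical, and the
table of physical ids is assumed to be supplied by the caller. How a real
paffinity component would discover that table is not covered here.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical table: the physical ids of the usable processors, in
 * ascending order. The index into the table is the logical id. */
typedef struct {
    const int *physical_ids;
    size_t count;
} cpu_map_t;

/* Return the physical id for a logical id, or -1 if it is out of range. */
static int logical_to_physical(const cpu_map_t *map, int logical_id)
{
    assert(map != NULL);
    if (logical_id < 0 || (size_t)logical_id >= map->count) {
        return -1;
    }
    return map->physical_ids[logical_id];
}

/* Return the logical id for a physical id, or -1 if that physical
 * processor is not in the table. */
static int physical_to_logical(const cpu_map_t *map, int physical_id)
{
    assert(map != NULL);
    for (size_t i = 0; i < map->count; ++i) {
        if (map->physical_ids[i] == physical_id) {
            return (int)i;
        }
    }
    return -1;
}
```

With a made-up table of {0, 2, 4, 6}, logical id 1 translates to physical
id 2, while physical id 1 has no logical id at all. Treating the two
numberings as interchangeable only works when the table happens to be
0, 1, 2, ...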

      Our feeling is that we need to cleanly handle both physical and logical
      processor specifications. Accordingly, we propose to do the following:

      1. modify the opal_paffinity_base_get API to add a boolean flag
      indicating whether we want logical (true) or physical (false)
      processor ids in the returned cpumask

      2. modify the opal_paffinity_base_set API to add a boolean flag
      indicating whether we provided logical (true) or physical (false)
      processor ids in the cpumask

      3. modify the opal_paffinity linux and solaris components to do the
      necessary mapping to handle the two cases so that we bind or return
      data according to the new flag

      4. modify ompi_mpi_init so that mpi_paffinity_alone indicates the
      automatic binding is to be done on the basis of logical processor ids

      5. modify the syntax of the slot_list mca param so that it defaults to
      logical processor ids, but allows the user to prepend the
      specification with a "P" or "p" to indicate these are physical
      processor ids. This will also be applied to the parsing of the
      rank_file mapping file.

      6. modify the places that utilize that param to handle the new syntax,
      including the opal_paffinity_base_slot_list_set and its companion
      functions

      7. update the documentation to reflect the changed syntax
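
The prefix rule from item 5 could be sketched in C roughly as follows. This
is only an illustration: `slot_list_strip_prefix` is a hypothetical helper,
not an existing OMPI function, and it assumes the "P" or "p" comes
immediately before the rest of the specification, which the proposal does
not spell out. The boolean follows the convention of items 1 and 2
(true = logical, false = physical).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical helper: look for the optional "P"/"p" prefix on a
 * slot_list string.
 *
 * On return, *is_logical is false if the prefix was present (physical
 * ids) and true otherwise (logical ids, the proposed default). The
 * return value points at the remainder of the specification, which is
 * left unparsed here. */
static const char *slot_list_strip_prefix(const char *spec, bool *is_logical)
{
    assert(spec != NULL && is_logical != NULL);
    if (spec[0] == 'P' || spec[0] == 'p') {
        *is_logical = false;
        return spec + 1;
    }
    *is_logical = true;
    return spec;
}
```

A caller would hand the returned remainder to the existing slot_list parser
and pass the flag along to whatever sets the binding, so that the logical or
physical interpretation is decided in one place.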

      Terry has volunteered to modify the paffinity components. Ralph will do
      the ORTE-level stuff and mpi_init, and likely the slot_list stuff too
      (unless Lenny has time and is willing to help there?). This will be
      done on a new Hg branch that Ralph will create - will post the access
      info here later today.

      Any comments? Please post soon so we don't go too far down this path
      before we hear them!


    --
    Ticket URL:
    <https://svn.open-mpi.org/trac/ompi/ticket/1435#comment:18>

    Open MPI <http://www.open-mpi.org/>


    _______________________________________________
    bugs mailing list
    b...@open-mpi.org <mailto:b...@open-mpi.org>
    http://www.open-mpi.org/mailman/listinfo.cgi/bugs


------------------------------------------------------------------------

_______________________________________________
devel mailing list
de...@open-mpi.org
http://www.open-mpi.org/mailman/listinfo.cgi/devel
