As they don't even compile, why are we keeping them around?

  George.


On Wed, Sep 16, 2015 at 12:05 PM, Nathan Hjelm <hje...@lanl.gov> wrote:

>
> iboffload and bfo are opal ignored by default. Neither exists in the
> release branch.
>
> -Nathan
>
> On Wed, Sep 16, 2015 at 12:02:29PM -0400, George Bosilca wrote:
> >    While looking into a possible fix for this problem, we should also
> >    clean up in the trunk the leftovers from the OMPI_FREE_LIST.
> >    $ find . -name "*.[ch]" -exec grep -Hn OMPI_FREE_LIST_GET_MT {} +
> >    ./opal/mca/btl/usnic/btl_usnic_compat.h:161: OMPI_FREE_LIST_GET_MT(list, (item))
> >    ./ompi/mca/pml/bfo/pml_bfo_recvreq.h:89: OMPI_FREE_LIST_GET_MT(&mca_pml_base_recv_requests, item); \
> >    ./ompi/mca/bcol/iboffload/bcol_iboffload_task.h:149: OMPI_FREE_LIST_GET_MT(&cm->tasks_free, item);
> >    ./ompi/mca/bcol/iboffload/bcol_iboffload_task.h:206: OMPI_FREE_LIST_GET_MT(task_list, item);
> >    ./ompi/mca/bcol/iboffload/bcol_iboffload_frag.c:107: OMPI_FREE_LIST_GET_MT(&device->frags_free[qp_index], item);
> >    ./ompi/mca/bcol/iboffload/bcol_iboffload_frag.c:146: OMPI_FREE_LIST_GET_MT(&device->frags_free[qp_index], item);
> >    ./ompi/mca/bcol/iboffload/bcol_iboffload_frag.c:208: OMPI_FREE_LIST_GET_MT(&iboffload->device->frags_free[qp_index], item);
> >    ./ompi/mca/bcol/iboffload/bcol_iboffload_qp_info.c:156: OMPI_FREE_LIST_GET_MT(&device->frags_free[qp_index], item);
> >    ./ompi/mca/bcol/iboffload/bcol_iboffload_collfrag.h:130: OMPI_FREE_LIST_GET_MT(&cm->collfrags_free, item);
> >    ./ompi/mca/bcol/iboffload/bcol_iboffload_frag.h:115: OMPI_FREE_LIST_GET_MT(&cm->ml_frags_free, item);
> >    I wonder how these are even compiling ...
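> >    For the eventual cleanup, a minimal sketch of what a ported call site
> >    might look like against the newer opal_free_list API (assuming the
> >    opal_free_list_get_mt() inline that replaced the macro; the tasks_free
> >    list name is just taken from the grep output above):
> >
> >        /* assumes cm->tasks_free has been converted to an opal_free_list_t */
> >        opal_free_list_item_t *item;
> >
> >        /* non-blocking get: returns NULL when the list is exhausted
> >         * and cannot be grown any further */
> >        item = opal_free_list_get_mt (&cm->tasks_free);
> >        if (OPAL_UNLIKELY(NULL == item)) {
> >            return OMPI_ERR_TEMP_OUT_OF_RESOURCE;
> >        }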
> >      George.
> >    On Wed, Sep 16, 2015 at 11:59 AM, George Bosilca <bosi...@icl.utk.edu>
> >    wrote:
> >
> >      Alexey,
> >
> >      This is not necessarily the fix for all cases. Most of the internal
> >      uses of the free_list can easily accommodate the fact that no more
> >      elements are available. Based on your description of the problem, I
> >      would assume you encounter it once MCA_PML_OB1_RECV_REQUEST_ALLOC is
> >      called. In this particular case the problem is the fact that we call
> >      OMPI_FREE_LIST_GET_MT and that the upper level is unable to correctly
> >      deal with the case where the returned item is NULL. Here the real fix
> >      is to use the blocking version of the free_list accessor (similar to
> >      the send case), OMPI_FREE_LIST_WAIT_MT.
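> >      A minimal sketch of the intended difference (hypothetical caller,
> >      and I am assuming the WAIT macro takes the same two arguments as
> >      the GET macro):
> >
> >          ompi_free_list_item_t *item;
> >
> >          /* non-blocking accessor: item can legitimately come back NULL
> >           * when the list is exhausted, and the caller must handle it */
> >          OMPI_FREE_LIST_GET_MT(&mca_pml_base_recv_requests, item);
> >
> >          /* blocking accessor: retries/waits until an item becomes
> >           * available, so the caller never observes NULL */
> >          OMPI_FREE_LIST_WAIT_MT(&mca_pml_base_recv_requests, item);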
> >      It is also possible that I misunderstood your problem. If the
> >      solution above doesn't work, can you describe exactly where the NULL
> >      return of OMPI_FREE_LIST_GET_MT is creating an issue?
> >
> >      George.
> >      On Wed, Sep 16, 2015 at 9:03 AM, Aleksej Ryzhih
> >      <avryzh...@compcenter.org> wrote:
> >
> >        Hi all,
> >
> >        We experimented with an MPI+OpenMP hybrid application
> >        (MPI_THREAD_MULTIPLE support level) where several threads submit a
> >        lot of MPI_Irecv() requests simultaneously, and encountered an
> >        intermittent OMPI_ERR_TEMP_OUT_OF_RESOURCE failure after
> >        MCA_PML_OB1_RECV_REQUEST_ALLOC() because OMPI_FREE_LIST_GET_MT()
> >        returned NULL. Investigating this bug, we found that the thread
> >        calling ompi_free_list_grow() sometimes has no free items left in
> >        the LIFO list on exit, because other threads retrieved all the new
> >        items via opal_atomic_lifo_pop().
> >
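> >        A minimal sketch of the triggering pattern (a hypothetical
> >        reproducer with illustrative sizes; whether it actually fails
> >        depends on timing and on the free-list limits):
> >
> >            #include <mpi.h>
> >            #include <omp.h>
> >
> >            #define NREQ 1024   /* illustrative: enough to force list growth */
> >
> >            int main(int argc, char **argv)
> >            {
> >                int provided, t, i;
> >                static MPI_Request reqs[4][NREQ];
> >                static int bufs[4][NREQ];
> >
> >                MPI_Init_thread(&argc, &argv, MPI_THREAD_MULTIPLE, &provided);
> >                if (provided < MPI_THREAD_MULTIPLE) {
> >                    MPI_Abort(MPI_COMM_WORLD, 1);
> >                }
> >
> >                /* several threads flood the recv-request free list at once */
> >                #pragma omp parallel private(t, i) num_threads(4)
> >                {
> >                    t = omp_get_thread_num();
> >                    for (i = 0; i < NREQ; i++)
> >                        MPI_Irecv(&bufs[t][i], 1, MPI_INT, MPI_ANY_SOURCE,
> >                                  t, MPI_COMM_WORLD, &reqs[t][i]);
> >                    /* cancel so the example terminates without senders */
> >                    for (i = 0; i < NREQ; i++)
> >                        MPI_Cancel(&reqs[t][i]);
> >                    MPI_Waitall(NREQ, reqs[t], MPI_STATUSES_IGNORE);
> >                }
> >
> >                MPI_Finalize();
> >                return 0;
> >            }
> >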
> >        So we suggest changing OMPI_FREE_LIST_GET_MT() as below:
> >
> >        #define OMPI_FREE_LIST_GET_MT(fl, item)                                       \
> >        {                                                                             \
> >            item = (ompi_free_list_item_t*) opal_atomic_lifo_pop(&((fl)->super));     \
> >            if( OPAL_UNLIKELY(NULL == item) ) {                                       \
> >                if( opal_using_threads() ) {                                          \
> >                    int rc;                                                           \
> >                    opal_mutex_lock(&((fl)->fl_lock));                                \
> >                    do {                                                              \
> >                        rc = ompi_free_list_grow((fl), (fl)->fl_num_per_alloc);       \
> >                        if( OPAL_UNLIKELY(rc != OMPI_SUCCESS) ) break;                \
> >                        item = (ompi_free_list_item_t*)                               \
> >                            opal_atomic_lifo_pop(&((fl)->super));                     \
> >                    } while (!item);                                                  \
> >                    opal_mutex_unlock(&((fl)->fl_lock));                              \
> >                } else {                                                              \
> >                    ompi_free_list_grow((fl), (fl)->fl_num_per_alloc);                \
> >                    item = (ompi_free_list_item_t*)                                   \
> >                        opal_atomic_lifo_pop(&((fl)->super));                         \
> >                }   /* opal_using_threads() */                                        \
> >            }   /* NULL == item */                                                    \
> >        }
> >
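> >        The key point of this change is that the thread holding fl_lock
> >        keeps alternating grow and pop until it either obtains an item or
> >        ompi_free_list_grow() itself fails (e.g. out of memory, or the
> >        list has already reached its maximum size), so it can no longer
> >        return NULL merely because concurrent threads drained the freshly
> >        grown items.
> >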
> >        Another workaround is to increase the value of the
> >        pml_ob1_free_list_inc MCA parameter.
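> >        For example (the values, and the additional free_list_max
> >        parameter, are illustrative and should be tuned per application):
> >
> >            mpirun --mca pml_ob1_free_list_inc 256 \
> >                   --mca pml_ob1_free_list_max 4096 ./hybrid_app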
> >
> >
> >
> >        Regards,
> >
> >        Alexey
> >
> >
> >
