The formatting of the code got all messed up. Please send a diff and I will take a look. ompi_free_list no longer exists in master or the next release branch, but the change may be worthwhile for the opal_free_list code.
-Nathan

On Wed, Sep 16, 2015 at 04:03:44PM +0300, Алексей Рыжих wrote:
> Hi all,
>
> We experimented with an MPI+OpenMP hybrid application (MPI_THREAD_MULTIPLE
> support level) where several threads submit a lot of MPI_Irecv()
> requests simultaneously, and encountered an intermittent bug:
> OMPI_ERR_TEMP_OUT_OF_RESOURCE after MCA_PML_OB1_RECV_REQUEST_ALLOC(),
> because OMPI_FREE_LIST_GET_MT() returned NULL. Investigating this bug,
> we found that sometimes the thread calling ompi_free_list_grow() doesn't
> have any free items in the LIFO list at exit because other threads retrieved
> all the new items via opal_atomic_lifo_pop().
>
> So we suggest changing OMPI_FREE_LIST_GET_MT() as below:
>
> #define OMPI_FREE_LIST_GET_MT(fl, item)                                       \
>     {                                                                         \
>         item = (ompi_free_list_item_t*) opal_atomic_lifo_pop(&((fl)->super)); \
>         if( OPAL_UNLIKELY(NULL == item) ) {                                   \
>             if(opal_using_threads()) {                                        \
>                 int rc;                                                       \
>                 opal_mutex_lock(&((fl)->fl_lock));                            \
>                 do                                                            \
>                 {                                                             \
>                     rc = ompi_free_list_grow((fl), (fl)->fl_num_per_alloc);   \
>                     if( OPAL_UNLIKELY(rc != OMPI_SUCCESS)) break;             \
>                     item = (ompi_free_list_item_t*)                           \
>                         opal_atomic_lifo_pop(&((fl)->super));                 \
>                 } while (!item);                                              \
>                 opal_mutex_unlock(&((fl)->fl_lock));                          \
>             } else {                                                          \
>                 ompi_free_list_grow((fl), (fl)->fl_num_per_alloc);            \
>                 item = (ompi_free_list_item_t*)                               \
>                     opal_atomic_lifo_pop(&((fl)->super));                     \
>             } /* opal_using_threads() */                                      \
>         } /* NULL == item */                                                  \
>     }
>
> Another workaround is to increase the value of the pml_ob1_free_list_inc
> parameter.
>
> Regards,
> Alexey
>
> _______________________________________________
> devel mailing list
> de...@open-mpi.org
> Subscription: http://www.open-mpi.org/mailman/listinfo.cgi/devel
> Link to this post:
> http://www.open-mpi.org/community/lists/devel/2015/09/18039.php
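
For reference, here is a minimal standalone sketch of the grow-and-retry idea from the quoted proposal. It is not the Open MPI code: free_list_t, item_t, fl_init, fl_grow, fl_get, lifo_push and lifo_pop are all hypothetical names, a mutex-protected stack stands in for the atomic LIFO, and the max_items limit is an assumption added so that growth can fail. Cleanup of allocated items is left out.

```c
#include <assert.h>   /* not used by the sketch itself; handy for callers */
#include <pthread.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-ins; none of these names are the Open MPI API. */
typedef struct item { struct item *next; } item_t;

typedef struct {
    item_t *head;               /* LIFO of free items */
    pthread_mutex_t lifo_lock;  /* protects head; stands in for the atomic LIFO */
    pthread_mutex_t grow_lock;  /* serializes growth, in the role of fl_lock */
    size_t num_per_alloc;       /* items added per growth step */
    size_t allocated;           /* items created so far (guarded by grow_lock) */
    size_t max_items;           /* assumed growth limit; 0 means unlimited */
} free_list_t;

void fl_init(free_list_t *fl, size_t num_per_alloc, size_t max_items)
{
    fl->head = NULL;
    pthread_mutex_init(&fl->lifo_lock, NULL);
    pthread_mutex_init(&fl->grow_lock, NULL);
    fl->num_per_alloc = num_per_alloc;
    fl->allocated = 0;
    fl->max_items = max_items;
}

void lifo_push(free_list_t *fl, item_t *it)
{
    pthread_mutex_lock(&fl->lifo_lock);
    it->next = fl->head;
    fl->head = it;
    pthread_mutex_unlock(&fl->lifo_lock);
}

/* Returns NULL when the list is empty. */
item_t *lifo_pop(free_list_t *fl)
{
    pthread_mutex_lock(&fl->lifo_lock);
    item_t *it = fl->head;
    if (it != NULL) {
        fl->head = it->next;
    }
    pthread_mutex_unlock(&fl->lifo_lock);
    return it;
}

/* Caller must hold grow_lock. Returns 0 on success, -1 on failure. */
int fl_grow(free_list_t *fl)
{
    if (fl->num_per_alloc == 0) {
        return -1;  /* nothing would be added; avoid an endless retry */
    }
    if (fl->max_items != 0 &&
        fl->allocated + fl->num_per_alloc > fl->max_items) {
        return -1;  /* assumed limit reached */
    }
    for (size_t i = 0; i < fl->num_per_alloc; i++) {
        item_t *it = malloc(sizeof *it);
        if (it == NULL) {
            return -1;
        }
        lifo_push(fl, it);
        fl->allocated++;
    }
    return 0;
}

/* Fast path: plain pop. Slow path: grow under grow_lock and keep retrying,
 * because threads on the fast path can drain the new items before the
 * growing thread pops one for itself. */
item_t *fl_get(free_list_t *fl)
{
    item_t *it = lifo_pop(fl);
    if (it != NULL) {
        return it;
    }
    pthread_mutex_lock(&fl->grow_lock);
    do {
        if (fl_grow(fl) != 0) {
            break;  /* growth failed; the caller sees NULL */
        }
        it = lifo_pop(fl);
    } while (it == NULL);
    pthread_mutex_unlock(&fl->grow_lock);
    return it;
}
```

The growth lock does not stop other threads from popping, since the fast path never takes it. That is why a single grow followed by a single pop can still come back empty, and why the loop only exits once an item is obtained or growth fails.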