Re: [OMPI devel] C/R code: opal_list_item_destruct: Assertion
That works. Thanks for your fix.

On Sun, Dec 22, 2013 at 12:23:44AM +0100, George Bosilca wrote:
> Adrian,
>
> Yes, your patch is correct. However, I noticed that each framework cleans its
> modules differently, so I tried to enforce some level of consistency. Please
> try r30045 and let me know if it fixes your issue.
>
> George.
>
> ___
> devel mailing list
> de...@open-mpi.org
> http://www.open-mpi.org/mailman/listinfo.cgi/devel
Re: [OMPI devel] C/R code: opal_list_item_destruct: Assertion
Adrian,

Yes, your patch is correct. However, I noticed that each framework cleans its
modules differently, so I tried to enforce some level of consistency. Please
try r30045 and let me know if it fixes your issue.

George.

On Dec 21, 2013, at 22:05, Adrian Reber wrote:

> I fixed it like this:
>
> + opal_list_remove_item(&orte_rml_base_framework.framework_components, item);
>
> Is this the correct way to solve an error like this?
Re: [OMPI devel] C/R code: opal_list_item_destruct: Assertion
Should be okay.

On Dec 21, 2013, at 1:05 PM, Adrian Reber wrote:

> I fixed it like this:
>
> + opal_list_remove_item(&orte_rml_base_framework.framework_components, item);
>
> Is this the correct way to solve an error like this?
[OMPI devel] C/R code: opal_list_item_destruct: Assertion
Trying to run Open MPI with C/R enabled I get the following error
with --enable-debug:

[dcbz:20360] orte_rml_base_select: initializing rml component oob
[dcbz:20360] orte_rml_base_select: initializing rml component ftrm
[dcbz:20360] orte_rml_base_select: module ftrm unloaded
orterun: ../../opal/class/opal_list.c:69: opal_list_item_destruct: Assertion `0 == item->opal_list_item_refcount' failed.
[dcbz:20360] *** Process received signal ***
[dcbz:20360] Signal: Aborted (6)
[dcbz:20360] Signal code: (-6)

I fixed it like this:

diff --git a/orte/mca/rml/base/rml_base_frame.c b/orte/mca/rml/base/rml_base_frame.c
index 8759180..968884f 100644
--- a/orte/mca/rml/base/rml_base_frame.c
+++ b/orte/mca/rml/base/rml_base_frame.c
@@ -181,6 +181,7 @@ int orte_rml_base_select(void)
                                 component->rml_version.mca_component_name);
 
             mca_base_component_repository_release((mca_base_component_t *) component);
+            opal_list_remove_item(&orte_rml_base_framework.framework_components, item);
             OBJ_RELEASE(item);
         }
         item = next;

Is this the correct way to solve an error like this? And is this the
correct place?

Adrian
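
For readers following the thread: the assertion in the trace fires when a list item is destructed while its debug counter says it is still on a list, which is why removing the item before releasing it makes the abort go away. The sketch below models that ordering in plain C so it can be compiled without the Open MPI tree. It is not the OPAL implementation: `item_t`, `list_t`, `membership_count` (standing in for `opal_list_item_refcount`), and every helper function are hypothetical names chosen for illustration, and the "drop odd values" rule is just a placeholder for "component was not selected".

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for a list item. membership_count plays the role of
 * the debug-only counter checked by the assertion in the trace above. */
typedef struct item {
    struct item *prev, *next;
    int membership_count;   /* number of lists this item is currently on */
    int value;
} item_t;

/* Circular doubly linked list with a sentinel node (an assumption made to
 * keep the sketch short, not a claim about OPAL internals). */
typedef struct {
    item_t sentinel;
    size_t length;
} list_t;

static void list_init(list_t *l) {
    l->sentinel.prev = l->sentinel.next = &l->sentinel;
    l->sentinel.membership_count = 0;
    l->sentinel.value = 0;
    l->length = 0;
}

static item_t *item_new(int value) {
    item_t *it = calloc(1, sizeof(*it));
    if (NULL == it) {
        abort();
    }
    it->value = value;
    return it;
}

static void list_append(list_t *l, item_t *it) {
    it->prev = l->sentinel.prev;
    it->next = &l->sentinel;
    l->sentinel.prev->next = it;
    l->sentinel.prev = it;
    it->membership_count++;          /* item is now on a list */
    l->length++;
}

static void list_remove(list_t *l, item_t *it) {
    it->prev->next = it->next;
    it->next->prev = it->prev;
    it->prev = it->next = NULL;
    it->membership_count--;          /* item is no longer on the list */
    l->length--;
}

/* Destroying an item that is still linked would leave dangling pointers in
 * the list, so the destructor insists the item has been removed first. */
static void item_destroy(item_t *it) {
    assert(0 == it->membership_count);
    free(it);
}

/* Same loop shape as the patched code: remember the successor, unlink the
 * rejected item, and only then destroy it. Returns how many were dropped. */
static size_t drop_odd_items(list_t *l) {
    size_t dropped = 0;
    item_t *it = l->sentinel.next;
    while (it != &l->sentinel) {
        item_t *next = it->next;     /* saved before the item is unlinked */
        if (0 != it->value % 2) {
            list_remove(l, it);      /* the step the patch adds */
            item_destroy(it);
            dropped++;
        }
        it = next;
    }
    return dropped;
}
```

Deleting the `list_remove` call in `drop_odd_items` makes the assertion in `item_destroy` fail on the first odd item when assertions are enabled, which mirrors the abort seen with --enable-debug.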