I've attached a solution that blocks the segfault without requiring any gyrations. Can someone explain why this isn't adequate?

An alternate solution would be to simply decrement opal_util_initialized in MPI_T_finalize rather than calling finalize itself. Either approach resolves the problem very simply.
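For anyone following along without the tree open, here is a toy model of a reference-counted init/finalize pair. It is only a sketch under stated assumptions: util_init, util_finalize, util_is_live and both static variables are hypothetical names standing in for the real counter and the state it guards, and none of this is the actual Open MPI code.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the opal_util_initialized counter. */
static int util_refcount = 0;
/* Hypothetical stand-in for the state the utility layer owns. */
static bool util_resources_live = false;

/* Take a reference; only the first caller sets anything up. */
int util_init(void)
{
    if (util_refcount++ == 0) {
        util_resources_live = true;
    }
    return 0;
}

/* Drop a reference; only the last caller tears anything down.
 * Returns -1 on a finalize with no matching init. */
int util_finalize(void)
{
    if (util_refcount == 0) {
        return -1;
    }
    if (--util_refcount == 0) {
        util_resources_live = false;
    }
    assert(util_refcount >= 0);
    return 0;
}

/* Report whether the guarded state is still set up. */
bool util_is_live(void)
{
    return util_resources_live;
}
```

With two callers holding a reference, the first util_finalize only drops the count and the guarded state stays live until the second one. That is the property both variants above rely on.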

Attachment: fix.diff
Description: Binary data

Attachment: mpit.c
Description: Binary data


On Jul 15, 2014, at 6:10 PM, Ralph Castain <r...@open-mpi.org> wrote:

I'm unsure where Intel's compilers sit on that list.

When you say it works except for reinit, are you saying that the only issue here is that MPI_T_Finalize is calling opal_finalize_util solely because of the valgrind cleanup? And if it didn't do that, we would leak but would otherwise be just fine?

Just checking my understanding. Looking at the code, that would certainly appear to be true due to the reference counter in there, which would prevent us from eventually cleaning up because the counter wouldn't reach zero. However, couldn't we resolve that by (a) having MPI_T_Init set a global flag indicating it was called, and then (b) in opal_finalize, check the flag and add another call to opal_finalize_util if the flag is set?

Seems like all we really need to do is ensure that the init/finalize calls match, and that is far easier to ensure than doing the rest of this stuff.
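Roughly what I have in mind for (a) and (b), as a sketch only. Every name here (tool_init, tool_finalize, lib_init, lib_finalize, util_refs and the two statics) is made up for illustration and stands in for the MPI_T and opal entry points; it is not the real code, and it assumes a single-threaded caller.

```c
#include <stdbool.h>

/* Hypothetical reference counter for the utility layer. */
static int util_refcount = 0;
/* (a) Hypothetical global flag: the tool interface took a reference. */
static bool tool_init_called = false;

static void util_init(void)
{
    util_refcount++;
}

static void util_finalize(void)
{
    if (util_refcount > 0) {
        util_refcount--;
    }
}

/* Tool-side init: take one reference the first time and remember it. */
void tool_init(void)
{
    if (!tool_init_called) {
        util_init();
        tool_init_called = true;
    }
}

/* Tool-side finalize: deliberately leaves its reference in place. */
void tool_finalize(void)
{
}

/* Library-side init: takes its own reference. */
void lib_init(void)
{
    util_init();
}

/* (b) Library-side finalize: drop our reference, then the tool's
 * outstanding one if the flag says it was taken. */
void lib_finalize(void)
{
    util_finalize();
    if (tool_init_called) {
        util_finalize();
        tool_init_called = false;
    }
}

/* Expose the counter so the balance can be checked. */
int util_refs(void)
{
    return util_refcount;
}
```

The point is only that the counter returns to zero at the final finalize regardless of whether the tool interface was used, so nothing is torn down early and nothing is leaked.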


On Jul 15, 2014, at 5:48 PM, George Bosilca <bosi...@icl.utk.edu> wrote:

Enforcing the portability of this sounds like a huge [almost impossible] mess, without a clean portable solution (more about this below). However, a few things should be considered:
- Except for reinit, Open MPI works without it! If we provide such a capability, it will be more of a convenience to keep valgrind happy than a necessity.
- If the constructor/destructor functionality is available, we explicitly control the order in which the shared libraries are opened and closed, since we control the dlopen/dlclose calls for most of them.

  George.

PS: Other cases about shared libraries constructor/destructor.



On Tue, Jul 15, 2014 at 8:06 PM, Paul Hargrove <phhargr...@lbl.gov> wrote:
The priority appears to have been added in gcc 4.3.

I also don't think the presence of the priority argument fixes anything...

An Open MPI code author cannot change the "priority" of a ctor or dtor in a precompiled third-party library (libpmi comes to mind).  Nor can one know what value the third party chose (in order to be higher or lower than theirs).  You cannot even be assured the third party didn't set the priority to INT_MIN or INT_MAX (or whatever).

That text also says nothing about dlopen() and dlclose(), which must be considered in Open MPI.

Before assuming constructor/destructor attributes are going to save the world, wash your dog, and pick up the dry cleaning, one should probably verify some minimal level of support on non-GNU toolchains, including vendor compilers (PGI, XLC, etc.) and system linkers (Darwin and Solaris).

-Paul


On Tue, Jul 15, 2014 at 4:52 PM, Joshua Ladd <jladd.m...@gmail.com> wrote:
According to http://gcc.gnu.org/onlinedocs/gcc/Function-Attributes.html

"constructor
 destructor
 constructor (priority)
 destructor (priority)
The constructor attribute causes the function to be called automatically before execution enters main (). Similarly, the destructor attribute causes the function to be called automatically after main () completes or exit () is called. Functions with these attributes are useful for initializing data that is used implicitly during the execution of the program.

You may provide an optional integer priority to control the order in which constructor and destructor functions are run. A constructor with a smaller priority number runs before a constructor with a larger priority number; the opposite relationship holds for destructors. So, if you have a constructor that allocates a resource and a destructor that deallocates the same resource, both functions typically have the same priority. The priorities for constructor and destructor functions are the same as those specified for namespace-scope C++ objects (see C++ Attributes).

These attributes are not currently implemented for Objective-C."
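A minimal sketch of what the quoted text describes, assuming GCC or Clang and that everything lives in a single linked image. The function and variable names are hypothetical, and the priorities start at 101 because GCC reserves 0 through 100 for the implementation. It says nothing about ordering relative to constructors in other shared libraries.

```c
#include <assert.h>
#include <stdio.h>

/* Records the order in which the constructors below ran. */
static int run_order[2];
static int run_count = 0;

/* Smaller priority number: runs first among the constructors. */
__attribute__((constructor(101)))
static void early_ctor(void)
{
    run_order[run_count++] = 101;
}

/* Larger priority number: runs after early_ctor. */
__attribute__((constructor(102)))
static void late_ctor(void)
{
    assert(run_count == 1); /* early_ctor has already run */
    run_order[run_count++] = 102;
}

/* Same priority as early_ctor, pairing setup with teardown;
 * runs after main() returns or exit() is called. */
__attribute__((destructor(101)))
static void matching_dtor(void)
{
    fputs("dtor 101: releasing what early_ctor set up\n", stderr);
}

/* Accessors so ordinary code can inspect what happened before main(). */
int ctor_run_count(void)
{
    return run_count;
}

int ctor_run_order(int i)
{
    return run_order[i];
}
```

By the time main() starts, both constructors have run in priority order, so ctor_run_order(0) is 101 and ctor_run_order(1) is 102.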




On Tue, Jul 15, 2014 at 5:20 PM, Paul Hargrove <phhargr...@lbl.gov> wrote:

On Tue, Jul 15, 2014 at 12:49 PM, Pritchard, Howard r <howa...@lanl.gov> wrote:
I don't think there's anything wrong with using ctor/dtors in shared libraries,
but one does need to make sure that these functions make no assumptions
about their ordering with respect to other ctors/dtors.

The ELF specification is clear that the order of execution of DT_INIT and DT_FINI entries is undefined.
The .ctors and .dtors sections typically used by the GNU toolchain are, I believe, not part of any formal linker specification.
So, I agree w/ Howard that one must take care not to assume anything about order.

-Paul


--
Paul H. Hargrove                          phhargr...@lbl.gov
Future Technologies Group
Computer and Data Sciences Department     Tel: +1-510-495-2352
Lawrence Berkeley National Laboratory     Fax: +1-510-486-6900

_______________________________________________
devel mailing list
de...@open-mpi.org
Subscription: http://www.open-mpi.org/mailman/listinfo.cgi/devel
Link to this post: http://www.open-mpi.org/community/lists/devel/2014/07/15153.php



