Edgar,

The restriction you are facing doesn't come from Open MPI itself; it comes from the default behavior of dlopen when it loads the .so files. As we do not force the RTLD_GLOBAL flag, the scope of our modules is local (RTLD_LOCAL), which means that the symbols defined in one module are not made available to resolve references in subsequently loaded libraries.
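To make the scope rule concrete, here is a minimal sketch of the two ways a plugin can be opened. dlopen, dlsym and dlerror are the standard POSIX calls; the helper names (component_flags, open_component, find_symbol) are made up for illustration and are not Open MPI functions.

```c
#include <assert.h>
#include <dlfcn.h>
#include <stdio.h>

/* Hypothetical helper: pick the dlopen flags for a plugin.
 * RTLD_LOCAL is the scope you get when RTLD_GLOBAL is not requested. */
int component_flags(int export_symbols)
{
    return RTLD_NOW | (export_symbols ? RTLD_GLOBAL : RTLD_LOCAL);
}

/* Hypothetical helper: open a plugin with the chosen scope.
 * With export_symbols == 0 the plugin's symbols cannot satisfy undefined
 * references in libraries that are loaded afterwards. */
void *open_component(const char *path, int export_symbols)
{
    assert(path != NULL);
    void *handle = dlopen(path, component_flags(export_symbols));
    if (handle == NULL) {
        fprintf(stderr, "dlopen(%s) failed: %s\n", path, dlerror());
    }
    return handle;
}

/* Looking a symbol up through the handle works in either scope;
 * only implicit resolution from other libraries is affected. */
void *find_symbol(void *handle, const char *name)
{
    dlerror();                  /* clear any stale error state */
    return dlsym(handle, name);
}
```

With local scope, a second plugin that carries an undefined reference to a function in the first one fails at load time with exactly the kind of "undefined symbol" error you reported, even though the first plugin is already in memory.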
On Wed, Nov 26, 2014 at 11:27 AM, Ralph Castain <r...@open-mpi.org> wrote:

> > On Nov 26, 2014, at 7:16 AM, Edgar Gabriel <gabr...@cs.uh.edu> wrote:
> >
> > ok, so I thought about it a bit, and while I am still baffled by the actual outcome and the missing symbol (for the main reason that the function of the fcoll component is being called from the ompio module, so the function of the ompio that was called from the fcoll component is guaranteed to be loaded, and does have the proper OMPI_DECLSPEC), I will do some restructuring of the code to handle that.
> >
> > As an explanation on why there are so many functions in ompio that are being called from the sub-frameworks directly, ompio is more or less the glue between all the other frameworks, and contains a lot of the code that is jointly used by the fbtl, fcoll and the sharedfp components (fs to a lesser extent as well).
> >
> > Before I start to move code around however, just want to confirm two things:
> >
> > 1. I can move some of the functionality of ompio to the base of various frameworks (fcoll, fbtl and io). Just want to confirm that this will work, e.g. I can call without restrictions a function of the fcoll base from an fbtl or the io component.
>
> Yes - the base functions of any framework are contained in the core library and thus always available.

These functions will be available to any module in the application, and will increase the size of the main Open MPI library. We had similar problems in the PML V, and we decided to try to minimize the increase in size of the main library. Thus, instead of moving everything into the base, we added a structure in the base that contains pointers to all the functions we would need. This structure is only initialized when our main module is loaded, and all sub-modules use this structure to get access to the pointers provided.

  George.

> > 2. I will have to extend the io framework interfaces a bit (I will try to minimize the number of new functions as much as I can), but those function pointers will be NULL for ROMIO. Just want to make sure this is ok with everybody.
>
> I’ll have to let others chime in here, but that would seem to fit the OMPI architecture.
>
> > Thanks
> > Edgar
> >
> > On 11/25/2014 11:43 AM, Ralph Castain wrote:
> >>
> >>> On Nov 25, 2014, at 9:36 AM, Edgar Gabriel <gabr...@cs.uh.edu> wrote:
> >>>
> >>> On 11/25/2014 11:31 AM, Ralph Castain wrote:
> >>>>
> >>>>> On Nov 25, 2014, at 8:24 AM, Edgar Gabriel <gabr...@cs.uh.edu> wrote:
> >>>>>
> >>>>> On 11/25/2014 10:18 AM, Ralph Castain wrote:
> >>>>>> Hmmm…no, nothing has changed with regard to declspec that I know about. I’ll ask the obvious things to check:
> >>>>>>
> >>>>>> * does that component have the proper include to find this function? Could be that it used to be found thru some chain, but the chain is now broken and it needs to be directly included
> >>>>>
> >>>>> header is included, I double checked.
> >>>>>
> >>>>>> * is that function in the base code, or down in a component? If the latter, then that’s a problem, but I’m assuming you didn’t make that mistake.
> >>>>>
> >>>>> I am not sure what you mean. The function is in a component, but I am not aware that it is illegal to call a function of a component from another component.
> >>>>
> >>>> Of course that is illegal - you can only access a function via the framework interface, not directly. You have no way of knowing that the other component has been loaded. Doing it directly violates the abstraction rules.
> >>>
> >>> well, ok. I know that the other component has been loaded because that component triggered the initialization of these sub-frameworks.
> >>
> >> I think we’ve seen that before, and run into problems with that approach (i.e., components calling framework opens).
> >>
> >>> I can move that functionality to the base, however, none of the 20+ functions are required for the other components of the io framework (i.e. ROMIO). So I would basically add functionality required for one component only into the base.
> >>
> >> Sounds like you’ve got an abstraction problem. If the fcoll component requires certain functions from another framework, then the framework should be exposing those APIs. If ROMIO doesn’t provide them, then it needs to return an error if someone attempts to call it.
> >>
> >> You are welcome to bring this up on next week’s call if you like. IIRC, this has come up before when people have tried these hard links between components. Maybe someone else will have a better solution, but it just seems to me like you have to go thru the framework to avoid the problem.
> >>
> >>> Nevertheless, I think the original question is still valid. We did not see this problem before, but it is now showing on all of our platforms, and I am still wondering why that is the case. I *know* that the ompio component is loaded, and I still get the error message about the missing symbol from the ompio component. I do not understand why that happens.
> >>
> >> Probably because the fcoll component didn’t explicitly link against the ompio component. You were likely getting away with it out of pure luck.
> >>
> >>> Thanks
> >>> Edgar
> >>>
> >>>>> Thanks
> >>>>> Edgar
> >>>>>
> >>>>>>> On Nov 25, 2014, at 8:07 AM, Edgar Gabriel <gabr...@cs.uh.edu> wrote:
> >>>>>>>
> >>>>>>> Has something changed recently on the trunk/master regarding OMPI_DECLSPEC? The reason I ask is because we now get errors about unresolved symbols, e.g.
> >>>>>>>
> >>>>>>> symbol lookup error: /home/gabriel/OpenMPI/lib64/openmpi/mca_fcoll_dynamic.so: undefined symbol: ompi_io_ompio_decode_datatype
> >>>>>>>
> >>>>>>> and that problem was not there roughly two weeks back the last time I tested. I did verify that the function listed there has an OMPI_DECLSPEC before its definition.
> >>>>>>>
> >>>>>>> Thanks
> >>>>>>> Edgar
> >>>>>>> --
> >>>>>>> Edgar Gabriel
> >>>>>>> Associate Professor
> >>>>>>> Parallel Software Technologies Lab http://pstl.cs.uh.edu
> >>>>>>> Department of Computer Science University of Houston
> >>>>>>> Philip G. Hoffman Hall, Room 524 Houston, TX-77204, USA
> >>>>>>> Tel: +1 (713) 743-3857 Fax: +1 (713) 743-3335
> >>>>>>> _______________________________________________
> >>>>>>> devel mailing list
> >>>>>>> de...@open-mpi.org
> >>>>>>> Subscription: http://www.open-mpi.org/mailman/listinfo.cgi/devel
> >>>>>>> Link to this post: http://www.open-mpi.org/community/lists/devel/2014/11/16332.php
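
For reference, the structure-in-the-base approach described in the inline reply above could look roughly like the sketch below. This is an illustration only, not the PML V code: every name in it (io_shared_fns, io_base_shared_fns, example_decode_datatype, owner_component_open, sub_component_use) is hypothetical, and the decode function is a toy stand-in.

```c
#include <assert.h>
#include <stddef.h>

/* --- would live in the base, i.e. in the core library --- */
typedef int (*decode_fn_t)(const void *datatype, int count, size_t *total_bytes);

struct io_shared_fns {
    decode_fn_t decode_datatype;    /* NULL until the owning component registers it */
};

/* Only the (small) table of pointers is added to the main library. */
struct io_shared_fns io_base_shared_fns = { NULL };

/* --- would live in the owning component (the role ompio plays in this thread) --- */
int example_decode_datatype(const void *datatype, int count, size_t *total_bytes)
{
    (void)datatype;
    *total_bytes = (size_t)count * 4;   /* toy body: pretend each element is 4 bytes */
    return 0;
}

/* Called when the owning component is loaded: fill in the table. */
void owner_component_open(void)
{
    io_base_shared_fns.decode_datatype = example_decode_datatype;
}

/* --- would live in a sub-component (the role fcoll plays in this thread) --- */
int sub_component_use(int count, size_t *total_bytes)
{
    assert(total_bytes != NULL);
    if (io_base_shared_fns.decode_datatype == NULL) {
        return -1;                      /* owning component was never loaded */
    }
    /* Call through the pointer: no symbol from the owner is referenced by name. */
    return io_base_shared_fns.decode_datatype(NULL, count, total_bytes);
}
```

The sub-component only references io_base_shared_fns, which is in the core library and therefore always resolvable, so the dlopen scope of the owning component no longer matters. The NULL check also gives a clean error path when a different io component (e.g. ROMIO) is the one in use.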