My guess is that mca_pml_v is declared by another module that is itself loaded dynamically at runtime. Since libtool does not load that module's symbols into the global scope, mca_pml_v will not be visible to other modules trying to use it.
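
To illustrate the scoping issue, here is a minimal sketch that uses plain dlopen() instead of libtool's wrappers. It is not the Open MPI code, and the helper names open_global and symbol_is_global are made up for this example. A library opened with RTLD_GLOBAL adds its symbols to the global scope, so objects loaded afterwards can resolve against them; a library opened with RTLD_LOCAL does not.

```c
#include <dlfcn.h>
#include <stddef.h>

/* Hypothetical helper: open a shared object and export its symbols to the
 * global scope, so that objects loaded later can resolve against them. */
void *open_global(const char *path)
{
    return dlopen(path, RTLD_NOW | RTLD_GLOBAL);
}

/* Hypothetical helper: return 1 if `name` resolves through the global scope
 * (the main program, its startup dependencies, and anything opened with
 * RTLD_GLOBAL), 0 otherwise. */
int symbol_is_global(const char *name)
{
    void *self = dlopen(NULL, RTLD_NOW);  /* handle for the global scope */
    int found;

    if (self == NULL) {
        return 0;
    }
    dlerror();                            /* clear any stale error state */
    (void)dlsym(self, name);
    found = (dlerror() == NULL);          /* no error means the lookup succeeded */
    dlclose(self);
    return found;
}
```

Calling symbol_is_global("mca_pml_v") right before the failing component is opened should show whether the symbol is reachable at that point. On some systems you need to link with -ldl.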

george.

On Mar 5, 2010, at 14:35 , Leonardo Fialho wrote:

> No George, this trick does not change the problem. I'm looking for the
> problem in the mca_pml_v declaration, but I still can't figure out the reason
> why it doesn't work.
>
> Leonardo
>
> On Mar 5, 2010, at 8:12 PM, George Bosilca wrote:
>
>> I would first try the Open MPI configure option --disable-visibility. If
>> this doesn't fix it, you should make sure that dlopen is called with the
>> GLOBAL flag on (don't remember where exactly in the code and unfortunately I
>> can't check right now). Use gdb and set a breakpoint to dlopen and you will
>> find it.
>>
>> george.
>>
>> On Mar 5, 2010, at 14:00 , Leonardo Fialho wrote:
>>
>>> Yeah, probably ompi_request_null and opal_output are not good candidates.
>>> I'm trying with mca_pml_v. But I'm not familiarized with this framework
>>> although it is really small.
>>>
>>> George, you said to change this (opal/mca/base/mca_base_component_find.c):
>>>
>>> #if OPAL_HAVE_LTDL_ADVISE
>>>     component_handle = lt_dlopenadvise(target_file->filename,
>>>                                        opal_mca_dladvise);
>>> #else
>>>     component_handle = lt_dlopenext(target_file->filename);
>>> #endif
>>>
>>> to use lt_dladvise_global instead of lt_dladvise_local?
>>>
>>> Leonardo
>>>
>>> On Mar 5, 2010, at 7:47 PM, Terry Dontje wrote:
>>>
>>>> I would also start nm'ing the .so's you think the U symbols are resolved
>>>> in to make sure they are exposed. Luckily you only have 3 symbols to look
>>>> for.
>>>>
>>>> --td
>>>>
>>>> Ralph Castain wrote:
>>>>> It's probably a visibility issue - check for an OMPI_DECLSPEC missing
>>>>> from the declaration of a symbol.
>>>>>
>>>>> On Mar 5, 2010, at 11:40 AM, Leonardo Fialho wrote:
>>>>>
>>>>>
>>>>>> Yes,
>>>>>>
>>>>>> I renamed all references to Aurelien's componant name and removed all
>>>>>> code regarding to the component itself. There are only functions which
>>>>>> returns OMPI_SUCCESS. No other function is called.
>>>>>>
>>>>>> I'm debugging with LD_DEBUG=symbols, but the output is really huge!
>>>>>> Probably the error is in the mca_pml_v symbol:
>>>>>>
>>>>>> 19643: /home/lfialho/lib/openmpi/mca_vprotocol_receiver.so: error:
>>>>>> symbol lookup error: undefined symbol: mca_pml_v (fatal)
>>>>>>
>>>>>> Leonardo
>>>>>>
>>>>>> On Mar 5, 2010, at 7:35 PM, Ralph Castain wrote:
>>>>>>
>>>>>>
>>>>>>> You said this component was a copy of Aurelien's component? Did you
>>>>>>> rename the critical elements (e.g., component, module) inside it to
>>>>>>> avoid name confusion?
>>>>>>>
>>>>>>> On Mar 5, 2010, at 11:27 AM, Leonardo Fialho wrote:
>>>>>>>
>>>>>>>
>>>>>>>> I see... but it is really strange because this module is clean, it
>>>>>>>> does not use nothing. This is the output of the nm command, I can't
>>>>>>>> see any symbol which is not available.
>>>>>>>>
>>>>>>>> [lfialho@aoclsb-clus openmpi]$ nm mca_vprotocol_receiver.so
>>>>>>>> 0000000000201208 a _DYNAMIC
>>>>>>>> 0000000000201408 a _GLOBAL_OFFSET_TABLE_
>>>>>>>>                  w _Jv_RegisterClasses
>>>>>>>> 00000000002011e0 d __CTOR_END__
>>>>>>>> 00000000002011d8 d __CTOR_LIST__
>>>>>>>> 00000000002011f0 d __DTOR_END__
>>>>>>>> 00000000002011e8 d __DTOR_LIST__
>>>>>>>> 00000000000011d0 r __FRAME_END__
>>>>>>>> 00000000002011f8 d __JCR_END__
>>>>>>>> 00000000002011f8 d __JCR_LIST__
>>>>>>>> 0000000000201640 A __bss_start
>>>>>>>>                  w __cxa_finalize@@GLIBC_2.2.5
>>>>>>>> 0000000000000d40 t __do_global_ctors_aux
>>>>>>>> 00000000000007c0 t __do_global_dtors_aux
>>>>>>>> 0000000000201200 d __dso_handle
>>>>>>>>                  w __gmon_start__
>>>>>>>> 0000000000201640 A _edata
>>>>>>>> 0000000000201648 A _end
>>>>>>>> 0000000000000d78 T _fini
>>>>>>>> 0000000000000750 T _init
>>>>>>>> 00000000000007a0 t call_gmon_start
>>>>>>>> 0000000000201640 b completed.6115
>>>>>>>> 0000000000000810 t frame_dummy
>>>>>>>>                  U mca_pml_v
>>>>>>>> 0000000000201460 D mca_vprotocol_receiver
>>>>>>>> 0000000000000c71 t mca_vprotocol_receiver_add_comm
>>>>>>>> 0000000000000a5f t mca_vprotocol_receiver_add_procs
>>>>>>>> 0000000000201540 D mca_vprotocol_receiver_component
>>>>>>>> 0000000000000cc3 t mca_vprotocol_receiver_component_close
>>>>>>>> 0000000000000d18 t mca_vprotocol_receiver_component_finalize
>>>>>>>> 0000000000000cce t mca_vprotocol_receiver_component_init
>>>>>>>> 0000000000000cb8 t mca_vprotocol_receiver_component_open
>>>>>>>> 0000000000000c93 t mca_vprotocol_receiver_del_comm
>>>>>>>> 0000000000000a89 t mca_vprotocol_receiver_del_procs
>>>>>>>> 000000000000083c t mca_vprotocol_receiver_dump
>>>>>>>> 0000000000000d23 t mca_vprotocol_receiver_enable
>>>>>>>> 00000000000009e7 t mca_vprotocol_receiver_iprobe
>>>>>>>> 0000000000000b9a t mca_vprotocol_receiver_irecv
>>>>>>>> 0000000000000ab3 t mca_vprotocol_receiver_isend
>>>>>>>> 0000000000000a29 t mca_vprotocol_receiver_probe
>>>>>>>> 0000000000000c00 t mca_vprotocol_receiver_recv
>>>>>>>> 0000000000000b21 t mca_vprotocol_receiver_send
>>>>>>>> 00000000000009bd T mca_vprotocol_receiver_start
>>>>>>>> 0000000000000864 t mca_vprotocol_receiver_test
>>>>>>>> 0000000000000896 t mca_vprotocol_receiver_test_all
>>>>>>>> 00000000000008d0 t mca_vprotocol_receiver_test_any
>>>>>>>> 0000000000000950 t mca_vprotocol_receiver_test_some
>>>>>>>> 0000000000000916 t mca_vprotocol_receiver_wait_any
>>>>>>>> 000000000000098a t mca_vprotocol_receiver_wait_some
>>>>>>>>                  U ompi_request_null
>>>>>>>>                  U opal_output
>>>>>>>> 0000000000201440 d p.6113
>>>>>>>> [lfialho@aoclsb-clus openmpi]$
>>>>>>>>
>>>>>>>> On Mar 5, 2010, at 7:00 PM, Terry Dontje wrote:
>>>>>>>>
>>>>>>>>
>>>>>>>>> Sorry meant to add this, but you might be able to try and find the
>>>>>>>>> symbol causing the issue by twiddling with LD_DEBUG
>>>>>>>>>
>>>>>>>>> --td
>>>>>>>>> Terry Dontje wrote:
>>>>>>>>>
>>>>>>>>>> Possibly there is an external symbol in the .so that is being loaded
>>>>>>>>>> that cannot be resolved.
>>>>>>>>>> --td
>>>>>>>>>> Leonardo Fialho wrote:
>>>>>>>>>>
>>>>>>>>>>> Hi,
>>>>>>>>>>>
>>>>>>>>>>> I know that libtool does not help us to find the source of this
>>>>>>>>>>> error, but, what can generate the following error?
>>>>>>>>>>>
>>>>>>>>>>> [aoclsb-clus.uab.es:11724] mca: base: component_find: unable to
>>>>>>>>>>> open /home/lfialho/lib/openmpi/mca_vprotocol_receiver: perhaps a
>>>>>>>>>>> missing symbol, or compiled for a different version of Open MPI?
>>>>>>>>>>> (ignored)
>>>>>>>>>>>
>>>>>>>>>>> 1) yes, the file exists
>>>>>>>>>>> 2) yes, it has been compiled among all other components
>>>>>>>>>>> 3) yes, it is the same Open MPI version
>>>>>>>>>>> 4) this component is a copy of the pessimist component implemented
>>>>>>>>>>> by Aurelien
>>>>>>>>>>> 5) Aurelien's component presents the same error
>>>>>>>>>>>
>>>>>>>>>>> The question is: what mistake should generate an error during
>>>>>>>>>>> module loading?
>>>>>>>>>>>
>>>>>>>>>>> Thanks in advance,
>>>>>>>>>>> Leonardo
>>>>>>>>>>> _______________________________________________
>>>>>>>>>>> devel mailing list
>>>>>>>>>>> de...@open-mpi.org
>>>>>>>>>>> http://www.open-mpi.org/mailman/listinfo.cgi/devel