Another option is to simply add the -lslurm -lauth flags to the pmix/s1 component, as this is the only place that requires them, and it won’t hurt anything to do so.
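
For anyone trying to picture the RTLD_GLOBAL workaround Gilles mentions below, here is a minimal sketch of the general mechanism, written in Python with ctypes only because it is short to demonstrate. It is not what Open MPI does today. The helper name preload_global and the choice of "slurm" as the library to preload are assumptions for illustration. The idea is that a library opened with RTLD_GLOBAL puts its symbols into the global scope, so a library loaded afterwards can resolve undefined references against them even if it was not linked against that dependency.

```python
import ctypes
import ctypes.util


def preload_global(short_name):
    """Load a shared library with RTLD_GLOBAL.

    Returns the ctypes handle, or None if the library cannot be
    found or loaded. (Hypothetical helper, for illustration only.)
    """
    path = ctypes.util.find_library(short_name)
    if path is None:
        return None
    try:
        # RTLD_GLOBAL adds the library's symbols to the global scope, so
        # libraries loaded afterwards can resolve undefined references
        # against them without their own dependency entry.
        return ctypes.CDLL(path, mode=ctypes.RTLD_GLOBAL)
    except OSError:
        return None


# Assumed usage: make libslurm's symbols visible before anything that
# pulls in libpmi is loaded. On a machine without Slurm this reports
# that the library was not found.
handle = preload_global("slurm")
print("libslurm preloaded" if handle is not None else "libslurm not found")
```

Adding the link flags to the component gets the same result at build time, without relying on load order.
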
> On Dec 1, 2014, at 6:03 PM, Gilles Gouaillardet
> <gilles.gouaillar...@iferc.org> wrote:
>
> Jeff,
>
> FWIW, you can read my analysis of what is going wrong at
> http://www.open-mpi.org/community/lists/pmix-devel/2014/11/0293.php
> <http://www.open-mpi.org/community/lists/pmix-devel/2014/11/0293.php>
>
> bottom line, i agree this is a slurm issue (slurm plugin should depend
> on libslurm, but they do not, yet)
>
> a possible workaround would be to make the pmi component a "proxy" that
> dlopen with RTLD_GLOBAL the "real" component in which the job is done.
> that being said, the impact is quite limited (no direct launch in slurm
> with pmi1, but pmi2 works fine) so it makes sense not to work around
> someone else problem.
> and that being said, configure could detect this broken pmi1 and not
> build pmi1 support or print a user friendly error message if pmi1 is used.
>
> any thoughts ?
>
> Cheers,
>
> Gilles
>
> On 2014/12/02 7:47, Jeff Squyres (jsquyres) wrote:
>> Ok, if the problem is moot, great.
>>
>> (sidenote: this is moot, so ignore this if you want: with this explanation,
>> I'm still not sure how RTLD_GLOBAL fixes the issue)
>>
>>
>> On Dec 1, 2014, at 5:15 PM, Ralph Castain <r...@open-mpi.org> wrote:
>>
>>> Easy enough to explain. We link libpmi into the pmix/s1 component. This
>>> library is missing the linkage to libslurm that contains the linkage to
>>> libauth where munge resides. So when we call a PMI function, libpmi
>>> references a call to munge for authentication and hits an “unresolved
>>> symbol” error.
>>>
>>> Moe acknowledges the error is in Slurm and is fixing the linkages so this
>>> problem goes away
>>>
>>>
>>>> On Dec 1, 2014, at 2:13 PM, Jeff Squyres (jsquyres) <jsquy...@cisco.com>
>>>> wrote:
>>>>
>>>> On Dec 1, 2014, at 5:07 PM, Ralph Castain <r...@open-mpi.org> wrote:
>>>>
>>>>> FWIW: It’s Slurm’s pmi-1 library that isn’t linked correctly against its
>>>>> dependencies (the pmi-2 one is correct). Moe is aware of the problem and
>>>>> fixing it on their side. This won’t help existing installations until
>>>>> they upgrade, but I tend to agree with Jeff about not fixing other
>>>>> people’s problems.
>>>> Can you explain what is happening?
>>>>
>>>> I ask because I'm not sure I understand the problem such that using
>>>> RTLD_GLOBAL would fix it. I.e., even if libpmi1.so isn't linked against
>>>> its dependencies properly, that shouldn't cause a problem if OMPI
>>>> components A and B are both linked against libpmi1.so, and then A is
>>>> loaded, and then B is loaded.
>>>>
>>>> ...or perhaps we can just discuss this on the call tomorrow?
>>>>
>>>> --
>>>> Jeff Squyres
>>>> jsquy...@cisco.com
>>>> For corporate legal information go to:
>>>> http://www.cisco.com/web/about/doing_business/legal/cri/
>>>>
>>>> _______________________________________________
>>>> devel mailing list
>>>> de...@open-mpi.org
>>>> Subscription: http://www.open-mpi.org/mailman/listinfo.cgi/devel
>>>> Link to this post:
>>>> http://www.open-mpi.org/community/lists/devel/2014/12/16383.php
>>> _______________________________________________
>>> devel mailing list
>>> de...@open-mpi.org
>>> Subscription: http://www.open-mpi.org/mailman/listinfo.cgi/devel
>>> Link to this post:
>>> http://www.open-mpi.org/community/lists/devel/2014/12/16384.php
>>
>
> _______________________________________________
> devel mailing list
> de...@open-mpi.org <mailto:de...@open-mpi.org>
> Subscription: http://www.open-mpi.org/mailman/listinfo.cgi/devel
> <http://www.open-mpi.org/mailman/listinfo.cgi/devel>
> Link to this post:
> http://www.open-mpi.org/community/lists/devel/2014/12/16386.php
> <http://www.open-mpi.org/community/lists/devel/2014/12/16386.php>