[slurm-dev] Need for recompiling openmpi built with --with-pmi?

2015-04-16 Thread Bjørn-Helge Mevik
We are considering compiling openmpi with --with-pmi=/opt/slurm to enable running mpi jobs with srun. If we do this, will we have to recompile openmpi and/or programs built with openmpi when we upgrade slurm? (If so, only for major upgrades, or for minor upgrades as well?) -- Regards,

[slurm-dev] Re: Need for recompiling openmpi built with --with-pmi?

2015-04-16 Thread Uwe Sauter
Hi, I have the case that OpenMPI was built against Slurm 14.03 (which provided libslurm.so.27). Since upgrading to 14.11 I get errors like: [controller:35605] mca: base: component_find: unable to open /opt/apps/openmpi/1.8.1/gcc/4.9/0/lib/openmpi/mca_ess_pmi: libslurm.so.27: cannot open shared

[slurm-dev] Re: Need for recompiling openmpi built with --with-pmi?

2015-04-16 Thread Ralph Castain
No, you shouldn't have to do so - it's a dynamic library that gets picked up at execution On Thu, Apr 16, 2015 at 2:55 AM, Bjørn-Helge Mevik b.h.me...@usit.uio.no wrote: We are considering compiling openmpi with --with-pmi=/opt/slurm to enable running mpi jobs with srun. If we do this,

[slurm-dev] Re: Need for recompiling openmpi built with --with-pmi?

2015-04-16 Thread Ralph Castain
Hmmm...yeah, it sounds like Slurm changed its library names and/or dependencies. I'm afraid that you do indeed need to recompile OMPI in that case. You probably need to rerun configure as well, just to be safe. Sorry - outside OMPI's control :-/ On Thu, Apr 16, 2015 at 5:22 AM, Uwe Sauter

[slurm-dev] Re: slurmdbd config for two clusters with different munge keys

2015-04-16 Thread Maciej L. Olchowik
You have to start a separate munge daemon on the cluster that uses the key being used by the slurmdbd. Then you put that munge's socket path in AccountingStoragePass= in the cluster's slurm.conf. So one munge daemon provides authentication for the cluster and another for communication

[slurm-dev] Re: Need for recompiling openmpi built with --with-pmi?

2015-04-16 Thread Uwe Sauter
Hi Ralph, beside the mentioned libslurm.so.28 there is also a libslurm.so pointing to the same libslurm.so.28.0.0 file. Perhaps OpenMPI could use this link instead of the versioned one? File list in slurm/lib directory: -rw-r--r-- 1 slurm slurm 68992 Mar 20 11:39 libpmi.a -rwxr-xr-x 1 slurm

[slurm-dev] Multi-Cluster installation update-safe?

2015-04-16 Thread Ulf Markwardt
Dear Slurm developers, before I set up Slurm with multi-cluster support, I would like to make sure that it will be update-safe: It certainly will happen that I can only update the Slurm installation on _one_ cluster at a time, maybe with e.g. a week in between. Is it a design principle that

[slurm-dev] Re: Need for recompiling openmpi built with --with-pmi?

2015-04-16 Thread Ralph Castain
To be clear, we aren't linking to libslurm at all. The issue is that libpmi is linking to it, and we link to libpmi. So I think you have to recompile to get the link dependencies set up correctly. On Thu, Apr 16, 2015 at 5:32 AM, Uwe Sauter uwe.sauter...@gmail.com wrote: Hi Ralph, beside the
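A rough way to see which object is pinned to the old library is to inspect the NEEDED entries in the output of `readelf -d`. On ELF systems the runtime loader looks up the exact versioned name recorded there (for example libslurm.so.27), not the unversioned libslurm.so symlink, which is only consulted at link time. The sketch below parses such output and reports the names that are no longer installed. The sample text is hand-written in readelf's format rather than captured from a real build, and the libpmi.so.0 soname and the function names are assumptions made for illustration.

```python
import re

# Matches NEEDED entries in `readelf -d` output, e.g.
#  0x0000000000000001 (NEEDED)   Shared library: [libslurm.so.27]
NEEDED_RE = re.compile(r"\(NEEDED\)\s+Shared library:\s+\[([^\]]+)\]")


def needed_libraries(readelf_output):
    """Return the sonames listed as NEEDED, in the order they appear."""
    return NEEDED_RE.findall(readelf_output)


def missing_libraries(readelf_output, installed):
    """Return the NEEDED sonames that are not among the installed file names."""
    present = set(installed)
    return [name for name in needed_libraries(readelf_output) if name not in present]


# Illustrative input only: written by hand in readelf's format,
# not taken from a real libpmi or Open MPI build.
SAMPLE = """
 0x0000000000000001 (NEEDED)             Shared library: [libslurm.so.27]
 0x0000000000000001 (NEEDED)             Shared library: [libc.so.6]
 0x000000000000000e (SONAME)             Library soname: [libpmi.so.0]
"""

# File names assumed to be present after the upgrade to 14.11 (per the thread,
# only the .28 version of libslurm is installed).
AFTER_UPGRADE = ["libslurm.so", "libslurm.so.28", "libslurm.so.28.0.0", "libc.so.6"]

if __name__ == "__main__":
    for name in missing_libraries(SAMPLE, AFTER_UPGRADE):
        print("unresolved dependency:", name)
```

In practice the input would come from running `readelf -d` on the object named in the error message and on libpmi itself. Whichever of them lists the old libslurm name is the one that has to be relinked.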

[slurm-dev] Re: Need for recompiling openmpi built with --with-pmi?

2015-04-16 Thread Andy Riebs
It has been our experience that it is necessary to rebuild OpenMPI for each major slurm release, such as transitioning from Slurm 14.03.x to 14.11.x. Andy On 04/16/2015 07:49 AM, Ralph Castain wrote: Re: [slurm-dev] Re: Need for recompiling openmpi built with --with-pmi?

[slurm-dev] Re: Need for recompiling openmpi built with --with-pmi?

2015-04-16 Thread Andy Riebs
Recompiling openmpi is sufficient, unless something else has changed in openmpi that might require your programs to be rebuilt. On 04/16/2015 09:17 AM, Bjørn-Helge Mevik wrote: Andy Riebs andy.ri...@hp.com writes: It has been our experience that it is necessary to rebuild OpenMPI for

[slurm-dev] Re: Question about prologging

2015-04-16 Thread John Desantis
To all involved in this thread, Thank you very much for your pointers and suggestions! John DeSantis 2015-04-16 1:07 GMT-04:00 Christopher Samuel sam...@unimelb.edu.au: On 16/04/15 14:43, Bill Barth wrote: That's what I sent John off-list. Wasn't sure self-promotion was OK here. I can't

[slurm-dev] Re: Need for recompiling openmpi built with --with-pmi?

2015-04-16 Thread Bjørn-Helge Mevik
Andy Riebs andy.ri...@hp.com writes: It has been our experience that it is necessary to rebuild OpenMPI for each major slurm release, such as transitioning from Slurm 14.03.x to 14.11.x. Do you also need to recompile programs that are compiled with openmpi, or is it enough to recompile

[slurm-dev] Re: Need for recompiling openmpi built with --with-pmi?

2015-04-16 Thread Uwe Sauter
But if libpmi.so is provided by Slurm, why do I get the error messages? Does OpenMPI statically link libpmi.a which then depends on an older version of libslurm.so? If OpenMPI dynamically links against libpmi.so which itself either links against libslurm.so or libslurm.so.28, shouldn't this

[slurm-dev] Re: Multi-Cluster installation update-safe?

2015-04-16 Thread Danny Auble
Yes Ulf, your understanding is correct. The DBD can talk to 2 previous versions. As long as the DBD is updated first you should be fine. On April 16, 2015 3:33:22 AM PDT, Ulf Markwardt ulf.markwa...@tu-dresden.de wrote: Dear Slurm developers, before I set up Slurm with multi-cluster
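The rule stated here (the DBD can talk to two previous versions and must be updated first) can be written as a small check when planning a staggered upgrade. This is only a sketch. It assumes "versions" means major releases such as 14.03 and 14.11, and the release list is a hand-maintained, partial list that should be checked against the actual release history. The function names are hypothetical.

```python
# Assumed, partial list of major Slurm releases, oldest first.
MAJOR_RELEASES = ["2.6", "14.03", "14.11", "15.08"]


def major(version):
    """Reduce a full version such as '14.11.5' to its major release '14.11'."""
    return ".".join(version.split(".")[:2])


def dbd_can_serve(dbd_version, cluster_version, releases=MAJOR_RELEASES):
    """True if the cluster runs the DBD's release or one of the two before it."""
    dbd = releases.index(major(dbd_version))
    cluster = releases.index(major(cluster_version))
    # The DBD has to be at least as new as the cluster, and at most
    # two major releases ahead of it.
    return 0 <= dbd - cluster <= 2


if __name__ == "__main__":
    # DBD already upgraded, one cluster still on the previous release.
    print(dbd_can_serve("14.11.5", "14.03.9"))
    # A cluster upgraded before the DBD is not covered by the rule.
    print(dbd_can_serve("14.03.9", "14.11.5"))
```

A version missing from the release list raises `ValueError`, which is deliberate: an unknown release should be looked up rather than guessed.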