I knew it was something obvious. My slurm.conf was copied from our RPM install, which had PluginDir set to /usr/lib64/slurm, so the 14.11.6 commands were presumably trying to load the plugins from the RPM install. Setting PluginDir to /apps/slurm/14.11.6/lib/slurm fixed the problem. Excuse the noise.
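For anyone who hits the same errors, here is a rough sketch of a check that would have caught this. It only reads the PluginDir line from a slurm.conf and compares it against the install prefix. The function names are made up for illustration, and the parsing assumes a plain `PluginDir=/path` line with no trailing comment.

```shell
# Hypothetical helpers, not part of SLURM.

# Print the first PluginDir value found in a slurm.conf-style file.
# The key name is matched case-insensitively.
plugin_dir_from_conf() {
    awk -F= 'tolower($1) ~ /^[[:space:]]*plugindir[[:space:]]*$/ {
        gsub(/[[:space:]]/, "", $2)   # strip stray whitespace from the value
        print $2
        exit
    }' "$1"
}

# Succeed only if PluginDir lives under the given install prefix.
check_plugin_dir() {
    conf=$1
    prefix=$2
    dir=$(plugin_dir_from_conf "$conf")
    case "$dir" in
        "$prefix"/*)
            echo "OK: PluginDir=$dir"
            ;;
        *)
            echo "MISMATCH: PluginDir=${dir:-<unset>} is outside $prefix"
            return 1
            ;;
    esac
}
```

With the test module loaded, running `check_plugin_dir "$SLURM_CONF" /apps/slurm/14.11.6` should report a mismatch for a config copied from the RPM install and OK once PluginDir points at /apps/slurm/14.11.6/lib/slurm.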
- Trey

=============================

Trey Dockendorf
Systems Analyst I
Texas A&M University
Academy for Advanced Telecommunications and Learning Technologies
Phone: (979)458-2396
Email: [email protected]
Jabber: [email protected]

On Thu, Apr 30, 2015 at 11:32 AM, Trey Dockendorf <[email protected]> wrote:

> In an attempt to allow myself and users to test 14.11.6 before we update
> our 14.03.10 installation I've installed SLURM to our apps repository and
> created a loadable module to access the test instance of SLURM. When I
> execute sbatch commands I get the following:
>
> $ sbatch mhd-test.slrm
> sbatch: error: Couldn't load specified plugin name for select/alps: Plugin
> missing a required symbol use debug3 to see
> sbatch: error: Couldn't load specified plugin name for select/serial:
> Plugin missing a required symbol use debug3 to see
> sbatch: error: Couldn't load specified plugin name for select/cons_res:
> Plugin missing a required symbol use debug3 to see
> sbatch: error: Couldn't load specified plugin name for select/bluegene:
> Plugin missing a required symbol use debug3 to see
> sbatch: error: Couldn't load specified plugin name for select/linear:
> Plugin missing a required symbol use debug3 to see
> sbatch: error: Couldn't load specified plugin name for select/cray: Plugin
> missing a required symbol use debug3 to see
> sbatch: fatal: Can't find plugin for select/cons_res
>
> The test slurmctld has debug3 enabled and prints this:
>
> [2015-04-30T11:23:30.304] debug: _slurm_recv_timeout at 0 of 4, recv zero
> bytes
> [2015-04-30T11:23:30.304] error: slurm_receive_msg: Zero Bytes were
> transmitted or received
> [2015-04-30T11:23:30.314] error: slurm_receive_msg: Zero Bytes were
> transmitted or received
>
> The controller + slurmdbd are already on 14.11.6 as are the nodes in this
> test cluster.
>
> I built this test install of SLURM from source. Our production install of
> SLURM is via RPM. I build our RPMs in mock, so the steps I took to build
> this version from source were taken from the steps used during rpmbuild in
> mock.
>
> CFLAGS='-O2 -g -pipe -Wall -Wp,-D_FORTIFY_SOURCE=2 -fexceptions
> -fstack-protector --param=ssp-buffer-size=4 -m64 -mtune=generic' \
> CXXFLAGS='-O2 -g -pipe -Wall -Wp,-D_FORTIFY_SOURCE=2 -fexceptions
> -fstack-protector --param=ssp-buffer-size=4 -m64 -mtune=generic' \
> FFLAGS='-O2 -g -pipe -Wall -Wp,-D_FORTIFY_SOURCE=2 -fexceptions
> -fstack-protector --param=ssp-buffer-size=4 -m64 -mtune=generic
> -I/usr/lib64/gfortran/modules' \
> ./configure --build=x86_64-redhat-linux-gnu \
> --host=x86_64-redhat-linux-gnu \
> --target=x86_64-redhat-linux-gnu \
> --program-prefix= \
> --prefix=/apps/slurm/14.11.6
>
> make && make install
>
> I copied our 14.11.6 test config into /apps/slurm/14.11.6/etc. These are
> environment variables being set by the loaded module:
>
> SLURM_CONF=/apps/slurm/14.11.6/etc/slurm.conf
> PATH=/apps/slurm/14.11.6/bin:$PATH
> MANPATH=/apps/slurm/14.11.6/share/man:$MANPATH
> LD_LIBRARY_PATH=/apps/slurm/14.11.6/lib/slurm:/apps/slurm/14.11.6/lib:$LD_LIBRARY_PATH
> LIBRARY_PATH=/apps/slurm/14.11.6/lib/slurm:/apps/slurm/14.11.6/lib:$LIBRARY_PATH
>
> I feel like I'm missing something obvious that results in the plugins
> failing to load. An FAQ entry [1] looks similar but unsure if this is the
> same problem as described there.
>
> [1]: http://slurm.schedmd.com/faq.html#inc_plugin
>
> Thanks,
> - Trey
>
> =============================
>
> Trey Dockendorf
> Systems Analyst I
> Texas A&M University
> Academy for Advanced Telecommunications and Learning Technologies
> Phone: (979)458-2396
> Email: [email protected]
> Jabber: [email protected]
>
