I thought slurmctld was only meant to run on the head node, and the clients
just run slurmd?

> On Mar 12, 2016, at 9:32 AM, Jagga Soorma <[email protected]> wrote:
> 
> 
> Hi Guys,
> 
> I have successfully installed slurm 15.08 on a small test cluster
> running CentOS 7.1.  Everything seems like it is running fine and I
> can submit jobs without any issues.  However on the clients I am
> seeing some errors on the systemctl status slurm command that don't
> make sense:
> 
> --
> # systemctl status slurm
> slurm.service - LSB: slurm daemon management
>   Loaded: loaded (/etc/rc.d/init.d/slurm)
>   Active: failed (Result: timeout) since Sat 2016-03-12 09:14:22 PST; 7min ago
> 
> Mar 12 09:12:20 client1 slurmd[123729]: _run_prolog: run job script took usec=4
> Mar 12 09:12:20 client1 slurmd[123729]: _run_prolog: prolog with lock for job 6 ran for 0 seconds
> Mar 12 09:12:20 client1 slurmstepd[126069]: done with job
> Mar 12 09:12:30 client1 slurmd[123729]: launch task 7.0 request from [email protected] (port 42986)
> Mar 12 09:12:30 client1 slurmd[123729]: _run_prolog: run job script took usec=4
> Mar 12 09:12:30 client1 slurmd[123729]: _run_prolog: prolog with lock for job 7 ran for 0 seconds
> Mar 12 09:12:30 client1 slurmstepd[126100]: done with job
> Mar 12 09:14:22 client1 systemd[1]: slurm.service operation timed out. Terminating.
> Mar 12 09:14:22 client1 systemd[1]: Failed to start LSB: slurm daemon management.
> Mar 12 09:14:22 client1 systemd[1]: Unit slurm.service entered failed state.
> --
> 
> However slurm seems to be working fine:
> 
> --
> # sinfo -lNe
> Sat Mar 12 09:25:40 2016
> NODELIST     NODES PARTITION STATE CPUS  S:C:T MEMORY TMP_DISK WEIGHT FEATURES REASON
> client[1-10]     8 dev*      idle    40 2:10:2 257680        0      1 (null)   none
> # srun hostname
> client1
> --
> 
> Any ideas why the slurm service on the clients might be throwing those
> timeout errors?
> 
> Thanks!
