[slurm-dev] Re: node not communicating with control

2016-02-24 Thread Christopher Samuel
On 25/02/16 10:58, Berryhill, Jerome wrote:
> Thanks for the quick response. Quick follow-up: Is it possible to
> upgrade the slurmctld without taking down the cluster?
If you are quick enough - yes. :-) From the upgrade section I linked to before: # Be mindful of your configured
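The message is truncated here, but the general shape of an in-place controller upgrade looks roughly like the following; the timeout values, package names and service commands are only illustrative (an RPM-based host is assumed), and the official SchedMD upgrade notes remain the reference:

    # In slurm.conf, make sure the daemons tolerate a controller outage longer
    # than the upgrade will take (values below are just examples):
    #   SlurmctldTimeout=300
    #   SlurmdTimeout=600
    scontrol reconfigure                # push the updated slurm.conf to the daemons

    # Upgrade only the controller; slurmd on the compute nodes keeps running
    # and queued/running jobs are preserved via the state save directory.
    systemctl stop slurmctld            # or 'service slurm stop' with SysV init scripts
    yum update slurm\*                  # exact package set depends on how Slurm was built
    systemctl start slurmctld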

[slurm-dev] Re: node not communicating with control

2016-02-24 Thread Berryhill, Jerome
Thanks for the quick response. Quick follow-up: Is it possible to upgrade the slurmctld without taking down the cluster?
-Original Message-
From: Christopher Samuel [mailto:sam...@unimelb.edu.au]
Sent: Wednesday, February 24, 2016 3:52 PM
To: slurm-dev
Subject:

[slurm-dev] Re: node not communicating with control

2016-02-24 Thread Christopher Samuel
On 25/02/16 10:41, Berryhill, Jerome wrote:
> I am running slurm 14.11.3 as the master on an RHEL 7.0 machine. I
> recently added an SLES 12.1 configuration to xCAT, and I am testing it
> on one of our machines. It seems to be working OK, except that slurmd
> does not seem to be able to

[slurm-dev] node not communicating with control

2016-02-24 Thread Berryhill, Jerome
Resending this e-mail, as it was returned because I had not registered for the mailing list. Hello, I am running slurm 14.11.3 as the master on an RHEL 7.0 machine. I recently added an SLES 12.1 configuration to xCAT, and I am testing it on one of our machines. It seems to be working OK,
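The replies above are cut off, but problems like this usually come down to name resolution, munge authentication, clock skew or a version mismatch, so a few generic checks may help (hostnames and the log path below are placeholders, not taken from the thread):

    # On the new SLES node:
    scontrol ping                         # can this node reach the slurmctld host?
    munge -n | unmunge                    # is munged running locally?
    munge -n | ssh <controller> unmunge   # same munge key and reasonably synced clocks?
    slurmd -D -vvv                        # run slurmd in the foreground with verbose logging

    # On the controller:
    scontrol show node <nodename>         # state and Reason recorded for the node
    grep <nodename> /var/log/slurmctld.log   # path depends on SlurmctldLogFile

It is also worth confirming that the slurmd built into the SLES image is not newer than the 14.11.3 slurmctld, since Slurm only supports a slurmd that is the same version as, or up to two releases older than, the controller.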

[slurm-dev] FW: slurmd not talking to surmcontrol

2016-02-24 Thread Berryhill, Jerome
From: Berryhill, Jerome [mailto:jerome.berryh...@intel.com]
Sent: Wednesday, February 24, 2016 2:52 PM
To: slurm-dev@schedmd.com
Subject: slurmd not talking to surmcontrol

Hello, I am running slurm 14.11.3 as the master on an RHEL 7.0 machine. I recently added an SLES 12.1 configuration to

[slurm-dev] Re: How to enforce jobs cpu count to avoid underusing nodes

2016-02-24 Thread Bruce Roberts
On 02/24/16 12:05, Damian Montaldo wrote:
Hello, sorry if this answer is in the man page, but we can't find the solution. We're working on a cluster that has 64-core nodes. Since nodes are not shared but run jobs in exclusive mode, we want to require and enforce users to ask for a

[slurm-dev] How to enforce jobs cpu count to avoid underusing nodes

2016-02-24 Thread Damian Montaldo
Hello, sorry if this answer is in the man page, but we can't find the solution. We're working on a cluster that has 64-core nodes. Since nodes are not shared but run jobs in exclusive mode, we want to require and enforce users to ask for a minimum number of cores. We want to avoid
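The reply above is truncated; purely as an illustration (not necessarily what was recommended in the thread), one way to impose such a floor on recent Slurm releases is a QOS with a minimum TRES size. The QOS, account name and numbers below are made up:

    # Require every job that runs under the 'normal' QOS to ask for at least 64 CPUs:
    sacctmgr modify qos normal set MinTRESPerJob=cpu=64

    # Make that QOS the default for the relevant accounts so jobs pick it up:
    sacctmgr modify account compute-users set DefaultQOS=normal

    # Limits are only honoured when enforcement is enabled in slurm.conf:
    #   AccountingStorageEnforce=limits,qos

On versions without MinTRESPerJob, a job_submit/lua filter that rejects submissions whose job_desc.min_cpus falls below the threshold is the other common approach.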

[slurm-dev] Re: allocating entire nodes from within an allocation

2016-02-24 Thread Diego Zuccato
On 23/02/2016 20:28, Craig Yoshioka wrote:
> $ salloc -N 2 --exclusive
You're allocating 2 nodes for threads started by your user. When you use srun, you have to tell it to use 'n' threads with 'n' == ncpus. Maybe just removing "-n 1" could be enough.
-- Diego Zuccato, Servizi Informatici
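A small sketch of the difference being described, with the per-node core count of 16 chosen only as an example:

    $ salloc -N 2 --exclusive
    $ srun -n 1 hostname                   # one task in total; almost all of both nodes sits idle
    $ srun hostname                        # without -n, the default is one task per allocated node
    $ srun --ntasks-per-node=16 hostname   # one task per core on 16-core nodes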

[slurm-dev] Jobs blocked with Reason=BadConstraints when -N not specified

2016-02-24 Thread Roche Ewan
Hello, following an update from 14.11.7 to 15.08.8 we observe what appears to be a bug. We can reproduce it on clusters with both TaskPlugin=task/affinity and TaskPlugin=task/cgroup. The background is that one of our clusters has two groups of nodes with different core counts (16 and 24), so we
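The report is truncated here; for anyone seeing the same symptom, the state and reason of a stuck job can be inspected like this (the submit line, job script and job id are placeholders):

    $ sbatch --ntasks=24 job.sh            # submitted without -N / --nodes
    $ squeue -j <jobid> -o "%i %t %r"      # shows state PD and Reason=BadConstraints
    $ scontrol show job <jobid>            # full detail, including the derived node and CPU requirements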