[slurm-dev] Re: Query number of cores allocated per node for a job

2016-10-26 Thread John DeSantis
Kaizaad, I hate to say it, but I cannot _believe_ I never saw this detail in the man page. This information is extremely useful! John DeSantis On 10/26/2016 09:49 AM, Kaizaad Bilimorya wrote: > Hi Chris, > One way is to use "scontrol":

[slurm-dev] Run maintenance job

2016-10-26 Thread r...@open-mpi.org
Hey folks, this is likely a dumb question, so I appreciate your patience in advance. I need to schedule a job that takes a node down, flashes the firmware, and reboots it. I can obviously ask SLURM to allocate two nodes for me, and run my job script on the node I don’t intend to service.

[slurm-dev] Re: slurmd: fatal: Frontend not configured correctly in slurm.conf

2016-10-26 Thread Gennaro Oliva
Hi Peixin, I successfully ran slurm on my vm by modifying the following parameters in your configuration file: On Tue, Oct 25, 2016 at 03:38:47PM -0700, Peixin Qiao wrote: > SlurmctldPidFile=/var/run/slurmctld.pid SlurmctldPidFile=/var/run/slurm-llnl/slurmctld.pid >

[slurm-dev] Set Limit Time Per Job

2016-10-26 Thread Achi Hamza
Hi All, Context: SLURM: 15.08.11, OS: CentOS 7. I would like to set a time limit per job, so my slurm.conf looks like this (3 minutes, just for demo): PartitionName=test Nodes=slurm-[2-5] Default=YES MaxTime=3 State=UP sinfo output: [hachi@slurm-1 ~]$ sinfo PARTITION AVAIL TIMELIMIT

[slurm-dev] Re: Set Limit Time Per Job

2016-10-26 Thread Benjamin Redling
Hi, On 26.10.2016 at 15:35, Achi Hamza wrote: > But when I run a job for more than 3 minutes it does not stop, like: > srun -n1 sleep 300 > > I also set the MaxWall parameter, but to no avail: > sacctmgr show qos format=MaxWall > MaxWall > --- > 00:03:00 > > Please advise
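
The two limits quoted in this thread, MaxTime=3 in slurm.conf and MaxWall 00:03:00 in the QOS, are written in different Slurm time formats but describe the same three minutes. The sketch below is illustrative only: it assumes the documented formats (minutes, minutes:seconds, hours:minutes:seconds, days-hours, days-hours:minutes, days-hours:minutes:seconds), does not handle keywords such as UNLIMITED, and the function name slurm_time_to_seconds is hypothetical, not part of Slurm.

```python
def slurm_time_to_seconds(text: str) -> int:
    """Convert a Slurm time string to seconds (hypothetical helper)."""
    days = 0
    if "-" in text:
        # With a day prefix the remaining fields read hours[:minutes[:seconds]].
        day_part, rest = text.split("-", 1)
        days = int(day_part)
        fields = [int(f) for f in rest.split(":")]
        fields += [0] * (3 - len(fields))
        hours, minutes, seconds = fields
    else:
        fields = [int(f) for f in text.split(":")]
        if len(fields) == 1:
            # A bare number is a count of minutes.
            hours, minutes, seconds = 0, fields[0], 0
        elif len(fields) == 2:
            # Two fields read minutes:seconds.
            hours, minutes, seconds = 0, fields[0], fields[1]
        else:
            hours, minutes, seconds = fields
    return ((days * 24 + hours) * 60 + minutes) * 60 + seconds


partition_limit = slurm_time_to_seconds("3")         # MaxTime=3
qos_limit = slurm_time_to_seconds("00:03:00")        # MaxWall
requested_sleep = 300                                # srun -n1 sleep 300

print(partition_limit, qos_limit, requested_sleep > partition_limit)
```

Both limits come out at 180 seconds, so the 300-second sleep from the quoted srun command exceeds either one. This only confirms that the values are consistent; it says nothing about why the limit was not enforced.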

[slurm-dev] Re: Query number of cores allocated per node for a job

2016-10-26 Thread Kaizaad Bilimorya
Hi Chris, One way is to use "scontrol": "scontrol --details show job". You should then see something like: ...snip Nodes=clu[2-3] CPU_IDs=0-19 Mem=1024 Nodes=clu4 CPU_IDs=0-3,10-13 Mem=1024 Note the note: Note that the CPU ids reported by this command are Slurm abstract CPU ids, not
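
To get a core count per node from output like the sample above, the CPU_IDs lists can be counted. The following is a minimal sketch, not an official tool: it assumes the detail lines look exactly like the two quoted in the message, the function names are hypothetical, and it leaves hostlist expressions such as clu[2-3] unexpanded (the count applies to each node in that group).

```python
import re


def count_cpu_ids(cpu_ids: str) -> int:
    """Count the ids in a list such as "0-3,10-13"."""
    total = 0
    for part in cpu_ids.split(","):
        if "-" in part:
            first, last = part.split("-")
            total += int(last) - int(first) + 1
        else:
            total += 1
    return total


# Matches detail lines of the form "Nodes=<hostlist> CPU_IDs=<id list> ...".
DETAIL_LINE = re.compile(r"\bNodes=(\S+)\s+CPU_IDs=(\S+)")


def cpus_per_node_group(scontrol_output: str) -> dict[str, int]:
    """Map each Nodes= expression to the number of CPU ids listed for it."""
    return {
        nodes: count_cpu_ids(cpu_ids)
        for nodes, cpu_ids in DETAIL_LINE.findall(scontrol_output)
    }


# Sample lines copied from the message above.
sample = """\
Nodes=clu[2-3] CPU_IDs=0-19 Mem=1024
Nodes=clu4 CPU_IDs=0-3,10-13 Mem=1024
"""

print(cpus_per_node_group(sample))
```

For the sample this yields 20 CPU ids for each node in clu[2-3] and 8 for clu4. As the message notes, these are Slurm abstract CPU ids, so the count is meaningful but the ids themselves should not be read as physical core numbers.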

[slurm-dev] Re: Job steps facility as in LoadLeveler?

2016-10-26 Thread Benjamin Redling
Hi, On 25.10.2016 at 13:20, Patrice Peterson wrote: > is there a built-in way to queue LoadLeveler-like job steps in SLURM? > Something like this: > > #!/bin/bash > #SBATCH --num-tasks=1 > echo "prepping data, simple stuff" > #SBATCH --- END STEP > > #SBATCH