CLASSIFICATION: UNCLASSIFIED

I was able to change my SelectType to cons_res, and the scheduler now seems to be using all of my nodes. Not sure why select/serial did not work, as that is the recommended high-throughput setting.
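For reference, a minimal sketch of the kind of change described above might look like this in slurm.conf (the SelectTypeParameters value and the scheduler choice are assumptions for illustration, not taken from this message):

```conf
# Sketch only -- parameter choices below are assumptions, not from the post.
SelectType=select/cons_res
# CR_Core allocates individual cores, so several one-core array tasks can
# share a node while the remaining tasks spread across other nodes.
SelectTypeParameters=CR_Core
# sched/backfill is Slurm's default scheduler and generally keeps more
# nodes busy than the strict FIFO ordering of sched/builtin.
SchedulerType=sched/backfill
```

Note that changing SelectType generally requires restarting the Slurm daemons rather than just `scontrol reconfigure`.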
Thanks,
Tony

-----Original Message-----
From: Glover, Anthony E CTR USARMY RDECOM (US) [mailto:[email protected]]
Sent: Thursday, June 29, 2017 9:52 AM
To: slurm-dev <[email protected]>
Subject: [slurm-dev] RE: [Non-DoD Source] Job Array not using all nodes (UNCLASSIFIED)

I just noticed, when examining my job in sview, that "Nodes Min" and "Nodes Max" are both set to 1, even though I had

#SBATCH -N 1-40

in my script. Maybe the Nodes Max is preventing my distribution? Any way to get around that?

Thanks,
Tony

-----Original Message-----
From: Glover, Anthony E CTR USARMY RDECOM (US) [mailto:[email protected]]
Sent: Thursday, June 29, 2017 8:39 AM
To: slurm-dev <[email protected]>
Subject: [Non-DoD Source] [slurm-dev] Job Array not using all nodes (UNCLASSIFIED)

I've got a cluster with about 39 nodes, each with 8 to 12 cores. When I submit a job array of, say, 15k tasks, about 300 of those jobs start up across the cluster. But once those jobs complete, I only see one node's worth of jobs (say 8) running at a time from then on, always on just one node, and I don't see the other nodes getting used at all. Would anyone have an idea why my other nodes don't continue to get jobs placed on them?

Here are some pertinent settings from my slurm.conf:

FastSchedule=0
SchedulerType=sched/builtin
SelectType=select/serial
SchedulerParameters=default_queue_depth=300
JobCompType=jobcomp/none
JobAcctGatherType=jobacct_gather/none
MaxJobCount=50000
TaskPlugin=task/none
InactiveLimit=0
KillWait=30
MinJobAge=2
SlurmctldTimeout=120
SlurmdTimeout=300
WaitTime=0

Thanks,
Tony
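For context, a job array like the one described is typically submitted with a batch script along these lines (the job name, program name, and array size are illustrative, not taken from the thread):

```shell
#!/bin/bash
# Illustrative sketch of a Slurm job-array submission script.
# Names and sizes below are hypothetical, not from the original post.
#SBATCH --job-name=array-demo
#SBATCH --array=0-14999        # 15k independent array tasks
#SBATCH --ntasks=1             # each array task is a single serial job
# Note: every #SBATCH option applies to each array task independently,
# so a per-task node request like "-N 1-40" does not spread the array
# across nodes; distributing tasks over the cluster is the scheduler's
# job, which is why sview can show Nodes Min/Max of 1 for a task.

srun ./my_program "${SLURM_ARRAY_TASK_ID}"
```

Submission would then be a plain `sbatch script.sh`; each task sees its own index in `SLURM_ARRAY_TASK_ID`.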
