[slurm-dev] prioritize based on walltime request

2016-10-21 Thread Steven Lo



Hi,

How do we configure Slurm so that the job with the shortest wall time 
request runs first in the queue?
None of the priority settings in slurm.conf appear to be related to 
the wall time.


We have one queue with a maximum walltime of 1 hour.  We would like a 
job that requests 30 minutes to start before a job that requests the 
full hour.
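
For illustration, the priority-related settings in slurm.conf are along
these lines (the values here are placeholders, not our actual
configuration), and none of them appears to take the requested wall time
into account:

PriorityType=priority/multifactor
PriorityWeightAge=1000
PriorityWeightFairshare=10000
PriorityWeightJobSize=1000
PriorityWeightPartition=1000
PriorityWeightQOS=1000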

Thanks in advance.

Steven.


[slurm-dev] Re: set maximum CPU usage per user

2016-10-21 Thread Steven Lo



Is MaxTRESPerUser a better option to use?
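
For example, something along these lines (a sketch only; the CPU count
is a placeholder):

sacctmgr modify qos normal set MaxTRESPerUser=cpu=200
sacctmgr show qos format=Name,MaxTRESPU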

Steven.


On 10/20/16 10:21 AM, Steven Lo wrote:



Hi Benjamin,

We have the following set in slurm.conf as you have suggested:

AccountingStorageEnforce=limits,qos
PriorityWeightQOS=1000

And we did

sacctmgr modify qos normal set Grpcpus=300

sacctmgr show qos format=GrpTRES
  GrpTRES
---------
  cpu=200


I see that when I submit a job requesting over 200 CPUs, the job gets 
blocked, which is good.
However, when I submit a job requesting just a few CPUs, the job gets 
blocked as well.


[slurm-testing]$ squeue
  JOBID PARTITION     NAME     USER ST   TIME  NODES NODELIST(REASON)
   2064     debug hello_pa      slo PD   0:00     10 (QOSGrpCpuLimit)
   2065  smallmem c_gth-dz jmcclain PD   0:00      6 (QOSGrpCpuLimit)



Do you know why it thinks these jobs are over the 200 CPU limit?  Is 
there another setting we need?
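
In case it helps, this is roughly how to check what a pending job
actually requests (job 2064 is the one from the squeue output above):

scontrol show job 2064 | grep -E 'NumNodes|NumCPUs|TRES'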



Thanks

Steven.


On 10/20/16 2:13 AM, Benjamin Redling wrote:

Hi Steven,

On 10/20/2016 00:22, Steven Lo wrote:

We have the attribute commented out:
#AccountingStorageEnforce=0

I think it is best to (re)visit "Accounting and Resource Limits":
http://slurm.schedmd.com/accounting.html

Right now I have no setup that needs accounting, but as far as I
currently understand you'll need AccountingStorageEnforce=limits,qos to
get your examples to work.
And just in case you didn't already set it:
for QOS (http://slurm.schedmd.com/qos.html)

PriorityWeightQOS" configuration parameter must be defined in the
slurm.conf file and assigned an integer value greater than zero.
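
A minimal sketch of the relevant slurm.conf lines (the weight value is
only an example, and PriorityWeightQOS only takes effect with the
multifactor priority plugin):

PriorityType=priority/multifactor
PriorityWeightQOS=1000
AccountingStorageEnforce=limits,qos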


What I am unsure of -- especially not knowing your config -- is whether
there are any other unmet dependencies.
It would be nice if somebody with real experience with accounting could
confirm this or give a pointer.

Regards,
Benjamin




[slurm-dev] Wrong behaviour of "--tasks-per-node" flag

2016-10-21 Thread Manuel Rodríguez Pascual
Hi all,

I am having the weirdest error ever.  I am pretty sure this is a bug. I
have reproduced the error on the latest Slurm commit (slurm 17.02.0-0pre2,
commit 406d3fe429ef6b694f30e19f69acf989e65d7509) and on the slurm 16.05.5
branch. It does NOT happen in slurm 15.08.12.

My cluster is composed of 8 nodes, each with 2 sockets of 8 cores.
The slurm.conf content is:

SchedulerType=sched/backfill
SchedulerPort=7321
SelectType=select/linear  #DEDICATED NODES
NodeName=acme[11-14,21-24] CPUs=16 Sockets=2 CoresPerSocket=8
ThreadsPerCore=1 State=UNKNOWN

I am running a simple hello world parallel code. It is submitted as "sbatch
--ntasks=X --tasks-per-node=Y myScript.sh". The problem is that, depending
on the values of X and Y, Slurm performs a wrong operation and returns an
error.
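
For reference, myScript.sh is essentially just a wrapper around srun; a
minimal sketch, with the binary name as a placeholder:

#!/bin/bash
# the task count and tasks-per-node come from the sbatch command line
srun ./hello_world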

"
sbatch --ntasks=8 --tasks-per-node=2 myScript.sh
srun: Warning: can't honor --ntasks-per-node set to 2 which doesn't match
the requested tasks 4 with the number of requested nodes 4. Ignoring
--ntasks-per-node.
"
Note that I requested 8 tasks, not 4, and I did not request any number
of nodes.  The same happens with
"
sbatch --ntasks=16 --tasks-per-node=2 myScript.sh
srun: Warning: can't honor --ntasks-per-node set to 2 which doesn't match
the requested tasks 8 with the number of requested nodes 8. Ignoring
--ntasks-per-node.
"
and
"
sbatch --ntasks=32 --tasks-per-node=4 myScript.sh
srun: Warning: can't honor --ntasks-per-node set to 4 which doesn't match
the requested tasks 8 with the number of requested nodes 8. Ignoring
--ntasks-per-node.
"
All other configurations work correctly and do not return any error.
In particular, I have tried the following (ntasks, tasks-per-node)
combinations with no problem:
(1,1)
(2,1), (2,2)
(4,1), (4,2), (4,4)
(8,1), (4,4), (8,8)
(16,4), (16,8), (16,16)
(32,8), (32,16)
(64,8), (64, 16)
(128, 16)

As said, this does not happen when executing the very same commands and
scripts with Slurm 15.08.12. Have you had any similar experiences? Is
this a bug, a desired behaviour, or am I doing something wrong?

Thanks for your help.

Best regards,



Manuel


[slurm-dev] Job dependency across other partitiions

2016-10-21 Thread Naoki SHIBATA (XD)
Hi Slurm Folks,

We want to set up a CPU+GPU cluster that has two partitions: a CPU-only
partition and a GPU-only partition.
The customer wants to run dependent jobs: a GPU-only job followed by a
CPU-only job.

Do you have a good solution for job dependencies across the two partitions?

If anyone has a good idea or suggestions, please let me know.
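
For what it is worth, the only mechanism I have found so far is a plain
sbatch dependency, which, as far as I understand, is not tied to a
partition. A minimal sketch, with placeholder partition and script names:

# submit the GPU job first and capture its job id with --parsable
GPU_JOB=$(sbatch --parsable --partition=gpu gpu_step.sh)

# the CPU-only job starts only after the GPU job completes successfully
sbatch --partition=cpu --dependency=afterok:${GPU_JOB} cpu_step.sh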

Regards,
Naoki.

Naoki SHIBATA