Quoting Chi Shin Hsu <[email protected]>:

> Hi,
>
> The slurm default resource policy is exclusive mode for all resource.
>
> Can I use other to control my resource? For example, one node can execute
> 1000 jobs.
>
> Because my tasks are not use much cpu resource but waste time.
>
> I have tried to use fast schedule and set 1000 cpus per node.
>
> It works! But is it the only way to do this?

This is the best way. You could also configure the partition's "Shared"
parameter, but that would have higher overhead.

> Another question, I can only execute 100 jobs per node, and the remained
> jobs stay in PD state because of resource.

There seems to be a bug preventing more than 100 jobs per node. See fix
here to increase limit to 10000 jobs:
https://github.com/SchedMD/slurm/commit/952401ec2a56da2b005e6870a0121ccb391736bc

> I can see there are non-allocated cpu resource by scontrol command.
>
> NodeName=node5 Arch=x86_64 CoresPerSocket=300
>    CPUAlloc=100 CPUErr=0 CPUTot=1200 Features=(null)
>    Gres=(null)
>    NodeAddr=node5 NodeHostName=node5
>    OS=Linux RealMemory=3000 Sockets=4
>    State=MIXED ThreadsPerCore=1 TmpDisk=0 Weight=1
>    BootTime=2012-05-29T22:42:24 SlurmdStartTime=2012-06-02T20:56:19
>    Reason=(null)
>
> I have changed MaxTasksPerNode to 2000 in the configure file, but it's not
> working.
>
> Thanks.
>
> Best regards,
>
> Shin
>
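
For illustration, here is a rough slurm.conf sketch of the two approaches
mentioned above. It is only a sketch under assumptions: the node values are
copied from the scontrol output quoted above, the partition name "batch" is
hypothetical, and the FastSchedule value and select plugin settings are
guesses at a setup like the one described, not the poster's actual
configuration. Check each parameter against the slurm.conf man page for your
version before using it.

```
# Approach 1: advertise more CPUs than the hardware has.
# Assumption: FastSchedule=2 so the node is not set DOWN when its real
# hardware reports fewer CPUs than configured here.
FastSchedule=2
# Assumption: a consumable-resource select plugin, as the MIXED node
# state in the scontrol output suggests.
SelectType=select/cons_res
SelectTypeParameters=CR_Core
# Node values taken from the quoted scontrol output (4 x 300 = 1200 CPUs).
NodeName=node5 Sockets=4 CoresPerSocket=300 ThreadsPerCore=1 RealMemory=3000
# "batch" is a hypothetical partition name.
PartitionName=batch Nodes=node5 Default=YES

# Approach 2 (alternative, higher overhead): keep the real CPU counts on
# the NodeName line and let jobs share resources through the partition's
# "Shared" parameter instead. The count after the colon is illustrative.
# PartitionName=batch Nodes=node5 Default=YES Shared=FORCE:1000
```

Neither variant by itself lifts the 100 jobs per node limit discussed above;
that still needs the fix in the linked commit.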
