Hi all,
first of all, let me apologise for cross-posting. I've already asked
this question on the torque mailing list but got no reply. As this is
torque/Maui related, I believe asking it here is not off-topic.
We were testing how to limit and request resource usage with torque.
The documentation, and some docs I found on the net, say that defining
resources_max at queue level is enough for limiting resource usage:
* page 62 of the torque doc v3.0.0:
resources_max
Specifies the maximum resource limits for jobs submitted to the queue
So, we did something like:
resources_max.vmem=6gb
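In qmgr terms that was roughly the following (the queue name 'batch' is
just a placeholder for our real queue):

  qmgr -c "set queue batch resources_max.vmem = 6gb"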
Also, after configuring 'size[fs=/home]' on all nodes, we added a
default resource request (free disk space) at the submit filter level:
line="#PBS -l file=30gb -c n"
From the qsub man page:
-l resource_list
Defines the resources that are required by the job
and establishes a limit to the amount of resource that can be consumed.
Jobs were submitted with:
Resource_List.file = 30gb
Resource_List.neednodes = 1
Resource_List.nodect = 1
Resource_List.nodes = 1
Resource_List.pvmem = 6000mb
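For reference, an equivalent explicit request on the qsub command line
would be something like the following (job.sh is just a placeholder for
a real job script):

  qsub -l file=30gb -l pvmem=6000mb job.sh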
This seemed to work fine, but after some jobs started running we
noticed that nodes were not running all the jobs they were supposed to,
even though they were in the free state.
For example, a node with 24gb of mem (PHYS+SWAP), using only 12gb of
mem, did not run more than 4 jobs when 8 was its limit. So, if it had
free resources, why was it not running more jobs?
After some debugging we found the cause. Maui was reserving 6gb of mem
for each job, so 4 jobs * 6gb of mem = 24gb. All the mem was reserved
for those 4 jobs and the node was not selected to run more.
from checknode:
[...]
Configured Resources: PROCS: 8 MEM: 15G SWAP: 23G DISK: 122G
Utilized Resources: SWAP: 5048M DISK: 35G
Dedicated Resources: PROCS: 4 SWAP: 23G DISK: 30G
[...]
And we suppose something similar would happen with the DISK resource if
more jobs started (yep, we have some nodes with low disk space).
So, did we understand the resources_max parameter and the qsub -l
option correctly? Why is Maui reserving resources like that?
Maybe this question should go to the maui list, but to avoid
double-posting (yet): how can we avoid Maui reserving those resources?
How are other admins limiting vmem usage per job?
How can we request that a certain amount of disk space be available?
Many thanks in advance, and especially to those who read this far ;-)
Cheers,
Arnau