Hello,
I have a problem with the integration of maui (3.2.6p14) and
torque (2.0.0p7).
The problem appears when I use resources_max.file in the queue
definition. Jobs submitted to such a queue are never started by the maui
scheduler. Everything works fine when the line with resources_max.file is
removed from the queue setup.
My configuration is:
queue definition
-------------------------
% qmgr -c 'l q SM5'
Queue SM5
queue_type = Execution
Priority = 6
total_jobs = 3
state_count = Transit:0 Queued:3 Held:0 Waiting:0 Running:0 Exiting:0
max_running = 200
from_route_only = True
resources_max.file = 1950mb <-----------!!!!!!
resources_max.nodect = 1
resources_max.pcput = 24:00:00
resources_max.pmem = 256mb
resources_max.pvmem = 350mb
resources_min.nodect = 1
resources_default.neednodes = 1:medium5
resources_default.nice = 15
resources_default.nodect = 1
resources_default.nodes = 1:medium5
resources_assigned.nodect = 0
max_user_run = 170
enabled = True
started = True
----------
maui checkjob status:
----------
% checkjob -v 1001
checking job 1001 (RM job '1001.h1farm03.desy.de')
State: Idle
Creds: user:bogdan group:h1 class:SM5 qos:DEFAULT
WallTime: 00:00:00 of 00:00:00
SubmitTime: Wed Apr 26 19:52:19
(Time Queued Total: 00:00:22 Eligible: 00:00:22)
Total Tasks: 1
Req[0] TaskCount: 1 Partition: ALL
Network: [NONE] Memory >= 0 Disk >= 1950M Swap >= 0
Opsys: [NONE] Arch: [NONE] Features: [medium5][1]
Exec: '' ExecSize: 0 ImageSize: 0
Dedicated Resources Per Task: PROCS: 1 MEM: 256M SWAP: 350M DISK: 1950M
NodeAccess: SHARED
NodeCount: 0
IWD: [NONE] Executable: [NONE]
Bypass: 0 StartCount: 0
PartitionMask: [ALL]
Flags: RESTARTABLE
PE: 15600.00 StartPriority: 0
job cannot run in partition DEFAULT (idle procs do not meet requirements : 0 of 1 procs found)
idle procs: 12 feasible procs: 0
Rejection Reasons: [Features : 4]
Detailed Node Availability Information:
h1bombeiros.desy.de rejected : Features
h1farm150.desy.de rejected : Features
h1farm152.desy.de rejected : Features
h1farm157.desy.de rejected : Features
--------
maui checknode shows:
--------
% checknode h1farm150.desy.de
checking node h1farm150.desy.de
State: Idle (in current state for 00:12:27)
Configured Resources: PROCS: 4 MEM: 1007M SWAP: 1734M DISK: 1M
Utilized Resources: [NONE]
Dedicated Resources: [NONE]
Opsys: linux Arch: farm
Speed: 1.00 Load: 0.000
Network: [DEFAULT]
Features: [short5][xshort5][medium5][long5][bigmedium5][biglong5][oo5][mc5]
Attributes: [Batch]
Classes: [SM5 4:4][bigmemM5 4:4][qoo 4:4][oo 4:4][SX5 4:4][mc5 4:4][BL5 4:4][SL5 4:4][BM5 4:4][SC5 4:4][mc2 0:4][qmc2 0:4][qmc5 4:4]
Total Time: 5:42:45 Up: 5:42:45 (100.00%) Active: 00:52:52 (15.42%)
Reservations:
NOTE: no reservations on node
------------
As far as I can see, the problem is that maui does not correctly
recognize the configured DISK resource of the nodes; more precisely, maui
reports a wrong value for the DISK resource.
checkjob shows:
% checkjob -v 1001
...
Dedicated Resources Per Task: PROCS: 1 MEM: 256M SWAP: 350M DISK: 1950M
                                                            ^^^^^^^^^^^
...
Here, DISK: 1950M comes from the queue setup (resources_max.file),
while checknode returns:
% checknode h1farm150.desy.de
...
Configured Resources: PROCS: 4 MEM: 1007M SWAP: 1734M DISK: 1M
                                                      ^^^^^^^^
...
I have no idea which parameter determines the value DISK: 1M.
So jobs cannot be started, because the DISK requirement (1950M) is higher
than the configured DISK resource (1M).
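To make the mismatch explicit, here is a small Python sketch that compares the two output lines quoted above. It is only an illustration: it assumes the line formats shown in the checkjob and checknode output in this mail, and the helper names parse_resources and unmet_resources are hypothetical, not part of maui or torque.

```python
import re


def parse_resources(line):
    """Extract NAME: VALUE pairs such as 'DISK: 1950M' from one output line.

    Assumption: sizes carry an 'M' suffix (megabytes) as in the output
    quoted in this mail; PROCS is a plain count.
    """
    pairs = re.findall(r"\b(PROCS|MEM|SWAP|DISK):\s*(\d+)M?", line)
    return {name: int(value) for name, value in pairs}


def unmet_resources(requested, configured):
    """Return {name: (requested, configured)} where the request is larger."""
    return {
        name: (need, configured.get(name, 0))
        for name, need in requested.items()
        if need > configured.get(name, 0)
    }


# Lines copied from the checkjob and checknode output above.
job_line = "Dedicated Resources Per Task: PROCS: 1 MEM: 256M SWAP: 350M DISK: 1950M"
node_line = "Configured Resources: PROCS: 4 MEM: 1007M SWAP: 1734M DISK: 1M"

shortfall = unmet_resources(parse_resources(job_line), parse_resources(node_line))
print(shortfall)  # {'DISK': (1950, 1)}
```

Only DISK falls short: PROCS, MEM and SWAP requested per task are all within what the node reports as configured.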
I would be grateful for any information about maui patches that remove
this behavior, or for hints on how to avoid this conflict with file
limits.
Best Regards,
Bogdan
_______________________________________________
mauiusers mailing list
[email protected]
http://www.supercluster.org/mailman/listinfo/mauiusers