Hi,
we are trying to configure Maui on a separate Torque server, using the same
configuration that works on our current setup.
Maui reports that the node is overcommitted. What looks strange to us is that
checknode shows only DISK under "Configured Resources". On our working
installation we see PROCS, MEM, SWAP and DISK.
We have already tried several different simple reservations, so we think that
the problem may lie in the "Configured Resources":
[r...@lrms02 yum.repos.d]# checknode wn101
checking node wn101.lcg.cscs.ch
State: Idle (in current state for 00:14:24)
Configured Resources: DISK: 1M
Utilized Resources: SWAP: 272M
Dedicated Resources: [NONE]
Opsys: linux Arch: [NONE]
Speed: 1.00 Load: 0.000
Network: [DEFAULT]
Features: [lcgpro]
Attributes: [Batch]
Classes: [other 16:16]
Total Time: 4:08:27:30 Up: 18:13:13 (17.44%) Active: 00:00:00 (0.00%)
Reservations:
User '.0.0'(x1) -00:14:24 -> INFINITY ( INFINITY)
Blocked resour...@-00:14:24 Procs: 2/1 (200.00%)
User '.0.1'(x1) -00:14:24 -> INFINITY ( INFINITY)
Blocked resour...@-00:14:24 Procs: 2/1 (200.00%)
ALERT: node is overcommitted at time -00:14:24 (P: 12)
And here is the output from the working server:
[r...@ce01 ~]# checknode wn10
checking node wn10.lcg.cscs.ch
State: Running (in current state for 00:00:00)
Configured Resources: PROCS: 16 MEM: 31G SWAP: 76G DISK: 1M
Utilized Resources: [NONE]
Dedicated Resources: PROCS: 4 MEM: 8000M
Opsys: linux Arch: [NONE]
Speed: 1.00 Load: 3.300
Network: [DEFAULT]
Features: [lcgpro]
Attributes: [Batch]
Classes: [atlas 16:16][cms 13:16][lhcb 15:16][lcgadmin 16:16][ops 16:16][other 16:16][cscs 16:16]
Total Time: INFINITY Up: INFINITY (98.02%) Active: INFINITY (86.39%)
Reservations:
Job '3509992'(x1) -1:12:07:47 -> 23:52:13 (2:12:00:00)
Job '3514824'(x1) -15:37:52 -> 1:20:22:08 (2:12:00:00)
Job '3516656'(x1) -7:47:44 -> 2:04:12:16 (2:12:00:00)
Job '3516756'(x1) -7:07:35 -> 2:04:52:25 (2:12:00:00)
User 'atlas.0.0'(x1) -INFINITY -> INFINITY ( INFINITY)
Blocked resour...@00:00:00 Procs: 1/1 (100.00%)
User 'cms.0.0'(x1) -INFINITY -> INFINITY ( INFINITY)
Blocked resour...@00:00:00 Procs: 0/1 (0.00%)
Blocked resour...@2:04:52:25 Procs: 1/1 (100.00%)
User 'lhcb.0.0'(x1) -INFINITY -> INFINITY ( INFINITY)
Blocked resour...@00:00:00 Procs: 0/1 (0.00%)
Blocked resour...@23:52:13 Procs: 1/1 (100.00%)
User 'sam_and_sgm.0.0'(x1) -INFINITY -> INFINITY ( INFINITY)
Blocked resour...@00:00:00 Procs: 1/1 (100.00%)
JobList: 3509992,3514824,3516656,3516756
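To make the difference between the two nodes explicit, here is a minimal sketch that extracts the "Configured Resources" line from checknode output and lists the resource types present on the working node but absent on the new one. It is not part of our setup; the function names `configured_resources` and `missing_resources` are hypothetical, and the parsing assumes the `NAME: value` layout seen in the output above.

```python
import re


def configured_resources(checknode_output):
    """Return {resource: value} parsed from the 'Configured Resources' line.

    Assumes the 'NAME: value' layout shown in the checknode output above;
    a value such as '[NONE]' yields an empty dict.
    """
    for line in checknode_output.splitlines():
        if line.strip().startswith("Configured Resources:"):
            rest = line.split(":", 1)[1]
            return dict(re.findall(r"(\w+):\s*(\S+)", rest))
    return {}


def missing_resources(broken_output, working_output):
    """List resource types configured on the working node but not the broken one."""
    working = set(configured_resources(working_output))
    broken = set(configured_resources(broken_output))
    return sorted(working - broken)


# The two 'Configured Resources' lines copied from the outputs above.
broken = "Configured Resources: DISK: 1M"
working = "Configured Resources: PROCS: 16 MEM: 31G SWAP: 76G DISK: 1M"

print(missing_resources(broken, working))
```

For the two lines above this reports MEM, PROCS and SWAP as missing on wn101, which matches what we see by eye: Maui does not know about any processors on that node.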
On the new server we're using the following versions of Torque and Maui:
torque-server.x86_64 2.3.6-2cri.el5
maui-server.x86_64 3.2.6p21-snap.1234905291.5.el5
and on the working server:
maui-server.i386 3.2.6p21-snap.12247061
torque-server.i386 2.3.6-1cri.slc4
Has anybody run into this problem, or does anyone have a clue what's going on?
Cheers,
Peter