Hi,

We are trying to configure Maui on a separate TORQUE server, using the same
configuration that works on our current setup.
Maui reports that the node is overcommitted. What looks strange to us is that
checknode shows only DISK under "Configured Resources". On our working
installation we see PROCS, MEM, SWAP and DISK.
We have already tried several simple reservations, so we think the problem
may lie in the "Configured Resources". Here is the output from the new server:


[r...@lrms02 yum.repos.d]# checknode wn101


checking node wn101.lcg.cscs.ch

State:      Idle  (in current state for 00:14:24)
Configured Resources: DISK: 1M
Utilized   Resources: SWAP: 272M
Dedicated  Resources: [NONE]
Opsys:         linux  Arch:      [NONE]
Speed:      1.00  Load:       0.000
Network:    [DEFAULT]
Features:   [lcgpro]
Attributes: [Batch]
Classes:    [other 16:16]

Total Time: 4:08:27:30  Up: 18:13:13 (17.44%)  Active: 00:00:00 (0.00%)

Reservations:
  User '.0.0'(x1)  -00:14:24 ->   INFINITY (  INFINITY)
    Blocked resour...@-00:14:24   Procs: 2/1 (200.00%)
  User '.0.1'(x1)  -00:14:24 ->   INFINITY (  INFINITY)
    Blocked resour...@-00:14:24   Procs: 2/1 (200.00%)
ALERT:  node is overcommitted at time -00:14:24 (P: 12)


And here is the output from the working server:

[r...@ce01 ~]# checknode wn10


checking node wn10.lcg.cscs.ch

State:   Running  (in current state for 00:00:00)
Configured Resources: PROCS: 16  MEM: 31G  SWAP: 76G  DISK: 1M
Utilized   Resources: [NONE]
Dedicated  Resources: PROCS: 4  MEM: 8000M
Opsys:         linux  Arch:      [NONE]
Speed:      1.00  Load:       3.300
Network:    [DEFAULT]
Features:   [lcgpro]
Attributes: [Batch]
Classes:    [atlas 16:16][cms 13:16][lhcb 15:16][lcgadmin 16:16][ops 16:16][other 16:16][cscs 16:16]

Total Time:   INFINITY  Up:   INFINITY (98.02%)  Active:   INFINITY (86.39%)

Reservations:
  Job '3509992'(x1)  -1:12:07:47 -> 23:52:13 (2:12:00:00)
  Job '3514824'(x1)  -15:37:52 -> 1:20:22:08 (2:12:00:00)
  Job '3516656'(x1)  -7:47:44 -> 2:04:12:16 (2:12:00:00)
  Job '3516756'(x1)  -7:07:35 -> 2:04:52:25 (2:12:00:00)
  User 'atlas.0.0'(x1)   -INFINITY ->   INFINITY (  INFINITY)
    Blocked resour...@00:00:00    Procs: 1/1 (100.00%)
  User 'cms.0.0'(x1)   -INFINITY ->   INFINITY (  INFINITY)
    Blocked resour...@00:00:00    Procs: 0/1 (0.00%)
    Blocked resour...@2:04:52:25  Procs: 1/1 (100.00%)
  User 'lhcb.0.0'(x1)   -INFINITY ->   INFINITY (  INFINITY)
    Blocked resour...@00:00:00    Procs: 0/1 (0.00%)
    Blocked resour...@23:52:13    Procs: 1/1 (100.00%)
  User 'sam_and_sgm.0.0'(x1)   -INFINITY ->   INFINITY (  INFINITY)
    Blocked resour...@00:00:00    Procs: 1/1 (100.00%)
JobList:  3509992,3514824,3516656,3516756
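
To make the difference between the two outputs explicit, here is a minimal sketch that extracts the "Configured Resources" line from each and lists the resources missing on the new node. It assumes the checknode line format shown above; the function name configured_resources is only illustrative and is not part of Maui or TORQUE.

```python
import re

def configured_resources(checknode_output):
    """Return the 'Configured Resources' entries of checknode output as a dict.

    Assumes the line format seen in the outputs above, e.g.
    'Configured Resources: PROCS: 16  MEM: 31G  SWAP: 76G  DISK: 1M'.
    """
    for line in checknode_output.splitlines():
        if line.startswith("Configured Resources:"):
            fields = line.split(":", 1)[1]
            # Each entry is a NAME: VALUE pair; '[NONE]' yields no pairs.
            return dict(re.findall(r"(\w+):\s*(\S+)", fields))
    return {}

# The two lines are copied from the checknode outputs above.
broken = "Configured Resources: DISK: 1M"
working = "Configured Resources: PROCS: 16  MEM: 31G  SWAP: 76G  DISK: 1M"

# Resources the working node reports that the new node does not.
missing = sorted(set(configured_resources(working)) - set(configured_resources(broken)))
print(missing)
```

For the two lines above this reports MEM, PROCS and SWAP as missing on wn101, which matches what we see by eye.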



We're using the following versions of TORQUE and Maui on the new server:
        torque-server.x86_64            2.3.6-2cri.el5
        maui-server.x86_64              3.2.6p21-snap.1234905291.5.el5

and on the working server:
        maui-server.i386                3.2.6p21-snap.12247061
        torque-server.i386      2.3.6-1cri.slc4

Has anybody run into this problem, or does anyone have a clue what's going on?

Cheers,

  Peter