I used 'checknode' to check all the nodes. Every node is idle except two,
c1502 and c1506, even though pbs_mom is running on both of them. When I
print the node info with qmgr, it shows these nodes as free, but checknode
shows them as down, and no jobs can be scheduled on them. I am quite
puzzled.
I have restarted pbs_mom on c1502 and c1506, and pbs_server, several
times, but nothing changed. The output for both nodes is below.
[EMAIL PROTECTED] bin]# qmgr
Max open servers: 4
Qmgr: p n c1502
#
# Create nodes and set their properties.
#
#
# Create and define node c1502
#
# create node c1502 # unsupported operation
set node c1502 state = free
set node c1502 properties = dpool
set node c1502 ntype = cluster
set node c1502 status = opsys=linux
set node c1502 status += uname=Linux c1502 2.6.9-5.ELsmp #1 SMP Wed Jan 5 19:30:39 EST 2005 i686
set node c1502 status += sessions=? 0
set node c1502 status += nsessions=? 0
set node c1502 status += nusers=0
set node c1502 status += idletime=0
set node c1502 status += totmem=3066284kb
set node c1502 status += availmem=3040300kb
set node c1502 status += physmem=1034676kb
set node c1502 status += ncpus=2
set node c1502 status += loadave=0.00
set node c1502 status += message=ERROR: torque spool filesystem full
set node c1502 status += netload=3071813
set node c1502 status += state=free
set node c1502 status += jobs=? 0
set node c1502 status += rectime=1165565461
[EMAIL PROTECTED] bin]# checknode c1502
checking node c1502
State: Down (in current state for 00:00:00)
Configured Resources: PROCS: 2 MEM: 1010M SWAP: 2969M DISK: 1M
Utilized Resources: PROCS: 2
Dedicated Resources: [NONE]
Opsys: linux Arch: [NONE]
Speed: 1.00 Load: 0.000
Network: [DEFAULT]
Features: [dpool]
Attributes: [Batch]
Classes: [dpool 2:2]
Total Time: 21:07:23:55 Up: 00:00:00 (0.00%) Active: 00:00:00 (0.00%)
Reservations:
NOTE: no reservations on node
Max open servers: 4
Qmgr: p n c1506
#
# Create nodes and set their properties.
#
#
# Create and define node c1506
#
# create node c1506 # unsupported operation
set node c1506 state = free
set node c1506 properties = dpool
set node c1506 ntype = cluster
set node c1506 status = opsys=linux
set node c1506 status += uname=Linux c1506 2.6.9-5.ELsmp #1 SMP Wed Jan 5 19:30:39 EST 2005 i686
set node c1506 status += sessions=2864
set node c1506 status += nsessions=1
set node c1506 status += nusers=1
set node c1506 status += idletime=5985
set node c1506 status += totmem=3066284kb
set node c1506 status += availmem=2927328kb
set node c1506 status += physmem=1034676kb
set node c1506 status += ncpus=2
set node c1506 status += loadave=0.00
set node c1506 status += message=ERROR: torque spool filesystem full
set node c1506 status += netload=51812000
set node c1506 status += state=free
set node c1506 status += jobs=? 0
set node c1506 status += rectime=1165565680
[EMAIL PROTECTED] bin]# checknode c1506
checking node c1506
State: Down (in current state for 00:00:00)
Configured Resources: PROCS: 2 MEM: 1010M SWAP: 2858M DISK: 1M
Utilized Resources: PROCS: 2
Dedicated Resources: [NONE]
Opsys: linux Arch: [NONE]
Speed: 1.00 Load: 0.000
Network: [DEFAULT]
Features: [dpool]
Attributes: [Batch]
Classes: [dpool 2:2]
Total Time: 9:17:05:10 Up: 00:00:00 (0.00%) Active: 00:00:00 (0.00%)
Reservations:
NOTE: no reservations on node
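
To make the two views easier to compare, here is a rough sketch in Python that splits the "p n" output from qmgr into the node attributes and the status fields reported by pbs_mom. It is only an illustration: the function name parse_qmgr_node and the shortened SAMPLE text are my own, the sample lines are copied from the c1502 output above, and it assumes each "set node" entry sits on a single line. It surfaces the message field next to the state, which may be worth looking at alongside the checknode result.

```python
import re

# Shortened sample copied from the qmgr "p n c1502" output above.
SAMPLE = """\
set node c1502 state = free
set node c1502 properties = dpool
set node c1502 status = opsys=linux
set node c1502 status += loadave=0.00
set node c1502 status += message=ERROR: torque spool filesystem full
set node c1502 status += state=free
"""

# Matches "set node <name> <attribute> = <value>" and the "+=" variant.
LINE_RE = re.compile(r"^set node (\S+) (\w+) \+?= (.*)$")


def parse_qmgr_node(text):
    """Return (attrs, status) dicts built from qmgr 'set node' lines.

    Hypothetical helper: comment lines and anything that does not look
    like a 'set node' line are skipped.
    """
    attrs, status = {}, {}
    for line in text.splitlines():
        match = LINE_RE.match(line.strip())
        if not match:
            continue
        _node, attr, value = match.groups()
        if attr == "status":
            # Status entries are themselves key=value pairs.
            key, _, val = value.partition("=")
            status[key] = val
        else:
            attrs[attr] = value
    return attrs, status


attrs, status = parse_qmgr_node(SAMPLE)
print("server state:", attrs["state"])
print("mom message: ", status.get("message"))
```

Running it against the full output of each node (with wrapped lines such as the uname entry joined first) prints the state the server holds and any message the node itself reported.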