Hello,
On Thu, Jan 22, 2009 at 7:45 PM, Craig West <[email protected]> wrote:
> Are you able to run more than one job on a single node at all?
Yes, that works.
> Are you also specifying RAM or other restrictions in the qsub, or default
> settings for the queue in torque?
Yes, the requested resources of the jobs (as printed by qstat -f) are
as follows:
*6 cpu job:
queue = default
Resource_List.neednodes = 1:ppn=6
Resource_List.nodect = 1
Resource_List.nodes = 1:ppn=6
Resource_List.vmem = 11000mb
Resource_List.walltime = 96:00:00
*1 cpu job:
queue = default
Resource_List.cput = 672:00:00
Resource_List.neednodes = 1:ppn=1
Resource_List.nodect = 1
Resource_List.nodes = 1:ppn=1
Resource_List.pcput = 672:00:00
Resource_List.vmem = 7168mb
Resource_List.walltime = 672:00:00
A checknode on a node running a 6 cpu job shows:
State: Running (in current state for 00:00:00)
Configured Resources: PROCS: 8 MEM: 31G SWAP: 25G DISK: 1M
Utilized Resources: [NONE]
Dedicated Resources: PROCS: 6 SWAP: 10G
Opsys: linux Arch: [NONE]
Speed: 1.00 Load: 5.080
Location: Partition: DEFAULT Frame/Slot: 1/1
Network: [DEFAULT]
Features: [NONE]
Attributes: [Batch]
Classes: [default 2:8][small 8:8]
Total Time: INFINITY Up: INFINITY (99.71%) Active: INFINITY (53.70%)
Reservations:
Job '38882'(x6) -5:10:00 -> 3:18:50:00 (4:00:00:00)
JobList: 38882
A checknode on a node running two of the 1 cpu jobs shows:
State: Running (in current state for 00:00:00)
Configured Resources: PROCS: 8 MEM: 31G SWAP: 23G DISK: 1M
Utilized Resources: [NONE]
Dedicated Resources: PROCS: 2 SWAP: 14G
Opsys: linux Arch: [NONE]
Speed: 1.00 Load: 2.000
Location: Partition: DEFAULT Frame/Slot: 1/1
Network: [DEFAULT]
Features: [NONE]
Attributes: [Batch]
Classes: [default 6:8][small 8:8]
Total Time: INFINITY Up: INFINITY (99.45%) Active: INFINITY (55.08%)
Reservations:
Job '39090'(x1) -5:19:39:16 -> 22:04:20:44 (28:00:00:00)
Job '39091'(x1) -5:19:39:15 -> 22:04:20:45 (28:00:00:00)
JobList: 39090,39091
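As a sanity check on the numbers above, the dedicated SWAP values line up with the vmem requests: 2 x 7168mb is exactly 14G, and 11000mb is about 10.7G, which checknode appears to display as 10G. Below is a minimal sketch of that arithmetic in Python. It assumes that Maui books each job's vmem request as dedicated swap on the node, which is only what the output suggests and not something I have confirmed for maui 3.2.6p19. The function names are made up for illustration.

```python
# Arithmetic sketch only. ASSUMPTION: Maui counts each job's vmem request
# as dedicated SWAP on the node (inferred from the checknode output, not
# confirmed). All function names here are hypothetical helpers.

MB_PER_GB = 1024


def dedicated_swap_gb(vmem_requests_mb):
    """Sum the vmem requests (in MB) of the jobs on a node, in GB."""
    return sum(vmem_requests_mb) / MB_PER_GB


def free_swap_gb(configured_swap_gb, vmem_requests_mb):
    """Swap left on a node after subtracting the dedicated swap."""
    return configured_swap_gb - dedicated_swap_gb(vmem_requests_mb)


def fits(configured_swap_gb, running_vmem_mb, new_vmem_mb):
    """True if a new job's vmem request fits into the remaining swap."""
    needed_gb = new_vmem_mb / MB_PER_GB
    return needed_gb <= free_swap_gb(configured_swap_gb, running_vmem_mb)


# Node with two 1 cpu jobs: SWAP 23G configured, 2 x 7168mb requested.
two_small = [7168, 7168]
print(dedicated_swap_gb(two_small))    # 14.0, matches "SWAP: 14G"
print(free_swap_gb(23, two_small))     # 9.0

# Node with one 6 cpu job: SWAP 25G configured, 11000mb requested.
one_big = [11000]
print(int(dedicated_swap_gb(one_big))) # 10, matches "SWAP: 10G"

# Would an additional job fit, swap-wise, under the assumption above?
print(fits(23, two_small, 11000))      # 6 cpu job next to two 1 cpu jobs
print(fits(25, one_big, 7168))         # 1 cpu job next to the 6 cpu job
```

Under that assumption, the node with two 1 cpu jobs has 9G of swap left, which is less than the roughly 10.7G a 6 cpu job requests, while the node with the 6 cpu job still has room for a 7G request. This is arithmetic on the figures shown, not a statement about how the scheduler actually decides.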
> What version of Maui and Torque are you running?
torque 2.1.8 and maui 3.2.6p19.
> A copy of your maui.cfg might help.
An excerpt from our maui.cfg:
RMPOLLINTERVAL 00:00:30
SERVERMODE NORMAL
RMCFG[base] TYPE=PBS
LOGFILE maui.log
LOGFILEMAXSIZE 10000000
LOGLEVEL 3
QUEUETIMEWEIGHT 1
FSPOLICY DEDICATEDPS
FSDEPTH 7
FSINTERVAL 86400
FSDECAY 0.80
BACKFILLPOLICY FIRSTFIT
RESERVATIONPOLICY CURRENTHIGHEST
NODEALLOCATIONPOLICY MINRESOURCE
USERCFG[DEFAULT] FSTARGET=20.0+
FSWEIGHT 10
FSUSERWEIGHT 100
ENFORCERESOURCELIMITS ON
RESOURCELIMITPOLICY[0] MEM:ALWAYS:CANCEL
SRCFG[small] TASKCOUNT=1 RESOURCES=PROCS:4,MEM:16384
SRCFG[small] HOSTLIST=cluster1.local
SRCFG[small] PERIOD=INFINITY
SRCFG[small] TIMELIMIT=1:00:00
SRCFG[small] CLASSLIST=small
The general idea behind this config is to have two queues: a default
one that spans 32 nodes, and one for small jobs (with a maximum
walltime of one hour) that run on a single dedicated host.
Greetings,
Lech
_______________________________________________
mauiusers mailing list
[email protected]
http://www.supercluster.org/mailman/listinfo/mauiusers