I tried setting node properties in $PBS_HOME/server_priv/nodes.
Then I submitted some parallel jobs: some ran on nodes assigned to the parallel queue,
others ran on nodes assigned to the serial queue. Serial jobs behave the same way.
Accounting logs:
04/03/2006 09:35:56;E;3.i159.ascc;user=wzlu group=wzlu jobname=cpi queue=parallel ctime=1144028142 qtime=1144028142 etime=1144028142 start=1144028142 exec_host=i153.ascc/0+i152.ascc/0 Resource_List.neednodes=2 Resource_List.nodect=2 Resource_List.nodes=2 session=0 end=1144028156 Exit_status=271 resources_used.cput=00:00:00 resources_used.mem=0kb resources_used.vmem=0kb resources_used.walltime=00:00:14
(This parallel job ran on i153.ascc and i152.ascc, which are
defined in the serial queue.)
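For checking many records at once, here is a minimal Python sketch that pulls the host list out of the exec_host field. It assumes the record layout seen above (date;type;jobid; followed by space-separated key=value pairs); the function name parse_exec_hosts is my own and not part of torque, and the sample record is a shortened copy of the one above.

```python
def parse_exec_hosts(record):
    """Return (job_id, queue, hosts) for one accounting record line."""
    # Layout assumed from the record above: date;type;jobid;key=value ...
    _date, _rtype, job_id, attrs = record.strip().split(";", 3)
    fields = {}
    for token in attrs.split():
        key, sep, value = token.partition("=")
        if sep:
            fields[key] = value
    # exec_host looks like "i153.ascc/0+i152.ascc/0"; drop the "/N" suffix.
    raw = fields.get("exec_host", "")
    hosts = [h.split("/")[0] for h in raw.split("+") if h]
    return job_id, fields.get("queue"), hosts


# Shortened copy of the first accounting record (job 3).
record = (
    "04/03/2006 09:35:56;E;3.i159.ascc;user=wzlu group=wzlu jobname=cpi "
    "queue=parallel exec_host=i153.ascc/0+i152.ascc/0 Resource_List.nodes=2"
)
print(parse_exec_hosts(record))
```

Running it over the whole accounting file line by line would show at a glance which jobs in queue=parallel landed on serial nodes.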
maui.log has the following messages:
04/03 14:14:51 MPBSNodeUpdate(i154.ascc,i154.ascc,Idle,base)
04/03 14:14:51 MPBSLoadQueueInfo(base,i154.ascc,SC)
04/03 14:14:51 INFO: queue 'batch' started state set to True
04/03 14:14:51 INFO: class to node not mapping enabled for queue 'batch' adding class to all nodes
04/03 14:14:51 INFO: queue 'serial' started state set to True
04/03 14:14:51 INFO: class to node not mapping enabled for queue 'serial' adding class to all nodes
04/03 14:14:51 INFO: queue 'parallel' started state set to True
04/03 14:14:51 INFO: class to node not mapping enabled for queue 'parallel' adding class to all nodes
When I add "#PBS -l nodes=2:parallel" to the job script, all the parallel jobs
run on the nodes of the parallel queue.
Accounting logs:
04/03/2006 09:55:40;E;4.i159.ascc;user=wzlu group=wzlu jobname=cpi queue=parallel ctime=1144029318 qtime=1144029318 etime=1144029318 start=1144029319 exec_host=i156.ascc/0+i155.ascc/0 Resource_List.neednodes=2:parallel Resource_List.nodect=2 Resource_List.nodes=2:parallel session=0 end=1144029340 Exit_status=0 resources_used.cput=00:00:00 resources_used.mem=616kb resources_used.vmem=5276kb resources_used.walltime=00:00:22
(This parallel job ran on i156.ascc and i155.ascc, which are
defined in the parallel queue.)
Adding "#PBS -l nodes=2:parallel" to every job script is inconvenient for most users.
I think something is missing in my configuration.
Any ideas? Thanks.
My environment is:
OS - RHEL 4 WS 64 bit
torque - 2.0.0p8
maui - 3.2.6p14
serial queue - i151.ascc i152.ascc i153.ascc i154.ascc
parallel queue - i155.ascc i156.ascc
The torque configuration is as follows:
#
# Create queues and set their attributes.
#
#
# Create and define queue batch
#
create queue batch
set queue batch queue_type = Route
set queue batch route_destinations = serial
set queue batch route_destinations += parallel
set queue batch enabled = True
set queue batch started = True
#
# Create and define queue serial
#
create queue serial
set queue serial queue_type = Execution
set queue serial resources_max.nodect = 1
set queue serial resources_default.nodect = 1
set queue serial resources_default.nodes = 1:ppn=1
set queue serial enabled = True
set queue serial started = True
#
# Create and define queue parallel
#
create queue parallel
set queue parallel queue_type = Execution
set queue parallel resources_max.nodect = 64
set queue parallel resources_min.nodect = 2
set queue parallel resources_default.nodect = 2
set queue parallel resources_default.nodes = 2:ppn=1
set queue parallel enabled = True
set queue parallel started = True
#
# Set server attributes.
#
set server scheduling = True
set server acl_host_enable = False
set server acl_user_enable = False
set server default_queue = batch
set server log_events = 511
set server mail_from = adm
set server query_other_jobs = True
set server resources_default.neednodes = 1
set server resources_default.nodes = 1:ppn=1
set server scheduler_iteration = 600
set server node_check_rate = 150
set server tcp_timeout = 6
set server default_node = 1
set server pbs_version = 2.0.0p8-1cri
nodes file ($PBS_HOME/server_priv/nodes):
i151.ascc serial
i152.ascc serial
i153.ascc serial
i154.ascc serial
i155.ascc parallel
i156.ascc parallel
i157.ascc parallel
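To cross-check a job's exec_host list against this nodes file, a Python sketch along these lines could help. It assumes each line is a hostname followed by whitespace-separated properties, as in the listing above; tokens containing "=" (such as np=2) are treated as attributes and skipped. Both helper names are hypothetical.

```python
def load_node_properties(text):
    """Map hostname -> set of properties from nodes-file text."""
    props = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        host, *rest = line.split()
        # Tokens containing "=" (e.g. np=2) are attributes, not properties.
        props[host] = {tok for tok in rest if "=" not in tok}
    return props


def hosts_missing_property(hosts, wanted, props):
    """Return the hosts that do not carry the wanted property."""
    return [h for h in hosts if wanted not in props.get(h, set())]


nodes_text = """\
i151.ascc serial
i152.ascc serial
i153.ascc serial
i154.ascc serial
i155.ascc parallel
i156.ascc parallel
i157.ascc parallel
"""
props = load_node_properties(nodes_text)
# Hosts taken from the exec_host fields of jobs 3 and 4.
print(hosts_missing_property(["i153.ascc", "i152.ascc"], "parallel", props))
print(hosts_missing_property(["i156.ascc", "i155.ascc"], "parallel", props))
```

For job 3 both hosts are reported as lacking the parallel property; for job 4 the list is empty.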
maui.cfg
# maui.cfg 3.2.6p14
SERVERHOST i159.ascc
# primary admin must be first in list
ADMIN1 root
# Resource Manager Definition
RMCFG[base] TYPE=PBS
# Allocation Manager Definition
#AMCFG[bank] TYPE=NONE
# full parameter docs at http://clusterresources.com/mauidocs/a.fparameters.html
# use the 'schedctl -l' command to display current configuration
RMPOLLINTERVAL 00:00:30
SERVERPORT 42559
SERVERMODE NORMAL
# Admin: http://clusterresources.com/mauidocs/a.esecurity.html
LOGFILE maui.log
LOGFILEMAXSIZE 10000000
LOGLEVEL 3
# Job Priority: http://clusterresources.com/mauidocs/5.1jobprioritization.html
QUEUETIMEWEIGHT 1
# FairShare: http://clusterresources.com/mauidocs/6.3fairshare.html
#FSPOLICY PSDEDICATED
#FSDEPTH 7
#FSINTERVAL 86400
#FSDECAY 0.80
# Throttling Policies: http://clusterresources.com/mauidocs/6.2throttlingpolicies.html
# NONE SPECIFIED
# Backfill: http://clusterresources.com/mauidocs/8.2backfill.html
BACKFILLPOLICY FIRSTFIT
RESERVATIONPOLICY CURRENTHIGHEST
#NODEALLOCATIONPOLICY MINRESOURCE
NODEALLOCATIONPOLICY CPULOAD
DEFERTIME 0
NODECFG[i151.ascc] PARTITION=SERIAL
NODECFG[i152.ascc] PARTITION=SERIAL
NODECFG[i153.ascc] PARTITION=SERIAL
NODECFG[i154.ascc] PARTITION=SERIAL
NODECFG[i155.ascc] PARTITION=PARALLEL
NODECFG[i156.ascc] PARTITION=PARALLEL
CLASSCFG[serial] MAXJOBPERUSER=4
CLASSCFG[parallel] MAXJOBPERUSER=4
CLASSCFG[parallel] MAXPROCPERUSER=16
USERCFG[DEFAULT] MAXJOB=6 MAXPROC=20
SRPARTITION[serial] SERIAL
SRTASKCOUNT[serial] 4
SRRESOURCES[serial] PROCS=-1
SRCLASSLIST[serial] serial
SRPERIOD[serial] INFINITY
SRPARTITION[parallel] PARALLEL
SRTASKCOUNT[parallel] 2
SRRESOURCES[parallel] PROCS=-1
SRCLASSLIST[parallel] parallel
SRPERIOD[parallel] INFINITY
2006/3/31, Bas van der Vlies <[EMAIL PROTECTED]>:
I do not use PARTITIONS, but I solved the problem by setting node
properties, e.g.:
node1 serial
node2 serial
node3 parallel
node4 parallel
In torque, create the queues:
parallel...
set queue q_parallel resources_default.neednodes = parallel
set queue q_parallel resources_default.nodect = 2
...
serial...
set queue q_serial resources_default.neednodes = serial
set queue q_serial resources_max.nodect = 1
set queue q_serial resources_default.ncpus = 1
set queue q_serial resources_default.nodect = 1
set queue q_serial resources_default.nodes = 1
--
********************************************************************
* *
* Bas van der Vlies e-mail: [EMAIL PROTECTED] *
* SARA - Academic Computing Services phone: +31 20 592 8012 *
* Kruislaan 415 fax: +31 20 6683167 *
* 1098 SJ Amsterdam *
* *
********************************************************************
