When I submit serial jobs to the cluster through Torque + Maui, they always run
on the same node until all the CPUs of that node are in use.
For example, I use the "lx" account to submit the serial job "dfdf". Every node
has two CPUs, and at the start every CPU is free, with no jobs running.
[EMAIL PROTECTED] ~]$ qsub -l nodes=1:ppn=1 dfdf
101.console
[EMAIL PROTECTED] ~]$ qstat -an
console:
                                                                   Req'd  Req'd   Elap
Job ID               Username Queue    Jobname    SessID NDS   TSK Memory Time  S Time
-------------------- -------- -------- ---------- ------ ----- --- ------ ----- - -----
101.console          lx       dpool    dfdf         4012     1  --     --    -- R    --
   c1501/0
[EMAIL PROTECTED] ~]$ qsub -l nodes=1:ppn=1 dfdf
102.console
[EMAIL PROTECTED] ~]$ qstat -an
console:
                                                                   Req'd  Req'd   Elap
Job ID               Username Queue    Jobname    SessID NDS   TSK Memory Time  S Time
-------------------- -------- -------- ---------- ------ ----- --- ------ ----- - -----
101.console          lx       dpool    dfdf         4012     1  --     --    -- R    --
   c1501/0
102.console          lx       dpool    dfdf         4102     1  --     --    -- R    --
   c1501/1
[EMAIL PROTECTED] ~]$ qsub -l nodes=1:ppn=1 dfdf
103.console
[EMAIL PROTECTED] ~]$ qstat -an
console:
                                                                   Req'd  Req'd   Elap
Job ID               Username Queue    Jobname    SessID NDS   TSK Memory Time  S Time
-------------------- -------- -------- ---------- ------ ----- --- ------ ----- - -----
101.console          lx       dpool    dfdf         4012     1  --     --    -- R    --
   c1501/0
102.console          lx       dpool    dfdf         4012     1  --     --    -- R    --
   c1501/1
103.console          lx       dpool    dfdf         3543     1  --     --    -- R    --
   c1503/0
As you can see, the first two jobs are running on the same node. This is not
load balanced: I want job 102 to run on a different node, not on c1501. Only
after all the CPUs of node c1501 are in use does job 103 start on another
node, c1503.
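
To make the imbalance concrete, here is a minimal Python sketch that tallies how many jobs sit on each node, using the exec-host entries from the last qstat -an listing. The function name jobs_per_node is hypothetical, and the handling of '+'-joined slots is an assumption about how Torque lists multi-slot jobs; it is only an illustration, not part of Torque or Maui.

```python
from collections import Counter


def jobs_per_node(exec_hosts):
    """Count occupied slots per node from entries such as 'c1501/0'."""
    counts = Counter()
    for entry in exec_hosts:
        # Assumption: a multi-slot job joins its slots with '+',
        # e.g. 'c1501/0+c1501/1'.
        for slot in entry.split("+"):
            node = slot.split("/")[0].strip()
            if node:
                counts[node] += 1
    return counts


# Exec-host lines copied from the last qstat -an listing.
observed = ["c1501/0", "c1501/1", "c1503/0"]
counts = jobs_per_node(observed)
for node, n in sorted(counts.items()):
    print(f"{node}: {n} job(s)")
```

With the three jobs shown, this reports two jobs on c1501 and one on c1503, whereas a balanced placement would put each of the first two jobs on a different node.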
I have set "node_pack = False" on the Torque server, but it does not help.
I also added the following lines to maui.cfg, but it still does not work:
NODEALLOCATIONPOLICY MAXBALANCE
NODEACCESSPOLICY SINGLEUSER
I am very disappointed. What can I do?
This is my server's configuration:
#
# Create queues and set their attributes.
#
#
# Create and define queue dpool
#
create queue dpool
set queue dpool queue_type = Execution
set queue dpool max_queuable = 50
set queue dpool max_running = 50
set queue dpool resources_default.neednodes = dpool
set queue dpool enabled = True
set queue dpool started = True
#
# Set server attributes.
#
set server scheduling = True
set server acl_host_enable = False
set server managers = [EMAIL PROTECTED]
set server operators = [EMAIL PROTECTED]
set server default_queue = dpool
set server log_events = 127
set server mail_from = adm
set server scheduler_iteration = 300
set server node_check_rate = 150
set server tcp_timeout = 6
set server node_pack = False
set server torque_version = 2.0.0p8
_______________________________________________
mauiusers mailing list
[email protected]
http://www.supercluster.org/mailman/listinfo/mauiusers