Hello,

We have a Rocks cluster with two nodes, each with 1 GPU and 32 CPUs. I
have defined a separate queue (gpu.q) containing these two hosts,
compute-4-0 and compute-4-1.

Since we have only one GPU per node, I want the first job to run on
compute-4-0 and the second job to run on compute-4-1. However, with
my current configuration the second submitted job also tries to
execute on compute-4-0. How can I force a job onto compute-4-1 when
compute-4-0 is already running something, and have any subsequently
submitted jobs wait in 'qw'?

My config is as follows:

#qconf -sq gpu.q
qname gpu.q
hostlist @gpuhosts
load_thresholds np_load_avg=1.75
processors UNDEFINED
slots 1,[compute-4-0.local=32],[compute-4-1.local=32]
complex_values NONE
----------------------------------------
#qconf -sc
gpu gpu INT <= YES YES 0 0
----------------------------------------
#qconf -se compute-4-0.local
hostname compute-4-0.local
load_scaling NONE
complex_values gpu=1
load_values arch=linux-x64,num_proc=32,mem_total=129091.453125M, \
swap_total=999.996094M,virtual_total=130091.449219M, \
load_avg=0.970000,load_short=0.990000, \
load_medium=0.970000,load_long=0.910000, \
mem_free=104392.453125M,swap_free=999.996094M, \
virtual_free=105392.449219M,mem_used=24699.000000M, \
swap_used=0.000000M,virtual_used=24699.000000M, \
cpu=3.200000, \
m_topology=SCCCCCCCCCCCCCCCCSCCCCCCCCCCCCCCCC, \
m_topology_inuse=SCCCCCCCCCCCCCCCCSCCCCCCCCCCCCCCCC, \
m_socket=2,m_core=32,np_load_avg=0.030312, \
np_load_short=0.030937,np_load_medium=0.030312, \
np_load_long=0.028438
processors 32
-----------------------------------------------
#qconf -shgrp @gpuhosts
group_name @gpuhosts
hostlist compute-4-0.local compute-4-1.local
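
For reference, here is a minimal sketch of the submission pattern this
setup is meant to support. It assumes each job requests one unit of the
'gpu' consumable with '-l gpu=1'. The helper name gpu_qsub_cmd and the
job script names are placeholders for illustration, not part of the
configuration above. The script only prints the commands (a dry run)
and does not submit anything.

```shell
#!/bin/sh
# Hypothetical helper: build the qsub command line for one GPU job.
# Assumes the 'gpu' consumable shown by 'qconf -sc' and the gpu.q queue.
gpu_qsub_cmd() {
    # $1 = job script (placeholder name, not from the real setup)
    printf 'qsub -q gpu.q -l gpu=1 %s\n' "$1"
}

# Dry run: print the three submissions instead of executing them.
for job in job1.sh job2.sh job3.sh; do
    gpu_qsub_cmd "$job"
done
```

With gpu=1 set in each host's complex_values, my expectation is that
the first two such jobs would land on different hosts and the third
would stay in 'qw', but I have not confirmed that this is how the
scheduler behaves here.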


Thanks,
Rajil