On Fri, 13 Apr 2018 at 1:48am, William Hay wrote:

This looks more like the scheduler thread and the worker threads of the qmaster disagreeing about the number of gpus left. This shouldn't persist, but bouncing the qmaster might get them to agree.


That is indeed exactly what seems to be going on. However, I've tried bouncing the qmaster, and the problem persists after the restart.
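
(For completeness, by "bouncing" I mean a full stop/start of the qmaster daemon via the stock wrapper script -- roughly the following, though the exact path depends on the install:

$ $SGE_ROOT/$SGE_CELL/common/sgemaster stop
$ $SGE_ROOT/$SGE_CELL/common/sgemaster start

so the scheduler and worker threads should both have come back up from the same on-disk state afterwards.)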

It looks like you are defining the gpu as a host consumable. Is there anything else that defines it: Queue consumable, global consumable, resource quota or load sensor?

AFAIK, no. The only place the "gpu" variable occurs is in the host definition's "complex_values", e.g.:

$ qconf -se msg-iogpu9
hostname              msg-iogpu9
load_scaling          NONE
complex_values        mem_free=128000M,gpu=2
load_values           arch=lx-amd64,num_proc=32,mem_total=128739.226562M, \
                      swap_total=4095.996094M,virtual_total=132835.222656M, \
                      m_topology=SCTTCTTCTTCTTCTTCTTCTTCTTSCTTCTTCTTCTTCTTCTTCTTCTT, \
                      m_socket=2,m_core=16,m_thread=32,load_avg=5.570000, \
                      load_short=5.560000,load_medium=5.570000, \
                      load_long=5.530000,mem_free=124723.488281M, \
                      swap_free=4095.996094M,virtual_free=128819.484375M, \
                      mem_used=4015.738281M,swap_used=0.000000M, \
                      virtual_used=4015.738281M,cpu=21.100000, \
                      m_topology_inuse=SCTTCTTCTTCTTCTTCTTCTTCTTSCTTCTTCTTCTTCTTCTTCTTCTT, \
                      gpu.ncuda=2,gpu.ndev=2,gpu.cuda.0.mem_free=752222208, \
                      gpu.cuda.0.procs=1,gpu.cuda.0.clock=1911, \
                      gpu.cuda.0.util=91,gpu.cuda.1.mem_free=752222208, \
                      gpu.cuda.1.procs=1,gpu.cuda.1.clock=1911, \
                      gpu.cuda.1.util=90,gpu.names=GeForce GTX 1080 Ti;GeForce \
                      GTX 1080 Ti;,np_load_avg=0.174063, \
                      np_load_short=0.173750,np_load_medium=0.174063, \
                      np_load_long=0.172813
processors            32
user_lists            NONE
xuser_lists           NONE
projects              NONE
xprojects             NONE
usage_scaling         NONE
report_variables      NONE

and the complex definition I sent previously. I am running the bundled load sensor, as can be seen above.
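
For readers who missed the earlier mail: the gpu complex is defined as a plain host consumable. Purely as an illustration (the actual entry is the one posted previously, and its defaults may differ), a qconf -sc line for such a consumable looks roughly like:

#name   shortcut   type   relop   requestable   consumable   default   urgency
gpu     gpu        INT    <=      YES           YES          0         0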

What do you get if you use
qstat -F gpu -q 'gpu-q@msg-iogpu[29]'

$ qstat -F gpu -q gpu.q@msg-iogpu9
queuename                      qtype resv/used/tot. load_avg arch          states
---------------------------------------------------------------------------------
gpu.q@msg-iogpu9               BP    0/3/16         5.51     lx-amd64
        hc:gpu=0
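
(The hc: prefix there marks the host-level consumable as the qmaster currently accounts for it, i.e. zero gpus left on that host. The same counter can be cross-checked from the host side with something like:

$ qhost -F gpu -h msg-iogpu9

which reports the hc:gpu consumable from qhost's point of view.)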

Any ideas?  I'm a bit flummoxed on this one...

Setting MONITOR=1 in the scheduler's params and having a look at the schedule file should tell you what the scheduler is doing.
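
(For anyone following along, that's MONITOR=1 on the params line of the scheduler configuration, e.g. via qconf -msconf; the scheduler then logs its per-cycle decisions to $SGE_ROOT/$SGE_CELL/common/schedule.)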

I've had that set for a while now. Job ID 373163 is currently stuck in the queue with appropriate slots available:

$ qstat -u "*" -q gpu.q
.
.
 373163 0.50000 refine_joi user1       qw    04/13/2018 09:25:08                                    3

$ qalter -w p 373163
verification: found possible assignment with 3 slots
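
(If I'm reading the man page right, -w p validates against the cluster in its current state, rather than the empty-cluster check that -w v does, so even that check thinks the job should be able to start.)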

The qmaster "messages" file has this to say about that job ID (repeatedly -- this is the first mention and coincides with the submit time above):
04/13/2018 09:25:32|worker|wynq1|E|debiting 2.000000 of gpu on host msg-iogpu9 for 1 slots would exceed remaining capacity of 0.000000
04/13/2018 09:25:32|worker|wynq1|E|resources no longer available for start of job 373163.1

And this is that job's first mention in the schedule file:
373163:1:STARTING:1523638696:82860:P:mpi_onehost:slots:3.000000
373163:1:STARTING:1523638696:82860:H:msg-iogpu9:mem_free:51539607552.000000
373163:1:STARTING:1523638696:82860:H:msg-iogpu9:gpu:2.000000
373163:1:STARTING:1523638696:82860:Q:gpu.q@msg-iogpu9:slots:3.000000

That block is repeated over and over in that file.
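
For anyone unfamiliar with the MONITOR output, my reading of the colon-separated fields is roughly:

job_id:task_id:state:start_time:duration:level:object:resource:amount

where the level is P for a parallel environment, H for a host and Q for a queue instance. In other words, the scheduler plans to debit gpu=2.0 on msg-iogpu9 -- exactly the debit the worker thread then refuses in the messages file above.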

Also, enabling schedd_job_info for the job in question and then running qstat -j on it after the next scheduling cycle might provide some clues.

Unfortunately that seems a bust as well. It just details all the queue instances the job can't run in (all legitimate). It doesn't mention the queue instances it *can* run in at all.
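
(Concretely, that was the schedd_job_info setting in qconf -msconf; once it's enabled, the output lands in the "scheduling info:" section of qstat -j 373163, and that section only lists the reasons queue instances were excluded.)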

So, the short version again: one part of the scheduler thinks there are available "gpu" complex slots on hosts where there aren't any, while another part realizes this and keeps the jobs requesting those slots from starting. But it also won't try different hosts.

--
Joshua Baker-LePain
QB3 Shared Cluster Sysadmin
UCSF
