Hi List,

Can anyone give me a hint as to what scheduler performance to expect, and what 
the typical bottleneck would be? We are running 6.2u5 here, and one 
scheduler run takes about 5 minutes (with 600 jobs and 800 nodes).

From what I've seen with params monitor=1 and strace, the scheduler[1] has a 
list of running jobs almost instantaneously, then spends about four minutes at 
100% CPU writing nothing to common/schedule (and actually not making any system 
calls other than futex() and write() to stdout). During that time, it spews a 
lot of diagnostic messages about resource utilization to stdout (see below[2]). 
Finally, reservations are made (they take about four seconds each, which is 
not exactly fast, but quite manageable), and jobs are started (very quickly).

Is such a long delay between the :RUNNING: and :RESERVING: lines normal? I had 
thought our disk might be at fault here -- /var is often maxed out in terms of 
bandwidth. But then again, the thread at 100% CPU doesn't make any read() calls.
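
For anyone who wants to put a number on that delay, here is a minimal sketch 
of how it could be measured. It assumes only what is described above, namely 
that the lines of interest contain the markers :RUNNING: and :RESERVING:. The 
function names stamp() and running_to_reserving_gap() are made up for this 
illustration and are not part of Grid Engine.

```python
import time
from typing import Iterable, Iterator, Optional, Tuple


def stamp(lines: Iterable[str]) -> Iterator[Tuple[float, str]]:
    """Pair each line with the wall-clock time at which it was read."""
    for line in lines:
        yield time.time(), line.rstrip("\n")


def running_to_reserving_gap(
    stamped: Iterable[Tuple[float, str]],
) -> Optional[float]:
    """Return the seconds between the last :RUNNING: line and the first
    :RESERVING: line that follows it, or None if no such pair is seen.

    Assumption: the markers appear literally in the lines, as in the
    observation described in this mail.
    """
    last_running = None
    for when, line in stamped:
        if ":RUNNING:" in line:
            last_running = when
        elif ":RESERVING:" in line and last_running is not None:
            return when - last_running
    return None
```

To use it, one could follow common/schedule as it grows (for example by piping 
the output of tail -f into a small script) and pass the incoming lines through 
stamp() before handing them to running_to_reserving_gap(). The timestamps are 
those at which the lines were read, so the result is only as accurate as the 
latency of the pipe.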


Thanks,

A.



[1] At least, I assume that's the scheduler -- it is simply the thread with the 
highest CPU percentage.
[2] It looks more or less like this:
-------------------------------
RUE_name             (String)  * = ////node11-33/
RUE_utilized_now     (Double)    = 0.000000
RUE_utilized         (List)      = empty
RUE_utilized_now_non (Double)    = 0.000000
RUE_utilized_nonexcl (List)      = empty


-- 
Ansgar Esztermann
DV-Systemadministration
Max-Planck-Institut für biophysikalische Chemie, Abteilung 105

