On Tue, Mar 15, 2011 at 6:12 AM, Esztermann, Ansgar <[email protected]> wrote:
> Thanks, that has helped a bit. I know now that most of the CPU time is spent
> in the dispatching stage. However, it is still unclear to me why dispatching
> should be such a time-consuming task.

Good, at least we are on the right track! :-)

The stage is called "job dispatching", but it is not really about sending job
start requests to the execution hosts. In fact, the scheduler thread does not
talk to the execution hosts directly.

The job dispatching stage (daemons/qmaster/sge_sched_thread.c:dispatch_jobs())
in the scheduler tries to find a queue instance (think of it as a host or
slot) that is suitable for running the job. With a few hundred jobs, Grid
Engine can easily handle the load. (This also applies to SGE forks like Open
Grid Scheduler or Son of GE, as we have not changed the scheduler code yet.)

But as your cluster is spending 5 minutes deciding where the jobs should go,
I'm curious what kind of resource requirements they have and, most
importantly, whether they have soft requests specified.

Rayson

>
> 03/15/2011 10:53:56|schedu|master1|P|PROF: job dispatching took 327.370 s (0
> fast, 0 fast_soft, 8 pe, 0 pe_soft, 4 res)
> 03/15/2011 10:53:56|schedu|master1|P|PROF: parallel matching 878
> 262664 2634 159203 137234 159203 131007
> 03/15/2011 10:53:56|schedu|master1|P|PROF: sequential matching 0
> 0 0 0 0 0 0
> 03/15/2011 10:53:56|schedu|master1|P|PROF: create pending job orders: 0.000 s
> 03/15/2011 10:53:56|schedu|master1|P|PROF: scheduled in 327.450 (u 337.270 +
> s 7.960 = 345.230): 0 sequential, 0 parallel, 452 orders, 846 H, 214 Q, 839
> QA, 10 J(qw), 431 J(r), 0 J(s), 0 J(h), 0 J(e), 0 J(x), 449 J(all), 57 C, 3
> ACL, 149 PE, 12 U, 1 D, 0 PRJ, 1 ST, 0 CKPT, 0 RU, 1 gMes, 0 jMes, 452/3
> pre-send, 0/0/0 pe-alg
>
> 03/15/2011 10:53:56|schedu|master1|P|PROF: send orders and cleanup took:
> 0.020 (u 0.020,s 0.000) s
> 03/15/2011 10:53:56|schedu|master1|P|PROF: schedd run took: 327.630 s (init:
> 0.000 s, copy: 0.130 s, run:327.470, free: 0.030 s, jobs: 449, categories:
> 43/0)
>
>
> A.
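
If you want to see whether every scheduling run is this slow or only some of
them, a small script can pull the "job dispatching took" lines out of the
messages file. The following is only a rough sketch, not part of Grid Engine.
It assumes that each PROF entry sits on a single line in the file (the line
breaks in the quote above come from mail wrapping), and the function names
parse_dispatch_line() and slow_dispatches() are made up for this example. The
counter names (fast, fast_soft, pe, pe_soft, res) are copied from the log as
they appear, without interpreting them.

```python
import re

# Matches the "job dispatching took" profiling line as it appears in the
# quoted log. Assumption: the entry is on one line in the messages file.
DISPATCH_RE = re.compile(
    r"^(?P<date>\d{2}/\d{2}/\d{4}) (?P<time>\d{2}:\d{2}:\d{2})\|"
    r"(?P<thread>[^|]*)\|(?P<host>[^|]*)\|(?P<level>[^|]*)\|"
    r"PROF: job dispatching took (?P<seconds>\d+(?:\.\d+)?) s "
    r"\((?P<counters>[^)]*)\)"
)


def parse_dispatch_line(line):
    """Return a dict for a 'job dispatching took' line, or None otherwise."""
    match = DISPATCH_RE.match(line.strip())
    if match is None:
        return None
    counters = {}
    # The parenthesised part looks like "0 fast, 0 fast_soft, 8 pe, ...".
    for item in match.group("counters").split(","):
        value, name = item.split()
        counters[name] = int(value)
    return {
        "timestamp": match.group("date") + " " + match.group("time"),
        "host": match.group("host"),
        "seconds": float(match.group("seconds")),
        "counters": counters,
    }


def slow_dispatches(lines, threshold_s=60.0):
    """Return parsed dispatch entries that took longer than threshold_s."""
    parsed = (parse_dispatch_line(line) for line in lines)
    return [p for p in parsed if p is not None and p["seconds"] > threshold_s]


if __name__ == "__main__":
    sample = [
        "03/15/2011 10:53:56|schedu|master1|P|PROF: job dispatching took "
        "327.370 s (0 fast, 0 fast_soft, 8 pe, 0 pe_soft, 4 res)",
        "03/15/2011 10:53:56|schedu|master1|P|PROF: create pending job "
        "orders: 0.000 s",
    ]
    for entry in slow_dispatches(sample):
        print(entry["timestamp"], entry["seconds"], entry["counters"])
```

To use it on a real file, pass the open file object to slow_dispatches(); the
location of the messages file depends on your installation.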
>
> --
> Ansgar Esztermann
> DV-Systemadministration
> Max-Planck-Institut für biophysikalische Chemie, Abteilung 105
>
>

_______________________________________________
users mailing list
[email protected]
https://gridengine.org/mailman/listinfo/users
