Hello, distinguished forum members,

We are using SoGE 8.1.8, and starting approximately two months ago our job 
scheduling time has risen to 30-60 seconds.

Our environment details:

            We have about 3-4K concurrent jobs.
            We have about 300 physical machines and 200 VMs as execution hosts.
            In total we have about 5K cores (slots).
            We have many different projects, parallel environments, and 
            quotas configured.

We perform a simple check to see how long it takes to schedule a job:

>time qrsh date
Mon Oct 10 17:35:31 IDT 2016
0.015u 0.010s 0:22.38 0.0%      0+0k 0+0io 0pf+0w
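
A single timing can be noisy, so the same check could be repeated a few times 
and summarized. The following is only a minimal sketch: measure_latency is a 
hypothetical helper name (not part of Grid Engine), it assumes a POSIX-style 
shell and a date command that supports +%s, and it measures whole seconds 
only, which should be enough for delays in the 10-60 second range.

```shell
#!/bin/sh
# Hypothetical helper: run a command several times and report the minimum,
# maximum, and average wall-clock time in whole seconds.
# Assumes `date +%s` is available (true for GNU and BSD date).
measure_latency() {
    if [ "$#" -lt 2 ]; then
        echo "usage: measure_latency RUNS COMMAND [ARGS...]" >&2
        return 1
    fi
    runs=$1
    shift
    if [ "$runs" -lt 1 ]; then
        echo "measure_latency: RUNS must be at least 1" >&2
        return 1
    fi

    total=0
    min=''
    max=0
    i=0
    while [ "$i" -lt "$runs" ]; do
        start=$(date +%s)
        # Output of the timed command is discarded; only elapsed time matters.
        "$@" >/dev/null 2>&1
        end=$(date +%s)
        elapsed=$((end - start))
        total=$((total + elapsed))
        if [ -z "$min" ] || [ "$elapsed" -lt "$min" ]; then min=$elapsed; fi
        if [ "$elapsed" -gt "$max" ]; then max=$elapsed; fi
        i=$((i + 1))
    done
    # Integer division: the average is truncated to whole seconds.
    echo "runs=$runs min=${min}s max=${max}s avg=$((total / runs))s"
}
```

For example, after sourcing the function, "measure_latency 5 qrsh date" would 
time five consecutive submissions of the same check shown above.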

Previously we tried clearing all running jobs, and the scheduling time dropped 
to 1 second. However, as more jobs entered the queue, the scheduling time rose 
again to 10-20 seconds, and it is now 30-60 seconds.

Any tips or advice on where to look for the root cause, or on how we can 
improve the situation, would be greatly appreciated.
Thank you.


Yuri Burmachenko | Sr. Engineer | IT | Mellanox Technologies Ltd.
Work: +972 74 7236386 | Cell +972 54 7542188 |Fax: +972 4 959 3245
