What is the recommended way of identifying jobs which are consuming more CPU 
than they’ve requested?  I have an environment set up where people mostly 
submit SMP jobs through a parallel environment and we can use this information 
to schedule them appropriately.  We’ve had several cases, though, where jobs 
have used significantly more cores on the machine they’re assigned to than they 
requested, so the nodes become overloaded and go into an alarm state.

What options do I have for monitoring the number of cores simultaneously used 
by a job and comparing this to the number which were requested so I can find 
cases where the actual usage is way above the request and kill them?
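
To make the comparison concrete, the kind of check I have in mind could be 
sketched roughly as below. This is only an illustration under assumptions: the 
JobUsage record, the function names and the 1.5x tolerance are hypothetical, 
and I am assuming that per-job CPU seconds, elapsed wallclock seconds and 
granted slots can be collected from somewhere (for example the cpu, 
ru_wallclock and slots fields of the accounting records). It estimates the 
average number of cores a job kept busy as CPU time divided by wallclock time, 
which is only a proxy for simultaneous use, since a short burst of extra 
threads would be averaged away.

```python
from dataclasses import dataclass


@dataclass
class JobUsage:
    """Hypothetical per-job record; the field names are illustrative only."""
    job_id: str
    slots: int                # cores requested through the parallel environment
    cpu_seconds: float        # total CPU time consumed so far
    wallclock_seconds: float  # elapsed run time so far


def average_cores_used(job):
    """Estimate the average number of busy cores as CPU time / wallclock time."""
    if job.wallclock_seconds <= 0:
        # Job has not accumulated any run time yet, so nothing to estimate.
        return 0.0
    return job.cpu_seconds / job.wallclock_seconds


def find_overusers(jobs, tolerance=1.5):
    """Return IDs of jobs whose estimated core usage exceeds slots * tolerance.

    The tolerance is an assumed safety margin to avoid flagging jobs that
    only briefly or marginally exceed their request.
    """
    return [
        job.job_id
        for job in jobs
        if average_cores_used(job) > job.slots * tolerance
    ]


if __name__ == "__main__":
    # Made-up sample values, not real accounting data.
    sample = [
        JobUsage("101", slots=4, cpu_seconds=4000.0, wallclock_seconds=1000.0),
        JobUsage("102", slots=2, cpu_seconds=16000.0, wallclock_seconds=1000.0),
    ]
    print(find_overusers(sample))
```

In this made-up sample, job 102 requested 2 slots but averaged about 16 busy 
cores, so it would be flagged, while job 101 stays within its request. 
Whatever does the flagging would still need to decide separately whether to 
kill the job or just warn its owner.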

Thanks

Simon.
The Babraham Institute, Babraham Research Campus, Cambridge CB22 3AT Registered 
Charity No. 1053902.
_______________________________________________
users mailing list
[email protected]
https://gridengine.org/mailman/listinfo/users
