Hi,

On May 4, 2011, at 2:01, William Deegan wrote:

> What's the best way to take a node offline for maintenance?

I'd say that depends on the circumstances. If you want to perform maintenance 
on a single node as soon as possible (e.g. the node has ECC or SMART errors), 
use qmod -d to disable its queue instances, then take it down once all running 
jobs have finished.
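
For the single-node case, the sequence could look roughly like the sketch 
below. The host name node01 is a placeholder, and DRY_RUN and run() are 
illustrative helpers (not part of Grid Engine) so that the commands can be 
previewed before anything is changed.

```shell
#!/bin/sh
# Sketch: disable all queue instances on one host, then look for leftover jobs.
# DRY_RUN and run() are illustrative helpers, not part of Grid Engine.
DRY_RUN="${DRY_RUN:-1}"

run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "$*"            # preview only: print the command
    else
        "$@"                 # execute the command
    fi
}

drain_node() {
    host="$1"                # placeholder host name, e.g. node01
    run qmod -d "*@$host"    # disable every queue instance on the host
    run qhost -j -h "$host"  # show jobs that are still running there
}

undrain_node() {
    run qmod -e "*@$1"       # re-enable the queue instances afterwards
}
```

Calling "drain_node node01" with the default DRY_RUN=1 only prints the two 
commands; set DRY_RUN=0 to execute them. Disabling a queue instance does not 
touch running jobs, it only stops new ones from being started there.
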
On the other hand, you might want to create a maintenance window (possibly for 
several nodes) at some point in the future. In that case, disabling the nodes 
now might leave them unoccupied for several days when they are perfectly suited 
to running a few jobs instead. So, use an advance reservation to ensure that 
they are free when your maintenance window starts; this way, they can still 
accept (short) jobs until then.
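
As a rough illustration of the reservation approach, the helper below just 
prints a qrsub command line. The helper ar_command, the host name, the start 
time, the duration and the reservation name "maint" are all made up for the 
example. Note that, as far as I know, an advance reservation covers a single 
slot unless more is requested, so blocking a whole node usually also needs a 
parallel environment or an exclusive resource, depending on your setup.

```shell
#!/bin/sh
# Sketch: build a qrsub command for a maintenance window on one host.
# ar_command is an illustrative helper; all values passed in are placeholders.
ar_command() {
    host="$1"     # e.g. node01
    start="$2"    # start time in qrsub's [[CC]YY]MMDDhhmm format
    length="$3"   # duration as hh:mm:ss
    printf 'qrsub -N maint -a %s -d %s -l h=%s\n' "$start" "$length" "$host"
}

# Print the command for a four-hour window on node01 (dates are examples only).
ar_command node01 201105140800 04:00:00
```

Existing reservations can be listed with qrstat and removed with qrdel.
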
Another possibility (especially when you have to do maintenance on all nodes, 
e.g. a planned power cut) would be to use a calendar.
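
A calendar for such a window might be set up along these lines. This is only 
a sketch: the calendar name "maint", the date and hours, the file location 
and the queue name all.q are assumptions, and the qconf commands are printed 
rather than executed so you can check them first.

```shell
#!/bin/sh
# Sketch: write a calendar definition and print the qconf commands to load it.
# Calendar name, date, hours, file path and queue name are placeholders.
cal_file="${TMPDIR:-/tmp}/maint.cal"

# Queues using this calendar are switched off on 14 May 2011 from 8:00 to 18:00.
cat > "$cal_file" <<'EOF'
calendar_name    maint
year             14.5.2011=8-18=off
week             NONE
EOF

echo "qconf -Acal $cal_file"                    # add the calendar from the file
echo "qconf -mattr queue calendar maint all.q"  # attach it to a cluster queue
```

Check calendar_conf(5) for the exact date syntax and for the difference 
between the "off" and "suspended" states before relying on this.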


Regards,

A.
-- 
Ansgar Esztermann
DV-Systemadministration
Max-Planck-Institut für biophysikalische Chemie, Abteilung 105

