Hi,

On 07.03.2013 at 19:24, Pablo Escobar wrote:

> I have previously used load_sensors to disable a queue when one of my
> filesystems reach 98% of occupation but I don't know how to do the

What did you observe there? Usually it disables a queue instance but never a complete queue, so it is already doing what you would like to have. I assume that, because the threshold was passed for all queue instances at the same time, it looked as if it was acting on a global level, although each instance was disabled on its own.

> same to disable a single exec node, not a full queue. In the man I see
> the "suspend_threshold" is only available for queues but not for exec
> nodes.

load_threshold is the entry you need. If this tests a BOOL variable which is set by the load_sensor, you can disable the desired exechost. No more jobs will be scheduled to a particular machine in alarm state then.

-- Reuti

> I would like to disable single exec nodes in case the node can't access
> /home. Exactly what I am trying to achieve is to run a load_sensor in
> every exec node just doing 'ls /home/username' and if this
> load_sensor returns a FALSE (can't access the filesystem) then just
> disable the node so it doesn't accept more jobs until the problem is
> solved.
>
> is this possible using load_sensors or should try a different approach?
>
> many thanks in advance for any help
> Pablo.
> _______________________________________________
> users mailing list
> [email protected]
> https://gridengine.org/mailman/listinfo/users
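
For illustration, a load sensor along the lines discussed above could look roughly like the following. This is only a minimal sketch, not something taken from the thread: the complex name `home_ok`, the choice of reporting 1/0 for the BOOL value, and the use of `socket.gethostname()` for the host name are all assumptions, and the matching BOOL complex plus the threshold entry in the queue configuration still have to be set up separately. It follows the usual load sensor convention of waiting for a line on stdin, answering with a `begin` / `host:name:value` / `end` report, and exiting on `quit`.

```python
#!/usr/bin/env python3
import os
import socket
import sys


def home_accessible(path="/home"):
    """Return True if the directory can be listed, False otherwise."""
    try:
        os.listdir(path)
        return True
    except OSError:
        return False


def format_report(host, name, value):
    """Build one load report block in begin / host:name:value / end form."""
    return "begin\n{}:{}:{}\nend\n".format(host, name, value)


def run(stdin=sys.stdin, stdout=sys.stdout, path="/home", host=None):
    # "home_ok" is a hypothetical complex name; it must match whatever
    # BOOL complex is defined in the cluster configuration.
    host = host or socket.gethostname()
    while True:
        line = stdin.readline()
        # Stop on end of input or on an explicit "quit" request.
        if not line or line.strip() == "quit":
            break
        value = 1 if home_accessible(path) else 0
        stdout.write(format_report(host, "home_ok", value))
        stdout.flush()


if __name__ == "__main__":
    run()
```

Note that a hanging NFS mount may block the directory listing instead of failing, so a real sensor would probably need some kind of timeout around the check.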
