Hi, 

    I run ganglia on a 180-node cluster and it works well. I monitor the CPU 
temperature by running a script on each node that regularly updates the 
TempCPU metric. Another script on the server executes "ganglia 
TempCPU" and checks whether the returned value exceeds a limit. If it 
does, the script shuts the node down through a home-made controller module. 
This works great. 

   The problem is that when I want to restart the node, its old TempCPU value 
is still held in ganglia's memory. So the server script shuts the node down 
again, because the stored TempCPU value still exceeds the limit.

        Is there a way for the server script to overwrite the TempCPU value of 
the shut-down node with a tag value (for example -1)?
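
        To make clear what I mean by a tag value, here is a minimal Python 
sketch of the check the server script could then do. The function name, the 
70 degree limit and the -1 tag are only assumptions for this example, and 
the sketch does not show how to write the tag into ganglia, which is the 
part I am missing.

```python
# Hypothetical tag value meaning "node was shut down on purpose, ignore it".
TAG_NODE_OFF = -1.0


def classify_reading(temp_cpu, limit=70.0, tag=TAG_NODE_OFF):
    """Classify one TempCPU reading taken from the server side.

    Returns "off" for the tag value, "too_hot" for a real reading above
    the limit, and "ok" otherwise. The limit is an assumed example value.
    """
    if temp_cpu == tag:
        # The node was shut down earlier and its metric was reset,
        # so the stale temperature must not trigger another shutdown.
        return "off"
    if temp_cpu > limit:
        return "too_hot"
    return "ok"


if __name__ == "__main__":
    for reading in (45.0, 82.5, TAG_NODE_OFF):
        print(reading, classify_reading(reading))
```

With this check, only a "too_hot" result would call the controller module, 
and a restarted node would go back to "ok" as soon as its own script 
publishes a fresh TempCPU value.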


        Thanks

        Karl

