Second post today, separate topic...

I've got a few machines set up as active/passive clusters running
heartbeat/drbd.  I am currently monitoring them with ganglia, but I
think the numbers it shows give a misleading picture.

Since both machines are monitored, ganglia shows 8 processors in the
cluster (4 in each of the 2 boxes).  In reality, only one of these
machines is active at any one time.  I keep having to remind myself
that whenever one of these clusters shows more than 50% utilization,
the active node is really more than 100% utilized, since the CPUs,
RAM, etc. of the passive node shouldn't count toward the totals.
Always having to drill down to the individual machine to see what's
going on is kind of a pain.
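
To put numbers on that, here is a rough sketch in Python.  The function
name and the sample figures are made up for illustration, and it
assumes the passive node sits idle, so all of the reported load is
really on the active node:

```python
def effective_utilization(aggregate_pct, total_cpus, active_cpus):
    """Rescale a cluster-wide CPU percentage to the active node only.

    Assumption: the passive node is idle, so every busy CPU that the
    cluster-wide figure accounts for is on the active node.
    """
    return aggregate_pct * total_cpus / active_cpus


# Two 4-CPU boxes, one active: 8 CPUs reported, only 4 usable.
for shown in (25, 50, 60):
    real = effective_utilization(shown, total_cpus=8, active_cpus=4)
    print(f"{shown}% shown for the cluster = {real}% of the active node")
```

With my 2 x 4 CPU setup, anything shown above 50% for the cluster comes
out above 100% for the node that is actually doing the work.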

The only solution I've thought of is to keep gmond turned off on the
passive node and start it during a resource migration.  This would
be easy enough, but it would have two drawbacks:
1. My stats would say 50% of my cluster is 'down' although it's  
functioning correctly.
2. It is sometimes useful to monitor stuff on the passive node, and I  
don't really want to lose that ability.

Any better ways to do this?  Maybe extend the PHP frontend to be  
configurable for monitoring active/passive?  (Would anyone else have a  
use for that besides me?)

thanks,
alex

_______________________________________________
Ganglia-general mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/ganglia-general
