Without actually answering your question, I'm curious about this part:

"The task, in short, is the implementation of real-time-monitoring, i.e.
 a timeframe of about 30 minutes to 3 hours overall. gmond should
 collect the data every 1 or 2 seconds, so state changes can be immediately
 recognized and be dealt with."

It sounds like by "state changes" you mean something that should be classified
as 'not good' and potentially actionable, like a server running out of
connections, or memory being swapped, or some other sort of badness which
should be alerted on.

My question is: is the effort to make ganglia report metrics "real-time" (i.e.
every 1 or 2 seconds) worth it? While of course every environment is
different, and each environment's sensitivity to change or failure is also
different, in my experience ganglia in particular keeps past data at
sufficient resolution (even one sample per minute) that you can predict when
bad things are going to happen before they actually do. This frees you from
having to act on a failure or change with no time to spare.
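
To make the prediction idea concrete, here is a minimal sketch, not something
ganglia provides: fit a straight line to recent one-minute samples and
estimate how long until the metric would reach a limit. The function name,
the sample values, and the 90% limit are all hypothetical, and a linear trend
is only an assumption; real metrics may need something less naive.

```python
def minutes_until_limit(samples, limit):
    """Estimate minutes until a metric reaches `limit`.

    `samples` holds readings taken one minute apart, oldest first.
    Returns None when there is too little data or the fitted trend
    is flat or falling.
    """
    n = len(samples)
    if n < 2:
        return None
    xs = range(n)
    mean_x = sum(xs) / n
    mean_y = sum(samples) / n
    # Ordinary least-squares slope and intercept of value vs. minute index.
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, samples))
    slope = sxy / sxx
    if slope <= 0:
        return None
    intercept = mean_y - slope * mean_x
    # Minute index at which the fitted line reaches the limit,
    # expressed relative to the most recent sample.
    crossing = (limit - intercept) / slope
    return max(0.0, crossing - (n - 1))


# Hypothetical disk usage in percent, one reading per minute.
usage = [70.0, 71.0, 72.0, 73.0, 74.0]
print(minutes_until_limit(usage, 90.0))
```

With one-minute data like this, the estimate leaves time to react well before
the limit is hit, which is the point of looking at the trend rather than the
instantaneous value.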

As I'm sure you know, Nagios and other alerting monitors handle failure
scenarios this way: "warning" and "critical" thresholds are set so that you
have enough time to react without it being an absolute emergency.
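
As an illustration of the two-threshold idea only (this is not Nagios
configuration, and the function name and numbers are made up), a check for a
metric where higher is worse might look like:

```python
def classify(value, warning, critical):
    """Return 'OK', 'WARNING' or 'CRITICAL' for a metric where higher is worse."""
    if value >= critical:
        return "CRITICAL"
    if value >= warning:
        return "WARNING"
    return "OK"


# Hypothetical thresholds for the percentage of connections in use.
for used in (40, 85, 97):
    print(used, classify(used, warning=80, critical=95))
```

The gap between the warning and critical values is what buys the reaction
time; how wide to make it depends on how fast the metric moves in your
environment.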

-john allspaw

----- Original Message ----
From: Thomas Spanner <[EMAIL PROTECTED]>
To: [email protected]
Sent: Saturday, October 27, 2007 7:59:57 AM
Subject: [Ganglia-developers] real-time-monitoring


Hi all!

Last month a friend of mine and I started a project at our university
 (TU Munich) involving your nice little software. This should be our last
 certificate, so we can finally begin our final exams.

The task, in short, is the implementation of real-time-monitoring, i.e.
 a timeframe of about 30 minutes to 3 hours overall. gmond should
 collect the data every 1 or 2 seconds, so state changes can be immediately
 recognized and be dealt with.

So far, we have been studying the code and rrdtool, and hopefully
 understood how these things work.

Before we play around with the variables, I'd like to ask your opinion:

1. can this be accomplished? My concern is that the overhead gets too
 much, and ganglia slows the whole cluster down (there must be a reason
 why the step variable of the gmetad is 15 seconds as default.)

2. is it sufficient to lower the "collect_every" and
 "time_threshold" variables in gmond.conf to speed up the data collection,
 or isn't it that simple?

I want gmond to write its output to the console to see what the
 collector does. So I change a value, stop and restart with "gmond -d9
 start" but then get:
[...]
tcp_listen() on xml_port failed: Address already in use

How do I get the output without restarting the computer? (When I start
 gmond for the first time it works.) I think I tried to restart gmetad,
 too, but the problem remains.

Your help is greatly appreciated,
thanks,
Tom and Percy


-------------------------------------------------------------------------
This SF.net email is sponsored by: Splunk Inc.
Still grepping through log files to find problems?  Stop.
Now Search log events and configuration files using AJAX and a browser.
Download your FREE copy of Splunk now >> http://get.splunk.com/
_______________________________________________
Ganglia-developers mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/ganglia-developers





