On Mar 8, 2010, at 1:19 PM, Willy Tarreau wrote:

> Hi,
> 
> On Mon, Mar 08, 2010 at 02:58:14PM +0100, Stéphane Urbanovski wrote:
>>> Cool, thanks. Are you interested in getting it merged into
>>> mainline ? If so, we can create an entry into the "contrib"
>>> directory.
>> 
>> No objections, but I'm not sure it is the best place. A link in the README 
>> should be enough.
> 
> OK will do that then. In fact, projects that people want to maintain
> are more suited out of tree, and the ones that are written as one-shot
> and have no reason to change in the future are better merged (eg: the
> net-snmp plugin is a good example). That's why I asked.
> 
>>> Also, it seems to rely only on the HTTP socket. Do you think
>>> it can easily be adapted to also support the unix socket, which
>>> is global and does not require opening a TCP port ?
>> 
>> The plugin works with Nagios which is not installed on the same host. So a 
>> remote access in a way or other is mandatory.
> 
> hey, that obviously makes sense !
> 
I was looking at the Nagios script that Jean-Christophe wrote and I think there 
may be a need for something a bit different.

I think we might want the Nagios plugin to monitor a specific
frontend/backend service combination rather than the entire HAProxy setup.
That way we could focus on individual HAProxy services and their specific
health. If we try to write one plugin for all HAProxy services, things start
to get muddled.

Here is what I am thinking:
- Specify a listen service and its backend service(s), or a single listen
service where the frontend and backend have the same name.
- Specify thresholds for sessions, errors, queue, etc. Make this dynamic in
case any fields change.
- Since most of the stats are counters, the Nagios plugin would have to
maintain persistent counters, most likely in an external file.
- To keep the load of admin requests on haproxy down, perhaps the script could
cache the CSV data and check it for freshness on every run.
- Specify thresholds for how many backend services can be down, as a
percentage: for example, go critical if 50% of the backend services are down,
and only warn if 25% of them are down.
- Output performance data for sessions, errors, queue, etc. so that we can
trend and make pretty pictures.
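
To make the percentage-threshold and performance-data ideas a bit more
concrete, here is a rough sketch in Python. It is only an illustration, not
the existing plugin: the function names (parse_stats, check_proxy), the
default 25%/50% thresholds, and the trimmed SAMPLE data are all made up for
this example. It assumes the CSV export has at least the pxname, svname,
qcur, scur and status columns, and that any status not starting with "UP"
counts as down, which is a simplification.

```python
import csv
import io

# Standard Nagios plugin return codes.
OK, WARNING, CRITICAL, UNKNOWN = 0, 1, 2, 3
LABELS = {OK: "OK", WARNING: "WARNING", CRITICAL: "CRITICAL", UNKNOWN: "UNKNOWN"}

# Hypothetical, heavily trimmed sample of the CSV stats export.
# A real export has many more columns.
SAMPLE = """# pxname,svname,qcur,scur,status
web,FRONTEND,,5,OPEN
web,srv1,0,2,UP
web,srv2,0,1,UP
web,srv3,0,0,DOWN
web,srv4,1,2,UP
web,BACKEND,1,5,UP
"""


def parse_stats(csv_text):
    """Turn the CSV stats text into a list of dicts keyed by column name."""
    # The header line starts with "# "; strip it so the names come out clean.
    return list(csv.DictReader(io.StringIO(csv_text.lstrip("# "))))


def check_proxy(rows, proxy, warn_pct=25.0, crit_pct=50.0):
    """Check one proxy; return (exit_code, one-line plugin output)."""
    # Skip the FRONTEND/BACKEND aggregate rows, keep the individual servers.
    servers = [
        r for r in rows
        if r["pxname"] == proxy and r["svname"] not in ("FRONTEND", "BACKEND")
    ]
    if not servers:
        return UNKNOWN, f"HAPROXY UNKNOWN - no servers found for {proxy}"

    # Assumption: anything whose status does not start with "UP" is down.
    down = [r for r in servers if not r["status"].startswith("UP")]
    pct = 100.0 * len(down) / len(servers)

    if pct >= crit_pct:
        code = CRITICAL
    elif pct >= warn_pct:
        code = WARNING
    else:
        code = OK

    # Current sessions and queue length summed over the servers (gauges).
    scur = sum(int(r["scur"] or 0) for r in servers)
    qcur = sum(int(r["qcur"] or 0) for r in servers)

    # Perfdata goes after the "|" as label=value[UOM];warn;crit
    perfdata = (
        f"down_pct={pct:.0f}%;{warn_pct:.0f};{crit_pct:.0f} "
        f"sessions={scur} queue={qcur}"
    )
    message = (
        f"HAPROXY {LABELS[code]} - {proxy}: "
        f"{len(down)}/{len(servers)} servers down | {perfdata}"
    )
    return code, message
```

A real plugin would fetch the CSV from the stats page, print the message and
exit with the returned code, e.g. check_proxy(parse_stats(SAMPLE), "web").
The persistent counters and the CSV cache are left out here to keep the
sketch short.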

We could try to use Ton Voon's new multiple threshold syntax, specified here:
http://nagiosplugins.org/rfc/new_threshold_syntax

That's about all I have for now.

Thoughts?

-Josh Brown
