Just as a followup, I took a look at the output of clustat -x, and
one of the values is "last transition".  I wrote a check that looks at
a given service and calculates the difference between the current
time and the last transition.  If that difference is lower than a given
threshold, it alarms.  It is kind of a hack, but it will do until I can
get scripts and log-parsing checks in place for a more proactive
approach.
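
For anyone who wants to try the same thing, here is a minimal sketch of
that kind of check, not the exact script.  It assumes clustat -x reports
each service as a <group> element with "name" and "last_transition"
(epoch seconds) attributes; check that against the XML on your own
cluster.  The function names and the command-line arguments are made up
for illustration.

```python
import subprocess
import sys
import time
import xml.etree.ElementTree as ET

OK, CRITICAL, UNKNOWN = 0, 2, 3  # standard Nagios plugin exit codes


def seconds_since_transition(clustat_xml, service, now=None):
    """Return seconds since `service` last changed state, or None if not found.

    Assumes <group name="..." last_transition="EPOCH"/> elements in the XML.
    """
    now = time.time() if now is None else now
    root = ET.fromstring(clustat_xml)
    for group in root.iter("group"):
        if group.get("name") == service:
            stamp = group.get("last_transition")
            return None if stamp is None else now - int(stamp)
    return None


def check(clustat_xml, service, threshold, now=None):
    """Return (exit_code, message); alarm if the transition is too recent."""
    age = seconds_since_transition(clustat_xml, service, now)
    if age is None:
        return UNKNOWN, "UNKNOWN: %s not found in clustat output" % service
    if age < threshold:
        return CRITICAL, "CRITICAL: %s changed state %d seconds ago" % (service, age)
    return OK, "OK: %s last changed state %d seconds ago" % (service, age)


# Hypothetical usage: check_transition.py service:web 600
if __name__ == "__main__" and len(sys.argv) == 3:
    xml_text = subprocess.check_output(["clustat", "-x"])
    code, message = check(xml_text, sys.argv[1], int(sys.argv[2]))
    print(message)
    sys.exit(code)
```

The threshold should be longer than the Nagios check interval, or a
transition can slip by between two runs without ever raising the alarm.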

B

On Fri, Feb 20, 2009 at 7:17 PM, Burton Simonds
<[email protected]> wrote:
> I was actually looking in Google for something like that earlier
> today.  That would work, but it still has the issue of tracking the
> previous state.  From what I have read, the clustered services check
> will see whether the service is running somewhere, but will not
> notify if the service has changed state.  I am running NRPE on the
> clustered hosts and using that to check the processes on each of the
> hosts.
>
> I am looking at setting up the cluster-snmp stuff, and I will see if
> that will provide me with the information I need.  Otherwise, I might
> just go with log scraping.
>
> B
> On Fri, Feb 20, 2009 at 5:55 PM, eric rosel <[email protected]> wrote:
>> Hi List,
>>
>> I've been toying with the idea of writing an init script resource which will
>> send an alert to <type your favorite network/host monitoring system here>
>> every time it gets called with a "start" or "stop" argument.
>>
>> Another way is to make it send "alive" messages every time it's called with
>> "status", and then configure your monitoring app to sound the sirens when it
>> stops getting those messages, or if the source of those messages changes.
>>
>> One then simply has to include this script resource with a clustered service.
>>
>> -eric
>>
>>
>> --- On Sat, 2/21/09, Martin Fuerstenau <[email protected]> wrote:
>>
>>> From: Martin Fuerstenau <[email protected]>
>>> Subject: Re: [Linux-cluster] Monitoring Failovers
>>> To: "linux clustering" <[email protected]>
>>> Date: Saturday, February 21, 2009, 12:41 AM
>>> It is a little bit hard to do. It is on my todo list too. The problem
>>> is to determine the old state. So for example if you switch an IP
>>> address and you have a service bound to that address you have nearly
>>> no chance to monitor it from the Nagios side.
>>>
>>> I have tested using the MAC address and arp but this is awesome if you
>>> have bonding. Because if the MAC switches it may be the bonding of the
>>> cluster or the cluster switched. But hardcoded MAC addresses in the
>>> monitor script will not be a good idea.
>>>
>>> Too much trouble in maintenance.
>>>
>>> If anyone has a good idea I will write the plugin and post it on
>>> Nagiosexchange.
>>>
>>> Martin Fuerstenau
>>>
>>> On Fri, 2009-02-20 at 11:04 -0500, Burton Simonds wrote:
>>> > I am in the process of setting up Nagios for system monitoring, and
>>> > I would like to have a way to know if a failover has occurred.  If
>>> > everything works as it should, there should be minimal impact on
>>> > the services.  Right now it looks like my best bet is to scrape the
>>> > logs, look for the failover messages there, and trigger an alarm.
>>> >
>>> > I was wondering if anyone else has done anything.  I found in an
>>> > archive a check_rhcs script that I am going to employ (which looks
>>> > pretty cool), but that just looks at the status of the services.  I
>>> > want to either compare the current status to the previous status or
>>> > have something monitoring the cluster that pushes the alert to
>>> > Nagios.
>>> >
>>> > Thanks,
>>> > B
>>
>> --
>> Linux-cluster mailing list
>> [email protected]
>> https://www.redhat.com/mailman/listinfo/linux-cluster
>>
>

