RE: [Linux-cluster] Monitoring Failovers

Jeremy Eder Fri, 20 Feb 2009 11:39:39 -0800

Doesn't the cluster-snmp package provide what you need?




Best Regards,

Jeremy Eder, RHCE, VCP



-----Original Message-----
From: [email protected] 
[mailto:[email protected]] On Behalf Of Burton Simonds
Sent: Friday, February 20, 2009 14:25 PM
To: linux clustering
Subject: Re: [Linux-cluster] Monitoring Failovers

I have two thoughts.  I was looking at the output of the clustat -l
command and I noticed it has a 'last transition' timestamp.  I was
thinking about looking at that and using it to create an alarm that
says "The last transition happened less than X minutes ago" or
something like that.  It is a little sketchy, but could be a
possibility, and most of the code needed has already been written by
the author of the check_rhcs script that I found, which I believe was
authored by Chris St. Pierre.  (If this is incorrect, please let me
know, as the author should be credited.)
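
A rough Python sketch of that first idea, for illustration only.  It is
not taken from check_rhcs.  The "Last Transition : <date>" line format
and the function names are assumptions, so check them against what
clustat -l actually prints on your nodes.

```python
import re
from datetime import datetime, timedelta

# Assumed line format (verify against your clustat version):
#   Last Transition : Fri Feb 20 11:04:00 2009
TRANSITION_RE = re.compile(r"Last Transition\s*:\s*(.+)")

OK, WARNING = 0, 1  # standard Nagios plugin exit codes


def recent_transitions(clustat_output, now, window_minutes):
    """Return the transition timestamps that fall inside the window."""
    cutoff = now - timedelta(minutes=window_minutes)
    recent = []
    for match in TRANSITION_RE.finditer(clustat_output):
        stamp = datetime.strptime(match.group(1).strip(),
                                  "%a %b %d %H:%M:%S %Y")
        if stamp >= cutoff:
            recent.append(stamp)
    return recent


def check(clustat_output, now, window_minutes=30):
    """Return a (Nagios exit code, status message) pair."""
    recent = recent_transitions(clustat_output, now, window_minutes)
    if recent:
        return WARNING, ("WARNING: %d service(s) transitioned in the last "
                         "%d minutes" % (len(recent), window_minutes))
    return OK, "OK: no transitions in the last %d minutes" % window_minutes
```

A wrapper would feed check() the captured output of clustat -l and
datetime.now(), print the message, and exit with the returned code.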

My second thought was a little more complicated and would require more
work, but basically using syslog-ng's parsing capabilities, I would
send all cluster service messages to a script that would parse and
look for failover messages.  The messages could be sent to NSCA.
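
A minimal sketch of the script side of that second idea, assuming
syslog-ng hands the cluster messages to the script one line at a time.
The message pattern, host name, and service name below are hypothetical
placeholders and must be matched to what your rgmanager really logs.
The only fixed part is the tab-separated host, service, return code,
and text layout that send_nsca reads on stdin.

```python
import re

# Hypothetical pattern: adjust to the failover messages in your logs.
FAILOVER_RE = re.compile(r"Taking over service (\S+)")

WARNING = 1  # standard Nagios return code


def nsca_line(host, service, code, text):
    """Format one passive check result as send_nsca expects on stdin."""
    return "%s\t%s\t%d\t%s\n" % (host, service, code, text)


def handle_stream(lines, submit, host="cluster", service="failover"):
    """Scan log lines and pass one passive result per match to submit()."""
    for line in lines:
        match = FAILOVER_RE.search(line)
        if match:
            text = "failover of %s" % match.group(1)
            submit(nsca_line(host, service, WARNING, text))
```

In use, lines would be sys.stdin and submit would be a small function
that pipes each formatted result into send_nsca.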

I am going to try the first one, and see if it can meet my needs, then
maybe work on the second.

Thanks,
B

On Fri, Feb 20, 2009 at 11:41 AM, Martin Fuerstenau
<[email protected]> wrote:
> It is a little bit hard to do. It is on my todo list too. The problem is
> to determine the old state. So for example if you switch an ip address
> and you have a service bound to that address you have nearly no chance
> to monitor it from the Nagios side.
>
> I have tested using the MAC address and arp, but this is awkward if you
> have bonding, because if the MAC changes it may be either the bonding
> failing over or the cluster switching. Hardcoding MAC addresses in the
> monitor script would not be a good idea.
>
> Too much trouble in maintenance.
>
> If anyone has a good idea I will write the plugin and post it on
> Nagiosexchange.
>
> Martin Fuerstenau
>
> On Fri, 2009-02-20 at 11:04 -0500, Burton Simonds wrote:
>> I am in the process of setting up Nagios for system monitoring, and I
>> would like to have a way to know if a failover has occurred.  If
>> everything works as it should, there should be minimal impact on the
>> services.  Right now it looks like my best bet is basically to scrape the
>> logs and look for the failover messages there and trigger an alarm.
>>
>> I was wondering if anyone else has done anything.  I found in an
>> archive a check_rhcs script that I am going to employ (which looks
>> pretty cool), but that just looks at the status of the services.  I
>> want to either compare the current status to the previous status or
>> have something monitoring the cluster that pushes the alert to Nagios.
>>
>> Thanks,
>> B
>>
>> --
>> Linux-cluster mailing list
>> [email protected]
>> https://www.redhat.com/mailman/listinfo/linux-cluster
>>
>

