Hi, I've written a custom condition that tries to make a request to a MySQL server via ActiveRecord and restarts the server if the request throws an exception. The aim is to detect the overloading problems that were causing our DB to grind to a halt without actually stopping.
I also have a flapping condition in a lifecycle block to stop
monitoring the server if it keeps failing, as per the usual examples.
(See attached file mysql.god)
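The attached file isn't reproduced here, so the following is only a minimal standalone sketch of the idea behind such a check, not the actual mysql.god. The class name QueryProbe and the block passed to it are hypothetical; in the real setup the block would wrap an ActiveRecord query and the class would be written as a god condition.

```ruby
# Hypothetical stand-in for the custom check; not the attached mysql.god.
class QueryProbe
  # The block is any code that raises when the database is unusable,
  # for example a wrapper around an ActiveRecord lookup.
  def initialize(&query)
    @query = query
  end

  # Returns true when the probe query raised, i.e. a restart looks necessary.
  def test
    @query.call
    false
  rescue StandardError
    true
  end
end

healthy = QueryProbe.new { :ok }
broken  = QueryProbe.new { raise "Table 'probe' doesn't exist" }

healthy_needs_restart = healthy.test
broken_needs_restart  = broken.test
```

With the table removed, a probe like this keeps reporting failure no matter how often the server is restarted, which is the situation described further down.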
As I understand it, a condition like the one I used:
  w.lifecycle do |on|
    on.condition(:flapping) do |c|
      c.to_state = [:start, :restart]
      c.times = 5
      c.within = 10.minutes
      c.transition = :unmonitored
      c.retry_in = 10.minutes
      c.retry_times = 5
      c.retry_within = 2.hours
    end
  end
should cause the server to go unmonitored if it is started or
restarted 5 times in 10 minutes.
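To make that expectation concrete, here is a small standalone sketch of a "N transitions within a window" check. It is an illustration of the rule as described above, not god's own implementation; the FlapWindow class is hypothetical, and plain seconds are used in place of 10.minutes.

```ruby
# Standalone sketch of a "times within a window" check; not god's actual code.
class FlapWindow
  def initialize(times:, within:)
    @times = times      # number of transitions that counts as flapping
    @within = within    # window length in seconds
    @timeline = []      # timestamps of recent start/restart transitions
  end

  # Record a transition at time `now` (in seconds) and report whether the
  # threshold has been reached inside the window.
  def record(now)
    @timeline << now
    @timeline.reject! { |t| t < now - @within }
    @timeline.size >= @times
  end
end

# Five restarts one minute apart: the fifth one should trip the check.
window = FlapWindow.new(times: 5, within: 10 * 60)
results = [0, 60, 120, 180, 240].map { |t| window.record(t) }
# results => [false, false, false, false, true]
```

Under this reading, the fifth restart inside ten minutes is the point at which I'd expect the watch to move to :unmonitored.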
However, if I simulate a problem with the server by removing the table
it's checking, which isn't fixed by restarting the server, it
continues to monitor (and restart) indefinitely. Looking at the log
(see attached god.log) I can see the following line at the appropriate
time:
I [2010-03-21 11:44:20] INFO: mysqld auto-reenable monitoring in 600 seconds
but as you can see from the log, the monitoring and restarting
continues at the same rate.
(By the way, ignore the mailer warnings, no MTA on the box I used.)
Does anybody know what I'm doing wrong? Has anyone got a working example of
flapping detection via a custom condition?
Many thanks!
Andrew.
--
:: http://biotext.org.uk/ ::
