Hi,

I've written a custom condition that queries a MySQL server via
ActiveRecord and restarts the server if the query raises an
exception. The aim is to detect overloading problems that were
causing our DB to grind to a halt without actually stopping.

I also have a flapping condition in a lifecycle block to stop
monitoring the server if it keeps failing, as per the usual examples.

(See attached file mysql.god)

As I understand it, a condition like the one I used:

    w.lifecycle do |on|
        on.condition(:flapping) do |c|
            c.to_state = [:start, :restart]
            c.times = 5
            c.within = 10.minutes
            c.transition = :unmonitored
            c.retry_in = 10.minutes
            c.retry_times = 5
            c.retry_within = 2.hours
        end
    end

should cause the server to go unmonitored if it is started or
restarted 5 times in 10 minutes.
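
To spell out the behaviour I'm expecting, here is a minimal sketch in
plain Ruby. It is only my reading of "5 times within 10 minutes", not
god's actual implementation; the FlapDetector class and its record
method are hypothetical names made up for illustration, and the retry_*
settings are left out.

```ruby
# Hypothetical sliding-window flap detector (an assumption about the
# intended semantics, not code taken from god).
class FlapDetector
  def initialize(times:, within:)
    @times = times      # number of transitions that counts as flapping
    @within = within    # window length in seconds
    @timeline = []      # timestamps of recent start/restart transitions
  end

  # Record a transition at time `now` (in seconds). Returns true once
  # the window contains `times` or more transitions.
  def record(now)
    @timeline << now
    # Drop transitions that have fallen out of the window.
    @timeline.reject! { |t| t <= now - @within }
    @timeline.size >= @times
  end
end

# Five restarts, one per minute, with times = 5 and within = 10 minutes.
detector = FlapDetector.new(times: 5, within: 600)
results = [0, 60, 120, 180, 240].map { |t| detector.record(t) }
puts results.inspect
```

On this reading, the fifth restart inside the window is the point at
which I'd expect the watch to move to :unmonitored.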

However, if I simulate a problem that a restart can't fix (by removing
the table the condition checks), god continues to monitor (and
restart) the server indefinitely. Looking at the log (see attached
god.log) I can see the following line at the appropriate time:

I [2010-03-21 11:44:20]  INFO: mysqld auto-reenable monitoring in 600 seconds

but as you can see from the log, the monitoring and restarting
continues at the same rate.

(By the way, ignore the mailer warnings, no MTA on the box I used.)

Does anybody know what I'm doing wrong? Has anyone got a working
example of flapping detection via a custom condition?

Many thanks!

Andrew.

-- 
:: http://biotext.org.uk/ ::

Attachments: god.log, mysql.god