If my server process fails to start up, god does not report the error
back to me when I load its config. The following is a very simple test
configuration that seems to reproduce the silent failure: "god load"
exits with status 0, and "god status" reports "up" for the watch in
question. Any help appreciated!
$ sudo /usr/bin/god -V
Version: 0.8.0
Polls: enabled
Events: netlink
$ python -c "raise RuntimeError('fail')"
Traceback (most recent call last):
  File "<string>", line 1, in <module>
RuntimeError: fail
$ echo "$?"
1
$ sudo /usr/bin/god load fail.god
Sending 'load' command
The following tasks were affected:
testfail-name
$ echo "$?"
0
$ sudo /usr/bin/god status
testfail:
testfail-name: up
$ cat fail.god
God.watch do |w|
  w.pid_file = "/var/run/fail.pid"
  w.group = "testfail"
  w.name = "testfail-name"
  w.interval = 30.seconds
  w.start = "python -c \"raise RuntimeError('fail')\" "
  w.stop = "kill -TERM `cat #{w.pid_file}`"
  w.restart = "#{w.stop} && #{w.start}"
  w.start_grace = 10.seconds
  w.restart_grace = 10.seconds
  w.behavior(:clean_pid_file)

  w.start_if do |start|
    start.condition(:process_running) do |c|
      c.interval = 5.seconds
      c.running = false
    end
  end

  w.restart_if do |restart|
    restart.condition(:cpu_usage) do |c|
      c.above = 50.percent
      c.times = 5
    end
  end

  # lifecycle
  w.lifecycle do |on|
    on.condition(:flapping) do |c|
      c.to_state = [:start, :restart]
      c.times = 5
      c.within = 5.minute
      c.transition = :unmonitored
      c.retry_in = 10.minutes
      c.retry_times = 5
      c.retry_within = 2.hours
    end
  end
end
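
To make clear what I mean by a failed start, here is a minimal sketch in
plain Ruby, outside of god, of the kind of check I expected: run the start
command and look at its exit status. It uses only Open3 from the standard
library. The helper name run_start_command is hypothetical and is not part
of god's API; this only illustrates the expectation and says nothing about
how god launches processes internally.

```ruby
require "open3"

# Hypothetical helper (NOT part of god): run a start command and return
# [succeeded, combined stdout/stderr output].
def run_start_command(command)
  output, status = Open3.capture2e(command)
  [status.success?, output]
rescue Errno::ENOENT => e
  # The command could not be launched at all; treat that as a failure.
  [false, e.message]
end

# Same start command as in fail.god above.
succeeded, output = run_start_command(%q{python -c "raise RuntimeError('fail')"})
puts "start succeeded: #{succeeded}"
puts output
```

On my machine this reports the failure, as in the shell transcript above,
whereas god still shows the watch as "up".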