If my server process fails to start, god does not report the error
back to me when I load its config.  The following is a very simple
test configuration that reproduces the silent failure: "god load"
exits with status 0 and "god status" reports "up" for the watch in
question, even though the start command itself exits with status 1.
Any help appreciated!

$ sudo /usr/bin/god -V
Version: 0.8.0
Polls: enabled
Events: netlink

$ python -c "raise RuntimeError('fail')"
Traceback (most recent call last):
  File "<string>", line 1, in <module>
RuntimeError: fail

$ echo "$?"
1

$ sudo /usr/bin/god load fail.god
Sending 'load' command

The following tasks were affected:
  testfail-name

$ echo "$?"
0

$ sudo /usr/bin/god status
testfail:
  testfail-name: up

$ cat fail.god
God.watch do |w|
    w.pid_file = "/var/run/fail.pid"

    w.group = "testfail"
    w.name = "testfail-name"
    w.interval = 30.seconds
    w.start = "python -c \"raise RuntimeError('fail')\" "
    w.stop = "kill -TERM `cat #{w.pid_file}`"
    w.restart = "#{w.stop} && #{w.start}"
    w.start_grace = 10.seconds
    w.restart_grace = 10.seconds

    w.behavior(:clean_pid_file)

    w.start_if do |start|
      start.condition(:process_running) do |c|
        c.interval = 5.seconds
        c.running = false
      end
    end

    w.restart_if do |restart|
      restart.condition(:cpu_usage) do |c|
        c.above = 50.percent
        c.times = 5
      end
    end

    # lifecycle
    w.lifecycle do |on|
      on.condition(:flapping) do |c|
        c.to_state = [:start, :restart]
        c.times = 5
        c.within = 5.minute
        c.transition = :unmonitored
        c.retry_in = 10.minutes
        c.retry_times = 5
        c.retry_within = 2.hours
      end
    end
end
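
For comparison, the exit-status check I did by hand above can also be scripted. This is only a minimal sketch using Ruby's standard library; the helper name run_start_command is my own and is not part of god, and it says nothing about how god itself runs the start command.

```ruby
require "open3"

# Hypothetical helper (not part of god): run a start command and report
# its exit status together with whatever it wrote to stderr.
def run_start_command(cmd)
  _stdout, stderr, status = Open3.capture3(cmd)
  { success: status.success?, exit_code: status.exitstatus, stderr: stderr }
end
```

Calling run_start_command with the w.start string from fail.god should report a failed status along with the traceback on stderr, which is the information I would like god to surface.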
