Hi Mike,

I wasted a whole day tracking down what I believe to be the same bug. This 
is the only discussion thread about the issue that I can find, so I want to 
put it out here.

If you start god without specifying a config file, the events module 
doesn't initialize properly, and god isn't notified when the processes quit.

Obviously the fix is as easy as adding `-c <config file>` to the command 
line you used to start god. I'll leave it to people who understand god's 
internals to explain why this would be the case.
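
For anyone landing here later, a rough sketch of what I mean. The helper name and the config path are placeholders of mine, not anything god provides. The flags other than `-c` are the ones Mike used in his foreground run.

```shell
# Hypothetical helper (not part of god): start god in the foreground
# with the config file loaded at startup via -c.
start_god_with_config() {
  config_file="$1"

  # Refuse to start if the config file is missing.
  if [ ! -f "$config_file" ]; then
    echo "config file not found: $config_file" >&2
    return 1
  fi

  # -c loads the config at startup; -D keeps god in the foreground.
  sudo god -c "$config_file" -D --no-syslog --log-level debug
}

# Example usage (the path is an assumption):
# start_god_with_config ./simple_test.god
```

This replaces starting a bare god and running `god load` afterwards, which is the sequence that left the exit events undelivered for me.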

Prathan

P.S.

$ sudo god -V
Version: 0.13.3
Polls: enabled
Events: netlink



On Saturday, May 4, 2013 7:38:07 AM UTC+7, Mike Fellows wrote:
>
> Apologies if the answer to this is obvious but I am having trouble getting 
> event based monitoring working on either OSX 10.7.5 or a RHEL Linux variant 
> (the latest AWS Linux AMI).
>
> In both cases *god check* reports success for event based 
> monitoring, and I have confirmed for the Linux OS that the 
> CONFIG_PROC_EVENTS option was set to 'y' when the kernel was compiled.  
> Here is the output from god check on OSX.  The Linux god check returns 
> the same output, except that it uses the 'netlink' event system.
>
> $ rvmsudo god check
> using event system: kqueue
> starting event handler
> forking off new process
> forked process with pid = 1504
> killing process
> [ok] process exit event received
>
>
> I've reduced the problem to a very simple test case.  It is a 
> simplification of the example from http://godrb.com.  I also tried the 
> full sample file from the site, with no luck.  Here is the simplified 
> god file.
>
> root = File.dirname(File.expand_path(__FILE__))
>
> God.watch do |w|
>   w.name     = "simple_test"
>   w.interval = 30.seconds
>   w.start    = "#{root}/simple_test.rb"
>   w.log = "#{root}/simple_test.log"
>   w.uid = 'mike'
>   w.dir = root
>
>   # determine the state on startup
>   w.transition(:init, { true => :up, false => :start }) do |on|
>     on.condition(:process_running) do |c|
>       c.running = true
>     end
>   end
>
>   # determine when process has finished starting
>   w.transition([:start, :restart], :up) do |on|
>     on.condition(:process_running) do |c|
>       c.running = true
>     end
>
>     # failsafe
>     on.condition(:tries) do |c|
>       c.times = 5
>       c.transition = :start
>     end
>   end
>
>   # start if process is not running
>   w.transition(:up, :start) do |on|
>     on.condition(:process_exits)
>   end
> end
>
>
> And here is the output from god when it is started in foreground mode and 
> the god file is loaded.
>
> $ rvmsudo god -D --no-syslog --log-level debug
> I [2013-05-03 17:28:13]  INFO: Syslog disabled.
> I [2013-05-03 17:28:13]  INFO: Using pid file directory: /var/run/god
> I [2013-05-03 17:28:13]  INFO: Started on drbunix:///tmp/god.17165.sock
> I [2013-05-03 17:28:40]  INFO: simple_test Loaded config
> I [2013-05-03 17:28:40]  INFO: simple_test move 'unmonitored' to 'init'
> D [2013-05-03 17:28:40] DEBUG: driver schedule 
> #<God::Conditions::ProcessRunning:0x007f806081d570> in 0 seconds
> I [2013-05-03 17:28:40]  INFO: simple_test moved 'unmonitored' to 'init'
> I [2013-05-03 17:28:40]  INFO: simple_test [trigger] process is not 
> running (ProcessRunning)
> D [2013-05-03 17:28:40] DEBUG: simple_test ProcessRunning [false] 
> {true=>:up, false=>:start}
> I [2013-05-03 17:28:40]  INFO: simple_test move 'init' to 'start'
> I [2013-05-03 17:28:40]  INFO: simple_test start: 
> /Users/mike/code/STAT-HQ/test/god/simple_test/simple_test.rb
> D [2013-05-03 17:28:40] DEBUG: driver schedule 
> #<God::Conditions::ProcessRunning:0x007f806081fc58> in 0 seconds
> D [2013-05-03 17:28:40] DEBUG: driver schedule 
> #<God::Conditions::Tries:0x007f806081f8e8> in 0 seconds
> I [2013-05-03 17:28:40]  INFO: simple_test moved 'init' to 'start'
> D [2013-05-03 17:28:40] DEBUG: simple_test ProcessRunning [true] 
> {true=>:up}
> I [2013-05-03 17:28:40]  INFO: simple_test move 'start' to 'up'
> I [2013-05-03 17:28:40]  INFO: *simple_test registered 'proc_exit' event 
> for pid 1659*
> I [2013-05-03 17:28:40]  INFO: simple_test moved 'start' to 'up'
>
>
> If I then kill the process that god is supposed to be monitoring there is 
> no output from god and the simple_test.rb program is not restarted.  Here 
> is the output from killing the process.
>
> sparrow:simple_test mike$ rvmsudo god load simple_test.god 
> Sending 'load' command with action 'leave'
>
> The following tasks were affected:
>   simple_test
> $ ps -ef | grep simple_test.rb
>   501  *1659*     1   0  5:28pm ??         0:00.03 /usr/bin/ruby 
> /Users/mike/code/STAT-HQ/test/god/simple_test/simple_test.rb
>   501  1663   875   0  5:28pm ttys004    0:00.00 grep simple_test.rb
> $ kill *1659*
> $ ps -ef | grep simple_test.rb
>   501  1665   875   0  5:28pm ttys004    0:00.00 grep simple_test.rb
> $ ps -ef | grep simple_test.rb
>   501  1667   875   0  5:29pm ttys004    0:00.00 grep simple_test.rb
> $ ps -ef | grep simple_test.rb
>   501  1669   875   0  5:30pm ttys004    0:00.00 grep simple_test.rb
>
>
> I've read all the documentation and been through some of the source code 
> for god but I am not finding any obvious clues.  Does my god file look 
> reasonable?  Would anyone have any suggestions for other steps to take to 
> troubleshoot?
>
> As a postscript - I am able to get god to monitor and restart processes 
> using polling but I would prefer to use the event based technique.
>
> Thanks for any help.
>
> Regards,
> Mike
>
>
