Hello,

We've been running the latest version of ruote with ruote-mon and 2 workers 
for about 6 months now.

Over time, as our process definitions have become more complex and longer 
running, they have also become less reliable.

Many processes are getting 'stuck': they never enter the error state 
and also fail to respond to cancel.

I've been using ruote-kit to monitor and clean up these processes, which 
usually works.

In the case where a process is 'stuck' and I attempt to kill it, the 
process changes to the 'dying' state, and never gets removed from the list.

This seems to happen around calls to subprocesses where I attempt to use 
the ‘pass’ expression for on_error and on_timeout:

  cursor :timeout => '${v:timeout}', :on_timeout => :pass,
         :tag => 'wait_for_fqdn_discovery' do
    get_machine_fqdn
    sequence :unless => '${f:machine_fqdn}' do
      log 'waiting 60s' => '${f:machine.machine_id}'
      wait '60s'
      rewind
    end
  end

  refresh_state :on_error => 'pass'

Am I doing this incorrectly?
