On Dec 16, 12:54 pm, John Mettraux <[email protected]> wrote:
> On Thu, Dec 15, 2011 at 03:26:28PM -0800, David Powell wrote:
> >
> > We are using ruote-kit with ruote-redis and seem to be hitting a bug.
> > We have a simple participant that touches a file, waits 20s, then
> > "rewinds". This works fine for a while, but then usually once a day
> > it stops. The "/_ruote" endpoint shows the "process" still exists,
> > but there is no entry for it in "workitems" nor "schedule", and
> > nothing is reported on the "error" page.
> >
> > Has anyone else experienced a similar problem? Any suggestions on how
> > I should go about tracking down the issue?
>
> Hello David,
>
> welcome to ruote's mailing list.
>
> It'd be interesting to know exactly where it stops. It sounds like a "msg"
> got lost.
>
> Sometimes it happens with corner cases in participant implementation, like in
>
> ---8<---
> class TouchParticipant
>   include Ruote::LocalParticipant
>
>   def consume(workitem)
>     FileUtils.touch(workitem.fields['filename'])
>     reply_to_engine(workitem)
>   rescue
>     puts "something went wrong"
>   end
> end
> --->8---
>
> If anything goes wrong in that participant, a message will be emitted to the
> STDOUT and reply_to_engine will not be seen.
>
> Maybe it stops around your participant implementation. Placing some logging
> output may help.
>
> It could also happen for some mysterious reason (defect in my implementation
> of ruote-redis for example), helping determine where it really stops would
> help.
>
> A way of fixing that is to re-apply the expression that was supposed to
> reply. In ruote-kit this can be done from the dashboard (see attached image).
> It's only a repair trick, it requires admin intervention.
>
> I'd be glad to help determining the cause.
>
> Best regards,

Thanks for the prompt reply, John.

I added the rescue+logging as you suggested to our TouchParticipant.
However, the workitem still just "stops" and the log is not triggered, so
it seems something else is going wrong.

The re-apply trick does get it working again, but is obviously not ideal.
I'm keen to try to pin down the problem. Unfortunately, it only occurs
about once every 24 hours, so debugging will be a pain. Do you have any
suggestions on where to start looking?

Cheers,

--
David Powell
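
For anyone following the thread, here is a rough sketch of the kind of logging discussed above. It logs when consume starts, when it finishes, and any exception, then re-raises instead of swallowing the error. The names ConsumeTracing, trace_consume, TRACE_LOG and TracedTouchParticipant are made up for illustration and are not part of ruote. It also assumes that workitem.fei.sid gives a usable identifier and that an exception raised out of consume ends up on the "error" page; neither has been checked against this setup.

```ruby
require 'logger'
require 'fileutils'

# Hypothetical helper, not part of ruote: wraps a block of participant work
# with start/done/failure log lines and re-raises any error it sees.
module ConsumeTracing
  def trace_consume(logger, workitem_id)
    logger.info("consume start #{workitem_id}")
    result = yield
    logger.info("consume done #{workitem_id}")
    result
  rescue StandardError => e
    logger.error("consume failed #{workitem_id}: #{e.class}: #{e.message}")
    raise # do not swallow the error, let the caller see it
  end
end

# Hypothetical shared logger; a log file may be more practical for a problem
# that only shows up once a day.
TRACE_LOG = Logger.new($stdout)

# Only defined when ruote is loaded, so the helper above can be tried alone.
if defined?(Ruote::LocalParticipant)
  class TracedTouchParticipant
    include Ruote::LocalParticipant
    include ConsumeTracing

    def consume(workitem)
      # Assumption: workitem.fei.sid identifies the expression being applied.
      trace_consume(TRACE_LOG, workitem.fei.sid) do
        FileUtils.touch(workitem.fields['filename'])
        reply_to_engine(workitem)
      end
    end
  end
end
```

If a "consume start" line shows up without a matching "consume done" or "consume failed" line, the stall is inside the participant. If both lines are present for the last run, the reply was handed over and the problem is more likely further down (engine, storage or scheduling).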
