On Fri, Jan 20, 2012 at 12:11 PM, John Mettraux <[email protected]> wrote:
>
> I hope this helps, best regards,
>
>
Thanks for the suggestion, John. We have tried that, but it doesn't seem to
be the problem.

We have done some digging in the ruote code and pinned down why we cannot
'reapply' some expressions.  The problem is that some expressions list
"children" that don't exist in the storage.  When ruote tries to re-apply
such an expression, it first issues a "cancel", which propagates down
through the children.  It then attempts to cancel a child that does not
exist, and a missing child can never issue a reply_to_parent, so the cancel
(and therefore the re-apply) never completes.

We were able to fix our problematic workitems by writing some code that
walks each expression's children and checks whether each one really exists
in the storage.  Something like the following, which I'm sure is rough:

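  # Recursively drops children that are listed on an expression but are
  # missing from the storage, then persists the cleaned-up expression.
  def remove_nonexistent_children(exp)
    cs = exp.children
    ok = cs.select do |child|
      child_exp = RuoteKit.engine.fetch_flow_expression(child)
      if child_exp
        # the child exists: clean up its own children, then keep it
        remove_nonexistent_children(child_exp)
        true
      else
        puts "Removing : #{child}"
        false
      end
    end

    # only write back when something was actually removed
    if ok != cs
      exp.h["children"] = ok
      exp.persist_or_raise
    end
  end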

  Document.stalled.each do |doc|
    puts "Re-applying document #{doc.id}"
    fexp = RuoteKit.engine.process(doc.wfid).expressions.last
    remove_nonexistent_children(fexp)
    RuoteKit.engine.re_apply(fexp.fei)
  end
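
The same idea can be tried out without a ruote engine.  The sketch below is
only a toy model, not ruote code: a plain Hash stands in for the storage,
and the names sample_storage, cancellable? and prune_missing_children! are
made up for illustration.  It assumes that a cancel can only complete when
every listed child exists and can reply, which is how the behaviour
described above appears to work.

```ruby
# Toy model: keys are expression ids, values list the child ids that the
# expression claims to have. This is NOT the ruote storage format.
def sample_storage
  {
    'root' => { children: %w[a b] },   # 'b' has no entry in the storage
    'a'    => { children: %w[a1 a2] }, # 'a2' has no entry in the storage
    'a1'   => { children: [] }
  }
end

# In this model a cancel completes only if every listed child exists and
# can itself be cancelled (a missing child never replies to its parent).
def cancellable?(storage, id)
  storage.fetch(id)[:children].all? do |child|
    storage.key?(child) && cancellable?(storage, child)
  end
end

# Recursively removes child ids that have no entry in the storage.
# Returns the list of ids that were removed.
def prune_missing_children!(storage, id)
  exp = storage.fetch(id)
  present, missing = exp[:children].partition { |child| storage.key?(child) }
  exp[:children] = present unless missing.empty?
  present.reduce(missing) do |removed, child|
    removed + prune_missing_children!(storage, child)
  end
end

storage = sample_storage
puts "cancellable before: #{cancellable?(storage, 'root')}"
removed = prune_missing_children!(storage, 'root')
puts "removed: #{removed.inspect}"
puts "cancellable after: #{cancellable?(storage, 'root')}"
```

Before pruning, the model reports that 'root' cannot be cancelled; after the
dangling ids are dropped, it can.  This mirrors why re_apply only worked for
us once the nonexistent children had been removed.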




The root cause of the problem is still unclear.  The evidence suggests that
some messages are lost by the redis backend, although I don't know where.
We have noticed that when we attempt to launch many workitems at once, some
invariably go missing.  We have run tests using sequel as a backend and
this seems to solve that issue.  We also tried setting 'do_not_thread' on
all our participants, which reduced the problem somewhat but didn't solve
it completely.

At this stage we are going to migrate to the sequel backend instead.
Thanks for your help.

Cheers,
--
David Powell

-- 
you received this message because you are subscribed to the "ruote users" group.
to post : send email to [email protected]
to unsubscribe : send email to [email protected]
more options : http://groups.google.com/group/openwferu-users?hl=en
