Hi Peter, thanks for your work on the forking portion of the Workflow engine. Your pain is shared; we are experiencing the same problems in our installation.
Overall, OpenXPKI is working properly, but we do see some memory leaks from left-over SHM segments that need to be cleaned up now and then. We also share the problem that once a parent dies, the child's notification of termination simply does not reach its destination. Likewise, a server shutdown (or an unexpected error) can leave workflows dangling in an unfinished state, but that is a different story. Let's focus on the synchronization problem first. The current implementation can lead to situations where the parent is left in the WAITING_FOR_CHILD state while the child has already finished. In addition, forking workflows increases the server load (both CPU and memory usage).

A few days ago I thought about exactly the same problem and had an idea how to solve it. Let's go back to the reason why we implemented forked workflows: we wanted to be able to reuse existing workflows (i.e. cert issuance and publication) in other workflows without rewriting most of the workflow in a new workflow definition. The workflow engine does not support "sub-procedure" workflows, hence we developed the forking concept.

Now, what would stop us from implementing the sub-procedure semantics ourselves? I think the following should work: write a new Workflow Activity (similar to ForkWorkflowInstance); let's call it ExecuteWorkflowInstance. In the context parameters of ExecuteWorkflowInstance we pass the desired workflow type name and possibly some mapping instructions describing which context values of the parent workflow should be passed to the child workflow. The activity then creates a new workflow instance of the desired type and calls the necessary activities in the sub-instance. The sub-instance will likely contain mostly autorun states (as is the case with the current candidates for forked instances), but this would not strictly be necessary.
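To make the idea concrete, here is a minimal sketch of what such a synchronous sub-procedure call could look like. It is written in Python for brevity (OpenXPKI itself is Perl), and every class, method, and context key here is a hypothetical illustration of the concept, not the real API:

```python
class Workflow:
    """Toy stand-in for a workflow instance."""

    def __init__(self, wf_type, context):
        self.wf_type = wf_type
        self.context = context        # key/value context parameters
        self.state = "INITIAL"
        self.parent_id = None         # mirrors the workflow_parent_id linkage

    def run_autorun_states(self):
        # Stand-in for driving the instance through its autorun states.
        self.state = "FINISHED"


class ExecuteWorkflowInstance:
    """Synchronous 'sub-procedure' call: no fork, no shared memory."""

    def __init__(self, child_type, context_map):
        self.child_type = child_type
        # mapping instructions: parent context key -> child context key
        self.context_map = context_map

    def execute(self, parent):
        # Copy only the mapped context values into the child instance.
        child_context = {child_key: parent.context[parent_key]
                         for parent_key, child_key in self.context_map.items()}
        child = Workflow(self.child_type, child_context)
        child.parent_id = id(parent)  # keep the parent/child link for the GUI

        # Drive the sub-instance to completion in-process. The parent
        # blocks until the child is done: implicit serialization.
        child.run_autorun_states()
        if child.state != "FINISHED":
            raise RuntimeError("sub-workflow stalled in state " + child.state)
        return child
```

Usage would be something like `ExecuteWorkflowInstance("certificate_issuance", {"cert_profile": "profile"}).execute(parent)`, after which the parent can proceed to its next state. Because the child runs in the parent's process, there is no WAITING_FOR_CHILD state and nothing to synchronize via SHM.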
The ExecuteWorkflowInstance activity could handle calling activities on the sub-instance itself. Once the sub-workflow finishes, the ExecuteWorkflowInstance activity is done and allows proceeding to the next top-level workflow state.

Advantages:
- should not be too hard to implement (one additional workflow activity class plus configuration changes)
- we could continue to use the existing semantics that link workflows (workflow_parent_id), so the GUI should continue to work as before
- if designed properly, we could even support nested workflow calls (the activity execution logic needs to be designed carefully)
- no more hassles with shared memory
- no more parallel, out-of-band execution -> implicit serialization -> possibly a higher number of workflows running in parallel

What do you think?

A second, related task will be some logic to clean up or re-instantiate stalled workflows (e.g. after a server shutdown or an unexpected error). I see problems with automatic detection and reaction - how do you determine whether a workflow has been left in a stalled state deliberately or whether it stalled due to an error...?

We're also thinking about rebuilding the whole dang user frontend. Let's face it, the web UI sucks and badly needs to be redone... :)

cheers
Martin

_______________________________________________
OpenXPKI-devel mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/openxpki-devel
