Author: Sebastian Bergmann
Date: 2007-02-12 18:24:04 +0100 (Mon, 12 Feb 2007)
New Revision: 4643
Log:
- Sync with thesis paper.
Modified:
   experimental/Workflow/design/design.txt
Modified: experimental/Workflow/design/design.txt
===================================================================
--- experimental/Workflow/design/design.txt 2007-02-09 07:07:09 UTC (rev 4642)
+++ experimental/Workflow/design/design.txt 2007-02-12 17:24:04 UTC (rev 4643)
@@ -4,79 +4,122 @@
 :Revision: $Revision$
 :Date: $Date$
-Design Description
-==================
+Workflow Model
+==============
-An Activity-Based Workflow Management System answers the question
-"Who must do what, when and how". When we think of a workflow as a
-directed graph, then
+Activities and Transitions
+--------------------------
-  - Nodes represent activity steps (WHAT).
-
-  - Edges between nodes represent the flow (WHEN).
-
-  - Nodes can require data or decision input (WHO).
+The workflow model is activity-based [FG02]. The activities that are to be
+completed throughout the workflow and the transitions between them are mapped to
+the nodes and edges of a directed graph. This choice was made to facilitate the
+application of the Graph-Oriented Programming paradigm for the implementation.
+Using a directed graph as the foundation for the workflow model makes it
+possible to define the syntax of the workflow description language using the
+formalism of graph grammars [DQZ01].
-  - Nodes can have service objects (HOW) attached to them.
+Graph Traversal and Execution Strategy
+--------------------------------------
-This component provides an abstract virtual machine for
-Graph-Oriented Programming (GOP) with PHP. This virtual machine
-provides the building blocks (Workflow Patterns) for graph-based
-execution languages such as workflow definition languages. This
-separation of backend language and (optional) frontend languages
-accommodates the fact that there is no real standard workflow
-definition language.
+The execution of a workflow starts with the graph's only Start node.
A graph may
+have one or more End nodes that explicitly terminate the workflow execution.
-The backend language provides the necessary building blocks in the
-form of the ezcWorkflowNode classes. Objects of these classes can be
-connected to form an execution graph that can be executed. Such an
-execution graph can be interactive, which means that it requires user
-interaction at one or more points during execution, or
-non-interactive.
-When the execution of an interactive workflow reaches a node that
-requires user interaction, the execution enters a wait state and is
-suspended. It is resumed when the input from the user interaction is
-available.
+After a node has finished executing, it can activate one or more of its possible
+outgoing nodes. Activation adds a node to a set of nodes that are waiting for
+execution. During each execution step, a node from this set is executed. When
+the execution of a node has been completed, the node is removed from the set.
-The aforementioned building blocks resemble the so-called Workflow
-Patterns that form both a list of requirements for workflow languages
-as well as a vocabulary to compare workflow languages. They are
-discussed in [1].
+The workflow execution is implicitly terminated when no nodes are activated
+and no more nodes can be activated.
-Graph-Oriented Programming is an implementation technique for
-graph-based execution languages on top of an object-oriented
-programming language. It basically adds the concept of wait states
-to an object-oriented programming language and with that the ability
-to suspend and resume execution.
+State and Workflow Variables
+----------------------------
-The general theme for this component is extensibility and flexibility.
-The Workflow component in itself provides only in-memory definition
-and execution of workflows.
The saving of such an object-based graph
-definition to a storage container (relational database, XML document)
-and the loading of a workflow definition from a storage container into
-an object-based graph are delegated to tie-in components. The same
-applies to other aspects of a workflow management system such as
-persistence (keeping the state of an executing workflow). This loose
-coupling of components allows for building flexible workflow management
-solutions. Concerns such as logging the workflow execution can be
-implemented using ideas from Aspect-Oriented Programming.
+The workflow model supports state through the concept of workflow variables.
+Such a variable can either be requested as user input (from an Input node) or be
+set and manipulated through the VariableSet, VariableAdd, VariableSub,
+VariableMul, VariableDiv, VariableIncrement, and VariableDecrement nodes.
-The WorkflowDatabaseTiein component, for instance, provides the
-functionality needed to load and save workflow definitions from/to a
-relational database as well as a persistence implementation that uses
-a relational database based upon the Database component.
+While a VariableSet node may set the value of a workflow variable to any type
+that is supported by PHP, the other variable manipulation nodes only operate on
+numeric values.
-The idea of a set of loosely coupled components to build a workflow
-management system comes from [2].
-The idea of isolating the core of the workflow
-management system from the workflow description language is described
-in [3]. It means that a frontend workflow description language is
-"compiled" to a backend representation that can be executed by the
-workflow management system.
+Variables are bound to the scope of the thread in which they were defined. This
+allows parallel threads of execution to use variables of the same name without
+side effects.
-The diagram below shows the architecture of a workflow management
-system built using this component.
+When the execution of a workflow reaches an Input node (see above), the
+execution is suspended until such time as the user input has been provided and
+the execution can be resumed.
+Control Flow
+------------
+
+The control flow semantics of the workflow model draw upon the workflow
+patterns from [BK03]. The Sequence, Parallel Split, Synchronization, Exclusive
+Choice, Simple Merge, Multi-Choice, Synchronizing Merge, and Discriminator
+workflow patterns are all directly supported by the workflow model.
+
+Exclusive Choice and Multi-Choice nodes have branching conditions attached to
+them that operate on workflow variables to make their control flow decisions.
+
+The workflow model supports one workflow construct, loops, that is not based on
+a workflow pattern. Loops are represented by a pair of LoopStart and LoopEnd
+nodes that bracket the loop body. Both start and end node of a loop can have
+break conditions attached to them that operate on workflow variables. Inside the
+loop body, VariableAdd, VariableSub, VariableMul, VariableDiv,
+VariableIncrement, and VariableDecrement nodes can be used, for instance, to
+model for or while loops.
+
+Action Nodes and Service Objects
+--------------------------------
+
+So far we have only discussed nodes that control the flow and that can
+manipulate workflow variables. We are still missing a type of node that
+actually performs an activity. This is where the Action node comes into play.
+
+When the execution of a workflow reaches an Action node, the business logic of
+the attached service object is executed. Service Objects live in the domain of
+the application into which the workflow engine is embedded. They have read and
+write access to the workflow variables to interact with the rest of the
+workflow.
+
+Sub-Workflows
+-------------
+
+The workflow model supports sub-workflows to break down a complex workflow into
+parts that are easier to conceive, understand, maintain, and which can be
+reused.
+
+A sub-workflow is started when the respective Sub-Workflow node is reached
+during workflow execution. The execution of the parent workflow is suspended
+while the sub-workflow is executing. It is resumed once the execution of the
+sub-workflow has ended.
+
+
+Software Design
+===============
+
+Architecture
+------------
+
+The workflow engine has been designed and implemented as three loosely coupled
+components. The Workflow component provides an object-oriented framework to
+define workflows and an execution engine to execute them.
+The WorkflowDatabaseTiein and WorkflowEventLogTiein components tie the Database
+and EventLog components into the main Workflow component for persistence
+and monitoring, respectively.
+
+A workflow can be defined programmatically by creating and connecting objects
+that represent control flow constructs. The classes for these objects are
+provided by the Workflow Definition API. This API also provides the
+functionality to save workflow definitions (i.e. object graphs) to and load
+workflow definitions from a data storage. Two data storage backends have been
+implemented, one for relational database systems and another for XML files.
+Through the Workflow Execution API the execution of a workflow definition can be
+started (and resumed). The figure below shows the conceptual architecture for
+the workflow engine.
 +---------+     +---------+     +------------------------+
 |   GUI   | <=> |   XML   |     |    Mail, SOAP, ...     |
 +---------+     +---------+     +------------------------+
@@ -99,284 +142,172 @@
 |                    Data Storage                     |
 +-----------------------------------------------------+
-1a. A workflow definition can be loaded from an XML document that
-    uses one of the many workflow markup languages.
This XML
-    representation of the workflow needs to be "compiled" into an object
-    graph representation.
+The idea that a workflow system should be comprised of loosely coupled
+components is discussed, for instance, in [DAM01,DG95,PM99]. Manolescu
+states that
-1b. A workflow definition can be created using a GUI that creates an
-    object graph using the Workflow Definition API provided by the
-    component.
+  "an object-oriented workflow architecture must provide abstractions
+  that enable software developers to define and enact how the work flows through
+  the system" [DAM01].
-2. The object graph representing the workflow definition is saved for
-   later execution to the data storage.
-
-3. Using the Workflow Execution API the execution of a workflow is
-   started. The workflow definition is read from the data storage and
-   the corresponding object graph is created.
-
-   The execution starts and continues until a wait state is reached
-   (and no other state can be executed). The state of the execution
-   is then saved to the data storage and the execution is suspended.
+Like Manolescu's Micro-Workflow architecture, the Workflow components
+encapsulate workflow features in separate components following the
+Microkernel pattern which
-4a. The Workflow Execution API exposes functionality to check whether
-    any workflow that is currently suspended is waiting for input from
-    a given user. The input can then be requested via a web form, for
-    instance.
+  "applies to software systems that must be able to adapt to changing
+  system requirements. It separates a minimal functional core from extended
+  functionality and customer-specific parts. The microkernel also serves as a
+  socket for plugging in these extensions and coordinating their
+  collaboration" [FB96].
-4b. Alternatively, a node in the workflow graph that requires user
-    interaction can send an email with links that act as input triggers.
-    Clicking on such a link provides the necessary information to
-    continue the workflow execution.
+The minimalistic core of the Workflow component provides basic workflow
+functionality:
+- The Workflow Definition API implements an activity-based workflow model and
+  provides the abstractions required to build workflows.
+- The Workflow Execution API implements the functionality to execute workflows.
-Main Classes
-============
+On top of these core components other components, for instance for persistence
+(suspending and resuming workflow execution), monitoring (status of running
+workflows), history (history of executed workflows), and worklist management
+(human-computer interface), can be implemented. Each of these components
+encapsulates a design decision and can be customized or replaced.
-- ezcWorkflow
-  An object of this class represents a workflow.
+Workflow Virtual Machine
+------------------------
-  Creating a new ezcWorkflow object automatically creates objects
-  of the ezcWorkflowNodeStart and ezcWorkflowNodeEnd classes (see
-  below).
+Given the fact that standardization efforts, e.g. XPDL [WfMC05] proposed by the
+Workflow Management Coalition, have essentially failed to gain universal
+acceptance [WA04], the problem of developing a workflow system that supports
+changes in the workflow description language needs to be addressed.
-    $workflow = new ezcWorkflow;
-    $start = $workflow->getStartNode();
-    $end = $workflow->getEndNode();
+Fernandes et al. propose to
-- ezcWorkflowNode
+  "split the workflow system into two layers: (1) a layer implementing a
+  Workflow Virtual Machine, which is responsible for most of the workflow system
+  activities; and (2) a layer where the different workflow description languages
+  are handled, which is responsible for making the mapping between each workflow
+  description language and the Workflow Virtual Machine" [SF04].
-  This is the abstract base class for the individual node classes.
-  It provides the necessary functionality to connect node objects
-  to an object graph that represents a workflow.
+The Workflow component's Workflow Execution API implements such a workflow
+virtual machine and isolates the executing part of a workflow management system,
+the backend, from the parts that users interact with, the frontend. This
+isolation allows for the definition of a backend language to describe exactly
+the workflows that are supported by the executer and its underlying workflow
+model. This backend language is not the workflow description language users use
+to define their workflows. They use frontend languages that can be mapped to the
+system's backend language.
-    $a->addOutNode( $b )
-  and
+Graph-Oriented Programming
+--------------------------
-    $b->addInNode( $a )
+Graph-Oriented Programming [JBOSS] implements the graphical representation and
+the wait states of a process language in an object-oriented programming
+language. The former can be achieved by providing a framework of node classes.
+Objects of these classes represent the nodes in the process graph, relations
+between these objects represent the edges. Such an object graph can then be
+traversed for execution. These executions need to be persistable, for instance
+in a relational database, to support the wait states.
-  are semantically equivalent in that they connect the $a and $b nodes
-  in such a way that when the execution leaves the $a node the $b node
-  is executed next. This implements the first workflow pattern,
-  Sequence.
+The aforementioned node classes implement the Command design pattern [GoF94] and
+encapsulate an action and its parameters.
-  The removeInNode( $node ) and removeOutNode( $node ) methods are provided
-  to disconnect nodes.
+The executing part of the workflow engine is implemented in an Execution class.
+An object of this class represents a workflow in execution. The execution object
+has a reference to the current node.
When the execution of a workflow is
+started, a new execution object is created and the current node is set to the
+workflow's start node. The execute() method that is to be provided by the node
+classes is not only responsible for executing the node's action, but also for
+propagating the execution: a node can pass the execution that arrived in the
+node over one of its leaving transitions to the next node.
-  The verify() method checks whether the node is valid in that it matches
-  the constraints declared in the corresponding node class' declaration.
+Like Fowler in [MF05], the authors of the JBoss jBPM manual [JBOSS] acknowledge
+the fact that current software development relies more and more on domain
+specific languages. They see Graph-Oriented Programming as a means to implement
+domain specific languages that describe how graphs can be defined and executed
+on top of an object-oriented programming language.
-  The ezcWorkflowNodeStart class (see below), for instance, declares
+In this context, a process language (like a workflow description language) is
+nothing more than a set of Node classes. The semantics of each node are defined
+by the implementation of the execute() method in the respective node class. This
+language can be used as the backend language of a Workflow Virtual Machine. In
+this language, the workflow is represented as a graph of command objects.
-    protected $minInNodes = 0;
-    protected $maxInNodes = 0;
-    protected $minOutNodes = 1;
-    protected $maxOutNodes = 1;
+One of the advantages of using a domain specific language that Fowler gives in
+[MF05] regards the involvement of lay programmers: domain experts who are not
+professional programmers but program in domain specific languages as part of the
+development effort. In essence this means that a software system that provides a
+domain specific language can be customized and extended without knowledge of the
+underlying programming language that was used to implement it.
-  to express that it needs exactly one outgoing node (and no incoming
-  node) to be valid.
-  - ezcWorkflowNodeStart
+Summary
+-------
-    An object of this class marks the start node of a workflow. A
-    workflow contains exactly one start node.
+The core of the workflow engine is a virtual machine that executes workflows
+represented through object graphs. These object graphs can be created
+programmatically through the software component's Workflow Definition API.
+Alternatively, a workflow definition can be loaded from an XML file. Object
+graph and XML file are two different representations of a workflow definition
+that uses the so-called backend language of the workflow engine's core.
+Arbitrary frontend languages such as the XML Process Definition Language (XPDL)
+[WfMC05], for instance, can be mapped to the workflow engine's backend language.
-  - ezcWorkflowNodeEnd
-    An object of this class marks an end node of a workflow. A
-    workflow contains at least one end node.
+Bibliography
+============
-  - ezcWorkflowNodeAction
+- [BK03] Bartosz Kiepuszewski.
+  Expressiveness and Suitability of Languages for Control Flow Modelling in Workflows.
+  PhD Thesis, Faculty of Information Technology, Queensland University of Technology, Australia, 2003.
-    This type of node executes a user-defined action. Such an action
-    is encapsulated by a class that implements the
-    ezcWorkflowServiceObject interface.
+- [DAM01] Dragos-Anton Manolescu.
+  Micro-Workflow: A Workflow Architecture Supporting Compositional Object-Oriented Software Development.
+  PhD Thesis, Department of Computer Science, University of Illinois at Urbana-Champaign, USA, 2001.
-      class PrintSomething implements ezcWorkflowServiceObject
-      {
-          public function execute( ezcWorkflowExecution $execution )
-          {
-              print 'something';
-          }
+- [DG95] Dimitrios Georgakopoulos and Mark F. Hornick and Amit P. Sheth.
+  An Overview of Workflow Management: From Process Modeling to Workflow Automation Infrastructure.
+  In: Distributed and Parallel Databases, Volume 3, Number 2, Pages 119--153, 1995.
-          public function __toString()
-          {
-              return 'PrintSomething';
-          }
-      }
+- [DQZ01] Da-Qian Zhang and Kang Zhang and Jiannong Cao.
+  A Context-Sensitive Graph Grammar Formalism for the Specification of Visual Languages.
+  In: The Computer Journal, Volume 33, Number 3, Pages 186--200, 2001.
-      $print = new ezcWorkflowNodeAction( 'PrintSomething' );
+- [FB96] Frank Buschmann and Regine Meunier and Hans Rohnert and Peter Sommerlad and Michael Stahl.
+  Pattern-Oriented Software Architecture -- A System of Patterns.
+  John Wiley & Sons, 1996.
-      $start->addOutNode( $print );
-      $end->addInNode( $print );
+- [FG02] Florent Guillaume.
+  Trying to unify Entity-based and Activity-based workflows.
+  http://wiki.zope.org/zope3/TryingToUnifiyWorkflowConcepts
-  - ezcWorkflowNodeInput
+- [GoF94] Erich Gamma and Richard Helm and Ralph Johnson and John Vlissides.
+  Design Patterns: Elements of Reusable Object-Oriented Software.
+  Addison-Wesley, 1994.
-    This type of node is used to model user interaction.
+- [JBOSS] The JBoss Project.
+  JBoss jBPM: Workflow and BPM Made Practical.
+  http://docs.jboss.com/jbpm/v3/userguide/graphorientedprogramming.html
-    The example below creates an ezcWorkflowNodeInput node that expects
-    an input field named "choice". The value of this field has to be
-    boolean.
+- [MF05] Martin Fowler.
+  Language Workbenches: The Killer-App for Domain Specific Languages?
+  June, 2005.
+  http://martinfowler.com/articles/languageWorkbench.html
-      $input = new ezcWorkflowNodeInput(
-        array(
-          'choice',
-          new ezcWorkflowNodeInputConstraintBoolean
-        )
-      );
+- [PM99] Peter Muth and Jeanine Weisenfels and Michael Gillmann and Gerhard Weikum.
+  Integrating Light-Weight Workflow Management Systems within Existing Business Environments.
+  In: Proceedings of the 15th International Conference on Data Engineering, March 1999, Sydney, Australia.
-    When this node is executed it will check whether the "input" field
-    is available (in the ezcWorkflowExecution object). If that is not the
-    case then the workflow execution is suspended.
+- [SF04] Sérgio Fernandes and João Cachopo and António Rito-Silva.
+  Supporting Evolution in Workflow Definition Languages.
+  In: Proceedings of the 20th Conference on Current Trends in Theory and Practice of Computer Science (SOFSEM 2004), Springer-Verlag, 2004.
-  - ezcWorkflowNodeParallelSplit
+- [WA04] W. M. P. van der Aalst and L. Aldred and M. Dumas and A. H. M. ter Hofstede.
+  Design and Implementation of the YAWL System.
+  In: Proceedings of the 16th International Conference on Advanced Information Systems Engineering (CAiSE 2004), June 2004, Riga, Latvia.
-  - ezcWorkflowNodeBranch
-
-    This class implements the Parallel Split (AND-Split) workflow pattern.
-
-    A single thread of control splits into multiple threads of control
-    which can be executed in parallel.
-
-    The example below creates an ezcWorkflowNodeParallelSplit node that
-    unconditionally activates two nodes, $foo and $bar, when it is reached.
-
-      $branch = new ezcWorkflowNodeParallelSplit;
-      $branch->addInNode( $start );
-      $branch->addOutNode( $foo );
-      $branch->addOutNode( $bar );
-
-  - ezcWorkflowNodeMultiChoice
-
-    This class implements the Multi-Choice (OR-Split) workflow pattern.
-
-    Based on a decision or workflow control data, a number of several
-    branches are chosen.
-
-  - ezcWorkflowNodeExclusiveChoice
-
-    This class implements the Exclusive Choice (XOR-Split) workflow pattern.
-
-    Based on a decision or workflow control data, one of several
-    branches is chosen.
-
-    This is a special case of the Multi-Choice (OR-Split) pattern.
-
-    The example below creates an ezcWorkflowNodeBranch node that
-    conditionally activates one of two nodes, $true and $false,
-    when it is reached.
-
-      $branch = new ezcWorkflowNodeExclusiveChoice;
-      $branch->addInNode( $input );
-
-      $branch->addConditionalOutNode(
-        new ezcWorkflowConditionIsTrue(
-          'choice'
-        ),
-        $true
-      );
-
-      $branch->addConditionalOutNode(
-        new ezcWorkflowConditionIsFalse(
-          'choice'
-        ),
-        $false
-      );
-
-    Branching conditions are expressed using ezcWorkflowCondition objects.
-    These operate on fields ('choice' in the example above) that have been
-    filled by a previously executed ezcWorkflowNodeInput node.
-
-    Below is a list of the available ezcWorkflowCondition classes:
-
-    - ezcWorkflowConditionIsEqual( String $inputField, Mixed $value )
-    - ezcWorkflowConditionIsNotEqual( String $inputField, Mixed $value )
-    - ezcWorkflowConditionIsGreaterThan( String $inputField, Mixed $value )
-    - ezcWorkflowConditionIsEqualOrGreaterThan( String $inputField, Mixed $value )
-    - ezcWorkflowConditionIsLessThan( String $inputField, Mixed $value )
-    - ezcWorkflowConditionIsEqualOrLessThan( String $inputField, Mixed $value )
-    - ezcWorkflowConditionIsTrue( String $inputField )
-    - ezcWorkflowConditionIsFalse( String $inputField )
-
-    These can be combined using the following boolean operators:
-
-    - ezcWorkflowConditionNot( ezcWorkflowCondition $condition )
-    - ezcWorkflowConditionAnd( ezcWorkflowCondition[] $conditions )
-    - ezcWorkflowConditionOr( ezcWorkflowCondition[] $conditions )
-    - ezcWorkflowConditionXor( ezcWorkflowCondition[] $conditions )
-
-    ezcWorkflowNodeParallelSplit, ezcWorkflowNodeMultiChoice, and
-    ezcWorkflowNodeExclusiveChoice nodes can have both conditional and
-    unconditional outgoing nodes.
-
-  - ezcWorkflowNodeSynchronization
-
-    This class implements the Synchronization workflow pattern.
-
-    Multiple parallel activities converge into one single thread
-    of control.
-
-  - ezcWorkflowNodeSimpleMerge
-
-    This class implements the Simple Merge workflow pattern.
-
-    Two or more alternative branches come together without
-    synchronization.
-
-- ezcWorkflowExecution
-
-  ezcWorkflowExecution is the abstract base class for workflow executers. It
-  provides the main execution loop as well as extension points for starting,
-  suspending, resuming, and ending the execution of a workflow.
-
-  ezcWorkflowExecutionNonInteractive is an implementation of
-  ezcWorkflowExecution that can execute workflows that require no input and
-  that do not have sub-workflows.
-
-  ezcWorkflowDatabaseExecution (from the WorkflowDatabaseTiein component) is
-  able to execute workflows that require input and that have sub-workflows
-  by suspending workflows to and resuming workflows from a relational
-  database, respectively.
-
-- ezcWorkflowVariableHandler
-
-  ezcWorkflowVariableHandler is an interface that can be implemented to handle
-  custom variables that should be available during the execution of the
-  workflow, for instance for use in branch nodes.
-
-    class UserHandler implements ezcWorkflowVariableHandler
-    {
-        public function load( $variableName )
-        {
-            // ...
-        }
-
-        public function save( $variableName, $value )
-        {
-            // ...
-        }
-    }
-
-    $workflow = new ezcWorkflow;
-    $workflow->addVariableHandler(
-      'user_id',
-      'UserHandler'
-    );
-
-- ezcWorkflowExecutionListener
-
-  ezcWorkflowExecutionListener is the interface that is to be implemented
-  by objects that want to observe the execution of a workflow.
-
-  ezcWorkflowEventLogListener (from the WorkflowEventLogTiein component) is
-  the reference implementation of this interface.
-
---
-[1] W.M.P. van der Aalst, A.H.M. ter Hofstede, B. Kiepuszewski,
-    and A.P. Barros. Workflow Patterns
-[2] Dragos A. Manolescu. An Extensible Workflow Architecture with
-    Objects and Patterns.
-[3] Sérgio Miguel Fernandes, João Cachopo, and António Rito Silva.
-    Supporting Evolution in Workflow Design Languages.
+- [WfMC05] Workflow Management Coalition.
+  Workflow Process Definition Interface -- XML Process Definition Language (XPDL).
+  Document Number WFMC-TC-1025, 2005.
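
The "Graph Traversal and Execution Strategy" section added in this revision describes execution as a loop over a set of activated nodes. The following is only a minimal sketch of that idea, not the component's implementation: SketchNode and runSketch() are hypothetical names invented for illustration, the activation set is processed in first-in, first-out order (the design text does not prescribe an order), and synchronization of parallel branches is deliberately left out.

```php
<?php
// Hypothetical sketch: SketchNode and runSketch() are illustrative names only;
// they are not classes or functions of the Workflow component.
class SketchNode
{
    public $name;
    public $isEnd;
    public $outNodes = array();

    public function __construct( $name, $isEnd = false )
    {
        $this->name  = $name;
        $this->isEnd = $isEnd;
    }

    public function addOutNode( SketchNode $node )
    {
        $this->outNodes[] = $node;
    }
}

function runSketch( SketchNode $start )
{
    $activated = array( $start ); // set of nodes waiting for execution
    $trace     = array();         // names of executed nodes, in order

    // Implicit termination: the loop ends when no node is activated any more.
    while ( count( $activated ) > 0 )
    {
        // Take one node from the set; it leaves the set as it is executed.
        $node    = array_shift( $activated );
        $trace[] = $node->name;

        // Explicit termination: an End node stops the execution.
        if ( $node->isEnd )
        {
            break;
        }

        // Activate the outgoing nodes; a node is only added once (set semantics).
        foreach ( $node->outNodes as $next )
        {
            if ( !in_array( $next, $activated, true ) )
            {
                $activated[] = $next;
            }
        }
    }

    return $trace;
}

$start = new SketchNode( 'Start' );
$a     = new SketchNode( 'A' );
$b     = new SketchNode( 'B' );
$end   = new SketchNode( 'End', true );

$start->addOutNode( $a );
$start->addOutNode( $b );
$a->addOutNode( $end );
$b->addOutNode( $end );

print implode( ' -> ', runSketch( $start ) ) . "\n"; // prints "Start -> A -> B -> End"
```

In this sketch every outgoing node is activated unconditionally. A node type such as Exclusive Choice would instead activate only the outgoing nodes whose branching conditions hold.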
-- svn-components mailing list svn-components@lists.ez.no http://lists.ez.no/mailman/listinfo/svn-components