Re: Experience with workflow at Hippo Webworks

Johan Stuyts Mon, 08 Mar 2004 02:13:01 -0800

On Fri, 05 Mar 2004 17:18:53 -0500, Stefano Mazzocchi <[EMAIL PROTECTED]> wrote:

Johan Stuyts wrote:

Experience with workflow at Hippo Webworks
==========================================

At Hippo we used OSWorkflow to implement a workflow solution in a demo.
Below are our experiences.

As people with different levels of experience are interested in workflow
I will start with a (very) brief introduction to workflow.

Workflow introduction
---------------------
Very simply put, workflow serves two purposes:
- to determine who can do what at which time with an object;
- to generate a list of pending tasks for users.

An example of the first is that an editor (who) can only publish (do
what) a document (an object) after a writer has asked for a review (at
which time).

The list of documents to be reviewed is an example of a pending task
list for an editor.

Each object type can have its own specific workflow.

The demo workflow
-----------------
The demo we created has a workflow for a basic document type, a web
page. I have attached a diagram of it.

A document gets created by a writer. The writer is not allowed to
publish his document directly; he has to ask the editor for review.

The editor can easily review documents because we generate a list of
documents waiting for review. The editor can click on the document and
can either approve or disapprove. If the document gets approved it is
published on the public server.

If the document gets disapproved the writer cannot ask for a review
without editing it first. Editing the document when it has been approved
will bring the document back to the editing state too. After making his
changes the user can ask for a review of the new version.
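
The demo workflow described above can be sketched as a small state
machine. This is only an illustration, not the OSWorkflow configuration
we used: all names (Role, Action, DocState, DocumentWorkflow) are made
up, and the sketch assumes that a writer cannot edit a document while it
is waiting for review, which the description does not specify.

```java
import java.util.EnumSet;
import java.util.Set;

// Hypothetical names throughout; this is not OSWorkflow code.
enum Role { WRITER, EDITOR }

enum Action { EDIT, REQUEST_REVIEW, APPROVE, DISAPPROVE }

enum DocState { EDITING, WAITING_FOR_REVIEW, PUBLISHED, DISAPPROVED }

final class DocumentWorkflow {
    private DocState state = DocState.EDITING; // a new document starts out being edited

    DocState state() { return state; }

    // Who can do what at which time: the answer depends on role and current state.
    Set<Action> enabledActions(Role role) {
        Set<Action> actions = EnumSet.noneOf(Action.class);
        if (role == Role.WRITER) {
            // Assumption: no editing while the editor is reviewing.
            if (state != DocState.WAITING_FOR_REVIEW) actions.add(Action.EDIT);
            // A disapproved document must be edited before a new review request.
            if (state == DocState.EDITING) actions.add(Action.REQUEST_REVIEW);
        } else if (state == DocState.WAITING_FOR_REVIEW) {
            actions.add(Action.APPROVE);
            actions.add(Action.DISAPPROVE);
        }
        return actions;
    }

    void invoke(Role role, Action action) {
        if (!enabledActions(role).contains(action)) {
            throw new IllegalStateException(role + " cannot " + action + " in state " + state);
        }
        switch (action) {
            case EDIT: state = DocState.EDITING; break;
            case REQUEST_REVIEW: state = DocState.WAITING_FOR_REVIEW; break;
            case APPROVE: state = DocState.PUBLISHED; break;
            case DISAPPROVE: state = DocState.DISAPPROVED; break;
        }
    }
}
```

The enabledActions(...) method is the part a frontend would use to build
its menu; invoke(...) refuses anything that is not in that set.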

Implementation
--------------
For the document repository we use Slide. For the workflow engine we
used OSWorkflow. We connected these two using Slide interceptors.

wow, supercool!! I want it :-)

When a document is created the interceptor checks to see whether a
workflow already exists. It does this by retrieving the workflow ID from
a WebDAV property of the document. If it doesn't exist a new workflow is
created in the workflow store.
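
As a rough sketch of that check (not the actual Slide interceptor or
OSWorkflow calls): the WorkflowStore and WorkflowInterceptor classes, the
"workflow-id" property name and the "editing" initial state below are all
hypothetical, and a plain Map stands in for the WebDAV properties of the
document.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-ins: neither class is part of the Slide or OSWorkflow API.
final class WorkflowStore {
    private final Map<Long, String> stateById = new HashMap<>();
    private long nextId = 1;

    long createInstance(String initialState) {
        long id = nextId++;
        stateById.put(id, initialState);
        return id;
    }

    int size() { return stateById.size(); }
}

final class WorkflowInterceptor {
    static final String WORKFLOW_ID_PROPERTY = "workflow-id"; // made-up property name

    private final WorkflowStore store;

    WorkflowInterceptor(WorkflowStore store) { this.store = store; }

    // Called before a document is stored; 'properties' stands in for its WebDAV properties.
    long ensureWorkflow(Map<String, String> properties) {
        String existing = properties.get(WORKFLOW_ID_PROPERTY);
        if (existing != null) {
            return Long.parseLong(existing); // already attached, nothing to create
        }
        long id = store.createInstance("editing");
        properties.put(WORKFLOW_ID_PROPERTY, Long.toString(id));
        return id;
    }
}
```

Because the check is idempotent, it does no harm if the interceptor is
called more than once for the same store operation.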


Interesting terminology you use here: let me ask you this before we get
confused: "workflow" is for you an instance of the model or the model
itself?

I use the same term for both the model and the instance :">

When our frontend retrieves the tree of documents, the interceptor will
retrieve the workflow for each document.


Seems to be the instance. Ok, careful though, because normally people
refer to workflow as the "model", not the instance.

I will be more explicit in further messages.

Looking at the role of the user, the interceptor determines which
actions are enabled. The enabled actions (including their display text
and activation URLs) are set in a WebDAV property of the document.

For the generation of the pending task list we used the OSWorkflow query
API to find the documents which are in the waiting-for-review state.
The approve and disapprove actions are passed to the frontend in the
same way as the commands for a writer.

Not all actions are directly shown in the menu, because the user invokes
them implicitly. The edit action for example is invoked by the
interceptor each time the user saves the document.

Issues
------
We encountered issues with both Slide and OSWorkflow during the
implementation.

Before we used Slide, we used the Cocoon repository. The semantics of
the repository interceptors and the Slide interceptors are not the same.
With the repository interceptor we were able to add a property to the
document in postStoreContent(...). In Slide we had to do this in
preStoreContent(...).


IMHO, makes more sense to add metadata in pre-saving than in
post-saving. It's way more efficient for clustered environments.

I don't care what's better. I just thought that two technologies used heavily in Cocoon having different semantics for the same concept was confusing.

Apart from that the Slide interceptors work very well, but (in the
version of Slide we used) they get called a lot. A single store of a
document invoked preStoreContent(...) and postStoreContent(...) multiple
times.


well, this is a bug then. there should be a way to connect to an atomic
event for a content store... you might want to bring this up on slide-dev

OK. I will look into this (making sure we don't add the same interceptor multiple times).

OSWorkflow performed great too. The only disadvantage was the complexity
of state machines that can be expressed. As you can see in the attached
diagram nested states are used. OSWorkflow does not support these.

The more I hear about workflows, the more I think that writing them with flow and continuations makes more sense than writing a finite state machine.

I don't like procedural code to handle complex state. You wind up with a lot of if-statements and it is difficult to determine what happens when a particular action gets invoked. A state machine has a lot of context: I am in state X, so all operations on this state and its parent states are valid. A state machine also hides a lot of implementation details. No need to check what the value of the current-state variable is.
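
To illustrate what I mean by "operations on this state and its parent states are valid", here is a minimal sketch of nested states. It is not OSWorkflow code (OSWorkflow has no nested states), and the NestedState class and the state and action names used with it are hypothetical.

```java
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.Set;

// Hypothetical class; it only models which actions a nested state allows.
final class NestedState {
    final String name;
    final NestedState parent;      // null for a top-level state
    final Set<String> ownActions;  // actions declared directly on this state

    NestedState(String name, NestedState parent, String... actions) {
        this.name = name;
        this.parent = parent;
        this.ownActions = new LinkedHashSet<>(Arrays.asList(actions));
    }

    // Valid actions are those of this state plus those of all its ancestors.
    Set<String> validActions() {
        Set<String> result = new LinkedHashSet<>();
        for (NestedState s = this; s != null; s = s.parent) {
            result.addAll(s.ownActions);
        }
        return result;
    }
}
```

With a parent state declared as new NestedState("exists", null, "edit", "delete") and a child new NestedState("editing", exists, "request-review"), the child allows all three actions while "edit" and "delete" are written down only once, on the parent.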

Although the attached workflow does not contain parallel states, we
think it might be needed for some document types. A newsletter for
example follows the same workflow as the attached one. But parallel to
this is a mailing workflow for sending it to the newsletter subscribers.

In the mailing workflow the user can send a test email of the current
version to himself. When he is satisfied he can send the final version
to the newsletter subscribers. After this, he can neither send a test
email nor send it to the subscribers.

But what to do if a mistake in the newsletter is found after sending it
to the subscribers? The subscribers won't be happy to receive another
copy, so the mailing actions should stay blocked. But not correcting the
newsletter on the website looks sloppy. Therefore the
editing/reviewing/publishing workflow has to remain active.

this screams for long-lasting continuations!

How would you handle parallel states using continuations? If you want a unique continuation point for each possible combination of states, the number of continuation points will explode.
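
A small sketch of the newsletter example with two parallel regions shows the difference. All names are hypothetical, the publishing region is reduced to a single edit action, and the TEST_SENT state is an assumption (the description does not say whether a test email must precede the final mailing; the sketch does not require it).

```java
// Hypothetical names; the states are read off the newsletter example.
enum PublishingState { EDITING, WAITING_FOR_REVIEW, PUBLISHED, DISAPPROVED }

enum MailingState { NOT_SENT, TEST_SENT, SENT }

final class NewsletterWorkflow {
    private PublishingState publishing = PublishingState.EDITING;
    private MailingState mailing = MailingState.NOT_SENT;

    PublishingState publishing() { return publishing; }
    MailingState mailing() { return mailing; }

    // Mailing region: both mailing actions are blocked once the final version is out.
    void sendTest() {
        requireNotSent();
        mailing = MailingState.TEST_SENT;
    }

    void sendToSubscribers() {
        requireNotSent();
        mailing = MailingState.SENT;
    }

    // Publishing region: editing never looks at the mailing state.
    void edit() {
        publishing = PublishingState.EDITING;
    }

    private void requireNotSent() {
        if (mailing == MailingState.SENT) {
            throw new IllegalStateException("newsletter has already been sent");
        }
    }

    // Number of combined states a single flat machine would have to enumerate.
    static int flatStateCount() {
        return PublishingState.values().length * MailingState.values().length;
    }
}
```

The two regions declare 4 + 3 = 7 states between them, while a flat machine (or one continuation point per combination) needs 4 * 3 = 12, and the gap grows with every region that is added.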

Workflow requirements
---------------------
Building an effective and solid workflow solution requires two
preconditions. Both are outside the scope of the workflow framework:
- understandable role assignment (from a user's perspective) and simple
role retrieval;
- typed document repository. This is necessary to enable different
document types having different workflows attached to them.

granted.

When these two preconditions are met, the workflow framework must meet
the following basic requirements:
- the ability to specify under what conditions an action can be invoked.
Authorization is considered a specific type of condition;

I think you mean "authentication" not "authorization".

No, I mean authorization. I assume the user is already authenticated. The conditions for actions (can) check that a person is authorized to invoke the action.

- the ability to retrieve the actions which can be invoked by a
particular user at this moment;

yes

- the ability to query the workflow store for objects which are in a
specific state and are relevant to the current user.


I don't think the workflow engine should have anything to do with the
objects. As for basic SoC, the workflow engine should be a separate
entity, a rule engine.

The objects here are the workflow instances, but information from the objects they are related to is also necessary during queries (for example to check whether the person has a specific role for an object).

When you query the objects you combine information from two sources: the repository and the workflow store. This can be done in a couple of ways:
- the workflow store has access to/contains information from the repository (special functions plugged in to the workflow);
- the repository has access to/contains information from the workflow store (WebDAV properties which are set when a state change occurs);
- a facade with its own query interface.
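
The facade variant could look roughly like the sketch below. It is only meant to show the join between the two sources: PendingTaskFacade is a hypothetical class, and the two maps stand in for the workflow store and the repository rather than for any real OSWorkflow or Slide API.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical facade; the two maps stand in for the workflow store and the repository.
final class PendingTaskFacade {
    private final Map<String, String> stateByDocument;        // workflow store: document -> state
    private final Map<String, Set<String>> editorsByDocument; // repository: document -> users with the editor role

    PendingTaskFacade(Map<String, String> stateByDocument,
                      Map<String, Set<String>> editorsByDocument) {
        this.stateByDocument = stateByDocument;
        this.editorsByDocument = editorsByDocument;
    }

    // Documents waiting for review for which 'user' has the editor role.
    List<String> documentsWaitingForReview(String user) {
        List<String> result = new ArrayList<>();
        for (Map.Entry<String, String> entry : stateByDocument.entrySet()) {
            boolean waiting = "waiting-for-review".equals(entry.getValue());
            Set<String> editors = editorsByDocument.get(entry.getKey());
            if (waiting && editors != null && editors.contains(user)) {
                result.add(entry.getKey());
            }
        }
        Collections.sort(result); // stable order for display
        return result;
    }
}
```

The state comes from one source and the role check from the other; neither the workflow store nor the repository needs to know about the other.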

The requirements on the main function of a workflow framework,
state-machine evaluation, depend on the complexity of the use cases
which need to be implemented.

Although we implemented the workflow in our demo using OSWorkflow we
were not completely satisfied. Some actions, edit and delete for
example, should be available in more than one state. We had to promote
these actions to global actions, and add conditions to these actions to
check whether the workflow is in a valid state. Because of this creative
coding the logic of the workflow moved out of its context and difficult
to read from the workflow configuration.

In our opinion a workflow framework supporting almost all constructs from
UML state machines is needed to be able to build powerful, and still
easy-to-understand solutions.


It would be killer to have a UML drawing tool that would generate
javascript with continuations!

Generating code would be great, but I think it is possible to have XML documents which clearly describe complex state machines. Then no code generation is needed, just a state-machine engine.

Where to go next
----------------
Our next goals are to make sure the two preconditions are met.
Concurrently we will be creating workflow use-cases to determine how
complex state machines need to be to implement these use cases.

We are very interested in hearing about use cases and workflow-framework
experiences from other people. We will update the existing page about
workflow on the Cocoon Wiki
(http://wiki.cocoondev.org/Wiki.jsp?page=Workflow) with our experiences.


I have no direct personal experience but I've been thinking about flow
and workflow a lot and I think the only difference is that flow is acted
by the same person while workflow is acted by more than one person.

If FSMs work badly for flow, why would they work any better for workflow?

I must confess I am a state machine junkie, so my opinion is a bit biased towards them. For me the main advantage would be that a state machine (whether in diagram form or in XML form) would be easy to explain to other people.

--
Johan Stuyts
