SV: Flowmaps: the wrong approach

Daniel Fagerstrom Mon, 03 Dec 2001 23:27:22 -0800

Stefano Mazzocchi wrote:
> I've taken the weekend to learn Scheme, now I know what we are talking
> about :)
>
> Daniel Fagerstrom wrote:
> >
> > Ovidiu Predescu wrote:
> > <snip/>
> > > I now believe we should have a system centered around logic, not
> > > around states and transitions.
> > I agree completely. IMHO, writing FSMs is like goto-programming: very
> > small systems are easy to understand, but as soon as they grow, they
> > easily become a maintenance nightmare.
>
> This is the old tune against GOTOs and I totally agree on that.
>
> On the other hand, I disagree that FSM equals goto-programming (in fact,
> you are describing FSM down below, using the XML syntax :)
Do you think so? In an FSM I can go from any state to any other state, but in
the examples below I cannot, for example, go from a stage inside a loop to a
stage before the loop. Furthermore, if we use a stack (which is only needed
if we allow general recursion or separately compiled flowmaps), that gives
more "power" than an FSM has. Or am I missing something?


> > > The logic should be expressed in a
> > > language that supports continuations. The logic should drive what
> > > pages are presented to the user. These pages could be expressed in an
> > > XML markup language with appropriate elements for extracting data
> > > previously created in the program. These XML pages could be then
> > > processed through a pipeline, similarly with how they are processed
> > > today in the sitemap. However since incoming URLs are handled directly
> > > by the logic, there's no need for matchers in the sitemap.
> > >
> > >
> > >  HTTP request                         transformations
> > > --------------> logic -----> XML page -----------------> HTML/WML/... page
> > >
> > >
> > > The generated pages contain URLs that point back to continuation
> > > points in the logic.
> > >
> > > The biggest problem is the fact that the logic needs to be expressed
> > > in language that supports continuations. Since most people don't like
> > > the Lisp syntax, a language that abstracts out the continuations in
> > > higher level abstractions like send-response could be developed. This
> > > can probably be done by extending a familiar language, like
> > > Javascript, with these concepts.
> > <snip/>
> > > This is a huge paradigm shift from what we have right now, but I
> > > believe leads to easier ways to write Web applications. They become
> > > more like usual programs, instead of the complex beasts that we have
> > > today, with state scattered all over the place in the code.
> > >
> > > And yes, please read the papers I pointed to in my previous email, to
> > > understand what the heck I'm talking about. Here they are for your
> > > convenience:
> > >
> > >    http://youpou.lip6.fr/queinnec/Papers/webcont.ps.gz
> > >    http://www.cs.rice.edu/CS/PLT/Publications/esop2001-gkvf.ps.gz
> >
> > I happen to like Lisp as well as its syntax ;)
>
> Gosh, can't say the same, but it's not important at the moment.
>
> > Still I wonder if it
> > would not be possible to continue in the great cocoon tradition of
> > SoC, and find a convenient description of webapp flow, without going
> > all the way to a full high level programming language.
>
> Bingo! that's the point. The paper wants to "get control back" since
> page-directed programming stole the control from the programmer.
>
> The paper is right on many things but forgets about taking SoC into
> consideration.
>
> Turning a sitemap into a logic-oriented description would be equally bad
> since web programming is a mixed form of "declarative" (page-oriented)
> and "procedural" (logic-driven).
>
> My personal opinion is that we should have both at the same time in
> order to keep Cocoon elegance.
>
> > Now that you
> > have succeeded in exorcizing the mix of programming language
> > constructs and tags from JSP etc, it seem like a pity to let this mix
> > in again.
>
> Exactly!
>
> > Anyhow, inspired by the interesting discussion and the articles that
> > you referred to, I started to think about how to use these concepts in
> > cocoon. If possible, without having to do a "huge paradigm shift".
>
> Same here!
>
> > --------------------------------------------------------------------
> > Flowmap
> > -------
> >
> > First, to make it more concrete, I will try to express the main
> > example from http://youpou.lip6.fr/queinnec/Papers/webcont.ps.gz in
> > terms of xslt, cocoon components, a sitemap and a flowmap. For those
> > of you who have not read that article yet, the main example is a
> > small webapp:
> >
> > 1. On the first page it asks for the conversion rate between French
> > Francs and another currency.
> > 2. Then it asks for an amount of Francs.
> > 3. And on the third page it returns the result.
> >
> > One of the coolest things about the implementation in the article is
> > that it can take care of multiple questions at once. If you browse
> > through the three steps above and then click on the "new window"
> > button in your browser, you can then go back to the first or second
> > screen and fill in new data without affecting the result in the
> > other browser (even if you use the refresh button in it). This
> > behavior is very useful for "what if" kind of questions, where one can
> > evaluate several alternative scenarios in a convenient way.
>
> Yes, this is the really cool thing about their thesis of binding resources
> to program continuations.
>
> > Ok, here we go!
> > We start with a high level description of the application flow:
> >
> > <!-- flowMap.xml -->
> > <fm:flowMap xmlns:fm="...">
> >   <fm:flow url="conversion">
> >     <fm:until test="/in/exchange/rate &gt; 0" id="rateTest">
> >       <fm:show src="cocoon://readRate.html" id="rate"/>
> >     </fm:until>
> >     <fm:show src="cocoon://readFrancs.html" id="francs"/>
> >     <fm:show src="cocoon://result.html" id="result"/>
> >   </fm:flow>
> > </fm:flowMap>
> >
> > (The "id" attributes are not necessary and are only used for making
> > reference easier) The flow map is either a part of the sitemap or
> > mounted from it. It will be executed by a "flowmap engine" on a
> > request for "cocoon://conversion". The children of "fm:flow" are
> > executed in sequence. Each child works as a pipeline. The flowmap
> > engine feeds the pipeline with an xml-document, that has "in" as root
> > element. This document contains two parts, a continuation, that is an
> > url to the next stage (or stages) in the flowmap, and description of
> > the current state.
> >
> > The input to the first stage, "conversion#rateTest" could look like
> > this:
> >
> > <in/>
> >
> > Now, the first stage is an until-statement (a mistake from a
> > pedagogical point of view, I realize :) ), the test - an XPath
> > expression, will obviously not succeed on the current input
> > data. Therefore the body of the until-statement is executed. But
> > before we can do that we have to set the continuation.
>
> I like very much what I see and I think you are onto something,
Thank you:)

> but
> there is a problem: the iterative step will fail to provide feedback on
> the error.
>
> We *must* take into consideration try/fail by providing the ability to
> update the form page if some data inserting error is made. This is vital
> for webapp usability.
Yes, I agree completely, error handling is _very_ important. I have been
thinking a lot about the handling of data insertion errors, and was about to
post an RT on it. I had not, however, thought about it from a "flowmap
perspective" before, so I am still not clear about how I would like to
integrate it.

Anyway, here are the main lines:

* The webapp gets input from form fields, an upload or a SOAP-like call
(are there more possibilities?). The input data consists of strings in
parameters, a document containing some (hopefully) structured data (e.g.
tab separated numbers) that we have decided to accept as input, or, if we
are really lucky, XML data.

* This data should, sooner rather than later, be transformed to XML. After
all, we are in the _XML_-webapp business :) This seems to be an obvious task
for a generator. (In this context I actually prefer the term deserializer,
or something similar, as you mentioned, although did not advocate, in your
original post in the "Data goes in data goes out" thread).

* Now that the input data is in xml form we could, or IMHO should, have an
XML-schema (or your favorite schema language) to validate the input data
against. We don't want to put data in the wrong format in our database or
in our java programs, do we?

* So what is the result of the validation? There seem to be three main
cases:
+ The input is valid - let it flow to the next step in the pipe.
+ The input has invalid structure - this means that we have a fatal input
error, one that is hard to recover from or to report anything sensible
about. If we have designed the client side, a structural error means that
we have a bug in our system or that someone is trying to post data without
using our client. In both cases we cannot do much more than log what is
relevant and report a fatal error to the user. If we offer a webservice
or allow uploads we probably have to work a little bit harder on our
feedback to the user.
+ The input has valid structure but invalid data types in the text fields or
in the attributes. This is the case you asked about above. This case is more
complicated: we have to give the user detailed feedback on what is wrong and
a possibility to update the faulty data fields. There are two possible ways
for the validator to inform the rest of the system about what went wrong.
One is a list of (location path, error message) pairs. This can describe all
kinds of field errors, but it is not obvious to me how to make use of the
information. The other is to allow user input only within elements and not
in attributes; in that case the input xml can be annotated with error
attributes on the faulty elements.

I think that the validator should be a transformer: it takes xml as input
and, except for fatal errors, emits xml. It could be a part of the
generators, but that probably leads to an overly complex design of the
generators. Xerces2 is actually built as a pipeline with pluggable components
(not Avalon components, however), where the pipeline can consist of
components like a scanner, a DTD validator, an XML-Schema validator etc., and
where the pipeline components communicate with XNI events, which are like
SAX events but somewhat more low-level. After having browsed the relevant
parts of the Xerces2 source code I believe that it should be possible to
reuse some of the components to build an "error annotating" validator, but I
am not completely certain yet.

* More complicated validation that checks e.g. dependencies between fields
could be based on the "bind" elements from XForms and be put in another
transformer.
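
To make the three validation cases a bit more concrete, here is a minimal
sketch. It is not Cocoon or Xerces2 code: Python is used only for brevity,
and the names `EXPECTED`, `build_instance`, `validate` and the `error`
attribute are all made up for illustration. The "schema" is just a table of
allowed location paths and data types.

```python
import xml.etree.ElementTree as ET

# Hypothetical stand-in for a schema: the allowed location paths and a
# parser that raises ValueError when the text has the wrong data type.
EXPECTED = {
    "/exchange/rate": float,
    "/exchange/currency": str,
}


def build_instance(params):
    """Turn path-named parameters ("/exchange/rate") into an XML tree."""
    root = None
    for path, value in params.items():
        steps = path.strip("/").split("/")
        if root is None:
            root = ET.Element(steps[0])
        elif root.tag != steps[0]:
            raise ValueError("parameters do not share one root element")
        node = root
        for step in steps[1:]:
            child = node.find(step)
            if child is None:
                child = ET.SubElement(node, step)
            node = child
        node.text = value
    return root


def validate(params):
    """Return ("valid" | "field-errors" | "fatal", tree or None)."""
    # Structural check: exactly the expected paths must be present.
    if set(params) != set(EXPECTED):
        return "fatal", None
    tree = build_instance(params)
    status = "valid"
    for path, parse in EXPECTED.items():
        # ElementTree paths are relative to the root element.
        relative = "/".join(path.strip("/").split("/")[1:])
        node = tree.find(relative)
        try:
            parse(node.text)
        except ValueError:
            # Annotate the faulty element instead of rejecting the document.
            node.set("error", "expected " + parse.__name__)
            status = "field-errors"
    return status, tree
```

With the parameters from the conversion example the result is "valid"; a
rate of "abc" gives "field-errors" and a `rate` element carrying an `error`
attribute; a missing parameter is treated as a structural (fatal) error.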

So, now comes the crux: the steps so far seem to be quite naturally
described in terms of a pipeline. But now we have to make a choice about
where to pipe the results: if the validation succeeded, the results should be
piped to the "DoIt"-transformer, and if we got field errors, the results
should be piped to a "partly filled in form with error messages"-transformer.
This can definitely not be done in the same pipeline.

AFAIK (but I may have missed something important), if you want to build
something like what I outlined above in Cocoon today, you have to use a
number of actions instead of generators and transformers. That obscures the
pipeline aspect of serialization and validation. As an example, the
"StreamGenerator" would be useful as a part of an XForms handling pipe,
but if I decide to validate, I would have to build a "StreamAction" that
places its result in the model or in the session in some place that the
"ValidatorAction" has to know about.

So what I would like to have instead (and I apologize if I have missed
important aspects of what you can do with what is currently available in
Cocoon) is something like the following:

* An "input pipeline", like the one I described above, that is required to be
side effect free and only dependent on input. The output of the
input pipeline is stored in a data structure (a DOM-tree I guess). Ok, that
hurts, but I cannot see any other choice: we can only know the result of a
validation after having seen all the data, and until then we have to store it
somewhere.

* Now we can have flow control (a selector maybe?) based on the result of
the validation, and also on things like XPath expressions applied to the
input and to the global state of the system.

* Based on the selection in the flow control, the input structure is
unpacked to SAX-events again in the chosen pipeline. And for this step we
have a "real" pipeline again that is free to perform any side effects it
wants to.
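
The three steps above could be sketched roughly as follows. This is only an
illustration of the idea, not an existing Cocoon interface: the function
names, the pipeline names "do-it" and "form-with-errors", and the `error`
attribute convention are assumptions of mine, and Python stands in for the
real components.

```python
import xml.etree.ElementTree as ET


def input_pipeline(raw_xml):
    """Side-effect-free stage: parse the input and buffer it as a tree."""
    return ET.fromstring(raw_xml)


def select_pipeline(tree):
    """Flow control: inspect the buffered tree and name the next pipeline."""
    has_errors = any("error" in elem.attrib for elem in tree.iter())
    return "form-with-errors" if has_errors else "do-it"


def walk(elem):
    """Unpack the buffered tree into SAX-like (event, tag) pairs."""
    yield "start", elem.tag
    for child in elem:
        yield from walk(child)
    yield "end", elem.tag


def run(raw_xml, pipelines):
    """Buffer the input, choose a pipeline, then replay the events into it."""
    tree = input_pipeline(raw_xml)
    name = select_pipeline(tree)
    # Only the chosen pipeline runs, and only it may have side effects.
    return name, pipelines[name](walk(tree))
```

Here `pipelines` is a plain dictionary from pipeline name to a callable that
consumes the replayed events; in a real system the selection could of course
also look at XPath expressions and at global state.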

I think that the outlined concepts should integrate well with a continuation
based flowmap engine, but I am getting too tired to be able to explain how.

Ok, I seem to have written my RT anyway, it was not my intention :)

>
> > The next stage
> > after "conversion#rate" is "conversion#rateTest". We represent this
> > situation by creating the new input:
> >
> > <in>
> >   <flow>
> >     <next>conversion?next=rateTest-23454</next>
> >   </flow>
> > </in>
> >
> > Here the url "conversion?next=rateTest-23454" consists of two parts, one
> > that identifies the next stage to go to in the flowmap and one "23454"
> > that uniquely identifies the current state, which so far happens to
> > be empty. The current state is stored in a hash table with the url as
> > a key.
> >
> > We need an implementation of "conversion#rate":
> >
> > <!-- readRate.xsl -->
> > <html xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
> >   <head>
> >     <title>Conversion</title>
> >   </head>
> >   <body>
> >     <h1>Conversion from Francs</h1>
> >     <form action="{/in/flow/next}" method="post">
> >       <label for="/exchange/rate">rate</label>
> >       <input name="/exchange/rate" type="text" size="10"/>
> >
> >       <label for="/exchange/currency">currency</label>
> >       <input name="/exchange/currency" type="text" size="10"/>
> >
> >       <input type="submit" name="submit" value="Continue"/>
> >     </form>
> >   </body>
> > </html>
> >
> > The main things to notice here are that the form/@action will be
> > replaced with the current continuation from the input, in our case
> > "conversion?next=rateTest-23454", and that the names in the form
> > describe positions in an output xml-document (this idea is taken from
> > the XForms draft).
>
> Hmmm, as a personal taste, I'd rather pass the continuation hashcode as
> a hidden parameter of the form, so that it doesn't "pollute" the URI. Of
> course, we can't let the user take care of this so we must come out with
> something for this.
Agreed. I thought that there was a need for one hashcode for each
continuation on a page with multiple links, but they are already
distinguished by their URIs, and they are even connected to the same state.

To clarify:
* For stateless pages, we don't need any hashcode.
* For non-copyable continuations, the hashcode will, AFAIK, correspond to
the session id, so we can let the session handling system take care of it
instead.
* For copyable continuations we need a new hashcode for each page.
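
A small sketch of how the three cases might differ in a state table. The
class and method names are hypothetical and Python is used only as a compact
notation; it just illustrates which key each kind of continuation would get.

```python
import copy
import secrets


class ContinuationStore:
    """Maps a hash code to the state saved for a continuation."""

    def __init__(self):
        self._states = {}

    def key_for(self, kind, state=None, session_id=None):
        if kind == "stateless":
            # Nothing has to be looked up later, so no key is needed.
            return None
        if kind == "non-copyable":
            # One state per session: a new page overwrites the old state.
            self._states[session_id] = state
            return session_id
        if kind == "copyable":
            # Fresh key and private copy of the state for every page.
            key = secrets.token_hex(8)
            self._states[key] = copy.deepcopy(state)
            return key
        raise ValueError("unknown continuation kind: " + kind)

    def resume(self, key):
        """Return the state stored under the given key."""
        return self._states[key]
```

For the copyable case every page gets its own key and its own copy of the
state, which is what makes the "what if" behavior possible and also what
makes it expensive.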

> What about using XForms directly and provide our own transformations to
> HTML forms that take care of everything? (they could even add
> client-side javascript validation code)
Yes, absolutely, I have written such a system together with a colleague. It
is based on a small subset of an early draft of XForms. From the
XForms-document we create an XSLT-document (actually by using another
XSLT-document :) ) that generates a populated HTML form when it gets
XML-instance data as input. It also uses the kind of error
attributes that I mentioned in my "embedded RT" above.

I hope that I will find time to refine and update our things so that I can
submit them.

>
> > We also need a sitemap fragment to see how readRate.xsl is supposed to
> > be called:
> >
> > <!-- sitemap.xmap -->
> > <map:sitemap xmlns:map="http://apache.org/cocoon/sitemap/1.0">
> >   <map:pipelines>
> >
> >     <map:pipeline>
> >       <map:match pattern="**.html">
> >         <map:generate type="flowMapGenerator"/>
> >         <map:transform src="{1}.xsl"/>
> >         <map:serialize/>
> >       </map:match>
> >     </map:pipeline>
> >
> >   </map:pipelines>
> > </map:sitemap>
>
> The more I think about it, the more I get the perception that instead
> of coming up with something new, we should enhance the sitemap
> semantics to consider flows.
>
> But it's something I still can't picture :/
>
> > Here the "flowMapGenerator" feeds the current input to
> > e.g. readRate.xsl.
> >
> > More interesting things happen when the user has filled in the form
> > and hits the submit button. This will create a request for
> > "conversion?next=rateTest-23454", and the flowmap-engine will respond
> > in the following manner:
> >
> > 1. Read the request parameters, in our case they might be:
> > /exchange/rate=1.4551&/exchange/currency=SEK.
> >
> > 2. Create an XML-document from the request parameters:
> > <exchange>
> >   <rate>1.4551</rate>
> >   <currency>SEK</currency>
> > </exchange>
> >
> > 3. Resume the state that is associated with the
> >    url, from the hashtable. It happens to be empty at this moment.
> >
> > 4. Combine the restored state with the current input. This can and
> >    needs to be done in many different ways, but for our current
> >    example, an insert/replace operation, is enough, and results in:
> > <in>
> >   <exchange>
> >     <rate>1.4551</rate>
> >     <currency>SEK</currency>
> >   </exchange>
> > </in>
> >
> > 5. And this is the new input to "conversion#rateTest", this time the
> >    test will succeed, and as a result, the flowmap engine continues to
> >    the next stage "conversion#francs", and sets the continuation to
> >    the stage after:
> > <in>
> >   <exchange>
> >     <rate>1.4551</rate>
> >     <currency>SEK</currency>
> >   </exchange>
> >   <flow>
> >     <next>conversion?next=result-54328</next>
> >   </flow>
> > </in>
> >
> > So, here I will stop boring you with all the details. The last two
> > pages look as follows:
> >
> > <!-- readFrancs.xsl -->
> > <html xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
> >   <head>
> >     <title>How many Francs?</title>
> >   </head>
> >   <body>
> >     <h1>Converting into <xsl:value-of
> > select="/in/exchange/currency"/></h1>
> >     <form action="{/in/flow/next}" method="post">
> >       <label for="/FRF">Francs</label>
> >       <input name="/FRF" type="text" size="10"/>
> >
> >       <input type="submit" name="submit" value="Continue"/>
> >     </form>
> >   </body>
> > </html>
> >
> > <!-- result.xsl -->
> > <html xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
> >   <head>
> >     <title>Conversion result</title>
> >   </head>
> >   <body>
> >     <h1>Conversion result</h1>
> >     <p>
> >       <xsl:value-of
> >         select="concat('If 1 FRF corresponds to ',/in/exchange/rate,' ',
> >                        /in/exchange/currency,' then ',/in/FRF,
> >                        ' FRF correspond to ',/in/exchange/rate * /in/FRF,
> >                        ' ',/in/exchange/currency,'.')"/>
> >     </p>
> >   </body>
> > </html>
> >
> > result.xsl will be called with input like this:
> >
> > <in>
> >   <exchange>
> >     <rate>1.4551</rate>
> >     <currency>SEK</currency>
> >   </exchange>
> >   <FRF>100</FRF>
> >   <flow/>
> > </in>
>
> I see value in what you explain, but the use of XSLT as a variable
> expansion language is, IMO, a little bit overkill since no
> transformation is taking place.
>
> What do you think about Velocity instead?
Yes, Velocity is probably better for this kind of thing, at least if it can
use XPaths (I have only browsed the documentation so I don't know much
about it). For my own part, I happen to be an XSLT fanatic and use it for
all kinds of stuff where other languages might be a better choice ;)


>
> > ------------------------------------------------------------------------
> > More constructions
> > ------------------
> >
> > There is certainly a need for more language constructions to make the
> > flowmap usable. Some examples:
> >
> > <fm:if test="XPath">
> >   Do something
> > </fm:if>
> >
> > A possibility to have several possible continuations:
> >
> > <fm:switch>
> >   <fm:case link="l1" id="i1">
> >     Do something
> >   </fm:case>
> >   <fm:case link="l2" id="i2">
> >     Do something
> >   </fm:case>
> >   ...
> >   <fm:case link="ln" id="in">
> >     Do something
> >   </fm:case>
> > </fm:switch>
> >
> > The switch statement will give the preceding statement the input:
> > <in>
> >   <flow>
> >     <next>flow?l1=i1-76456</next>
> >   </flow>
> >   <flow>
> >     <next>flow?l2=i2-09877</next>
> >   </flow>
> >   ...
> >   <flow>
> >     <next>flow?ln=in-65433</next>
> >   </flow>
> > </in>
> >
> > Maybe the switch statement should be nestable with if statements, to
> > make it possible to describe that some of the links are only
> > available if certain conditions are fulfilled. An important example is
> > to only show the links that one is allowed to traverse.
> >
> > It is useful to call other flows:
> > <fm:call src="cocoon://flow1"/>
> >
> > To make flow calls possible one needs to store a stack of
> > continuations from the calling flows in the state.
> >
> > Some kind of try, catch statement would probably simplify error
> > handling.
>
> I'm not that sure.
>
> > -------------------------------------------------------------------
> > State handling
> > ---------------
> >
> > The state handling described above is too primitive for many
> > situations. It allows for the "what if"-scenarios mentioned in the
> > beginning, (I guess that is far from obvious from what I have said,
> > but the images in the beginning of the referred article explains the
> > situation quite well). This flexibility comes at a high cost: each
> > continuation that is created is associated with its own copy of the
> > state. As long as the state is read-only, all the copies can have
> > references to common parts, and thus avoid most of the copying;
> > still, the approach requires a lot of resources. Another problem is
> > garbage handling: when should an unaccessed continuation be taken
> > away? (some ideas can be found in the referred articles).
> >
> > In situations where one updates a data source with a large state, a
> > database, for example, a "many world"-behavior is not desirable at
> > all. It would mean that the system has to handle several copies of
> > the database, or that the database must be able to take care of
> > multiple branches of the stored information.
> >
> > This situation can be handled by restricting the creation of new
> > continuations, so that one copy of a continuation is allowed for each
> > stage in the flow.
>
> Yes. Even if it is cool to use continuations to avoid the need to
> check for back and cloning, I see very little value in letting the user
> clone the window without having finished the previous flow.
>
> I see no problem in forbidding this by restricting the creation of a
> single continuation.
I got carried away by the article. Cloned pages are definitely not the main
use case for flowmaps, although we actually use them (but based on other
methods) in a data warehouse application at the company that I work for.

>
> Anyway, very good food for thought, indeed.
Thanks for your insightful comments.

/Daniel


