[RT] Forms and wizards (was: RE: HEADS UP - cocoon form handling (long!!))

Daniel Fagerstrom Sun, 14 Apr 2002 14:38:30 -0700

Berin Loritsch wrote:
> Ivelin Ivanov wrote:
<snip/>
> > Would you sketch an example of a non-trivial app which does not use
> > JavaBeans?
<snip/>
> I know I am side-stepping your question, however I have not run into
> a situation where I needed beans at all.  So your assumption that every
> non-trivial application needs beans is not valid.
<snip/>
> What you invariably run into in the Cocoon world if you use beans is
> a double mapping: DB to bean, and bean to schema.  That approach does
> not scale well at all.  Not to mention there are unnecessary
> conversions that can be the source of problems.  KISS (Keep It
> Simple Stupid).
>
> Keep your memory lean.  Beans don't let that happen--or they force
> you to be too smart for your own good.


I find Ivelin's and Torsten's work on form handling very promising, but I
share the concern of Berin (and some other commenters) that they may be
trying to find answers to some unnecessarily complicated problems.

I'll discuss form handling and multipage form handling (wizards), and try to
give some proposals on how to _not_ solve some of the open questions and
issues discussed in
http://marc.theaimsgroup.com/?l=xml-cocoon-dev&m=101834333817122&w=2. While
at it, I'll also share some provocative thoughts about continuations and
show some heavy use of the hammer anti-pattern ;)

One page forms
--------------

We start with a simple one page form. Say that we want to collect some data
about a user of our system. Typically we want the form to be partially
filled in with some default data or something that the user has already
filled in. So we start with something like:

<map:match pattern="userForm.html">
  <map:generate src="userDefault.xml"/>
  <map:transform src="userForm.xsl"/>
  <map:serialize/>
</map:match>

Here the input data is xml (I guess I don't have to argue about why that is
a good idea ;) ), it might be directly from a file, from a db, from
JavaBeans or whatever you might prefer. "userForm.xsl" is a simple
stylesheet that creates a partially filled in html form from its input, (you
can use Velocity or some other template generator instead if you like). To
make the connection between the xml input data and the form field names
simple we use xpaths as field names as is done in XForms (Ivelin, Torsten
and Konstantin do the same). The form stylesheet will have fields like this:

<input type="text" name="/user/name" value="{/user/name}"/>
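
What such a form stylesheet does can be sketched outside xslt as well. The
following Python is only an illustration under my own assumptions: the
function name form_fields is made up, every leaf element becomes a text
field, and repeated sibling elements (which would need positional xpaths)
are not handled.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import quoteattr


def form_fields(root):
    """Return one html text input per leaf element, named by its path."""
    fields = []

    def walk(node, path):
        children = list(node)
        if not children:
            # Leaf element: the path is the field name, the text the value.
            fields.append('<input type="text" name=%s value=%s/>'
                          % (quoteattr(path), quoteattr(node.text or "")))
        for child in children:
            walk(child, path + "/" + child.tag)

    walk(root, "/" + root.tag)
    return fields


default = ET.fromstring("<user><name>Daniel</name><city/></user>")
for field in form_fields(default):
    print(field)
```

The point is only that the field name is derived mechanically from the
position in the input document, which is what makes the posted data easy to
turn back into xml later.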

As these form stylesheets are rather boring to write, we let our computer do
the work:

<map:match pattern="*Form.xsl">
  <map:generate src="{1}Form.xform"/>
  <map:transform src="xForm2xsl.xsl"/>
  <map:serialize type="xml"/>
</map:match>

We can describe our form in terms of some appropriate subset of the form
control part of XForms or maybe in some home brewed form description
language. "xForm2xsl.xsl" is a stylesheet that takes an XForms description
as input and creates another stylesheet that in turn takes the default data
as input and creates an html form. Konstantin submitted something similar a
while ago:
http://marc.theaimsgroup.com/?l=xml-cocoon-dev&m=101429833513052&w=2
He stays closer to the XForms standard, which contains the instance as a
part of the xforms document. IMHO the XForms standard mixes concerns by
doing so. Anyway, by following the XForms way of doing things Konstantin
doesn't have to write xslt generating xslt, but instead has to use some
extension functions (evaluate).

Ok, now we can use our automatically generated form stylesheet:

<map:match pattern="userForm.html">
  <map:generate src="userDefault.xml"/>
  <map:transform src="cocoon:/userForm.xsl?continue=userPost.html"/>
  <map:serialize/>
</map:match>

We also use a "poor man's continuation" trick: we send the url that the
form is supposed to post to as a request parameter to the form stylesheet
generator. This makes the page flow explicit in the sitemap, as it should be.

Time to take care of the data posted from the form:

Input handling
--------------

Even if we would prefer xml data in the post, it will probably take some
time before that is the usual case. By using xpaths as field names we have
at least an implicit description of an xml document. So what we need to do
is the following:
* Create an xml document from the request parameters
* Validate the input data
  Invalid -> Resend the partially filled in form with error messages
* Store or use the data
* Send a success page or the next form

This can look like:

<map:match pattern="userPost.html">
  <map:generate type="xrequest"/>
  <map:transform type="schematron" src="user.sch"/>
  <map:match type="pipe-state" test="fail">
    <map:transform src="cocoon:/userForm.xsl?continue=userPost.html"/>
    <map:serialize/>
  </map:match>
  <map:transform type="write" method="overwrite" src="dbxml:/user"/>
  <map:transform type="read" src="success.html"/>
  <map:serialize/>
</map:match>

* The "xrequest" generator builds an input xml document from the (xpath)
request-parameters.
* The "schematron" transformer validates the input data, sets a
"pipe-state" request attribute to "success" or "fail" and gives some kind of
report about the errors.
* The "pipe-state" matcher is pipe state aware (cf. the recent discussions
about pipe aware selectors), i.e. it depends on the state of the
"pipe-state" request attribute as it is _after_ the execution of the
validator.
* If the validation failed the input is piped to the form stylesheet and
sent back to the user. The form stylesheet also has rules for rendering the
error report.
* If the validation succeeds the data is sent to a store of some kind.
* A success report is generated.
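
The first step, building an xml document from request parameters with xpath
names, can be sketched like this. It is not the actual "xrequest" generator:
the function name params_to_xml is hypothetical, and I assume that the field
names are plain absolute paths (no predicates, no attributes) that all share
one root element.

```python
import xml.etree.ElementTree as ET


def params_to_xml(params):
    """Build one xml tree from parameters named like "/user/name"."""
    root = None
    for path, value in params.items():
        steps = [s for s in path.split("/") if s]
        if not steps:
            raise ValueError("empty path: %r" % path)
        if root is None:
            root = ET.Element(steps[0])
        elif root.tag != steps[0]:
            raise ValueError("all paths must share one root element")
        node = root
        for step in steps[1:]:
            # Reuse an existing child so that sibling fields share parents.
            child = node.find(step)
            if child is None:
                child = ET.SubElement(node, step)
            node = child
        node.text = value
    return root


posted = {"/user/name": "Daniel", "/user/address/city": "Stockholm"}
print(ET.tostring(params_to_xml(posted), encoding="unicode"))
```

The resulting tree is what would be fed into the validator and, on failure,
back into the form stylesheet.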

<design-issues>
In the above example collection of the data and storing of the data
(population of the instance) are separated, while I&T combine them. There
are so many possible storages for instance data, e.g. beans and dom in
session and request attributes, files, relational db:s, xml db:s, business
objects and so on, that it seems overwhelming to create a common "instance"
interface for them all. Better to just put the data in the pipe and let the
storage of it be someone else's concern.

I&T also discuss the indirect vs. direct population problem and propose to
use direct population of instances (cf. the link above for details). The
example above uses the indirect approach, but could easily be made direct by
giving a template instance as a "src" parameter. We have used designs like
the one above for nearly a year in the company I work for and have never had
any problems with indirect population. IMHO this is a concern for the
storage model.

I think the result of our recent "pipe-awareness" discussion is that the
success or failure of a validation transformer should be put in some "meta"
parameter, probably in a request attribute. We still need to find a good way
to report the details of the errors. One possibility is to have a separate
result document with pairs of xpaths and error messages (explaining what
was wrong at that position). Another is to annotate the document with error
attributes at the faulty elements, e.g.

<foo>
  ...
   <bar err:error="not a number">qwerty</bar>
</foo>

The first variant is more general and elegant, and the second is much
easier to use as input for a stylesheet. I prefer the second one :)
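
The two variants are not far apart: a report of (xpath, message) pairs can
be folded into the document as error attributes. A minimal sketch, where the
function name annotate and the namespace uri bound to "err" are placeholders
of mine, and where only plain absolute paths are supported:

```python
import xml.etree.ElementTree as ET

ERR_NS = "http://example.org/err"  # placeholder uri for the err: prefix
ET.register_namespace("err", ERR_NS)


def annotate(root, report):
    """Copy each (path, message) pair onto the element the path points at."""
    for path, message in report:
        steps = [s for s in path.split("/") if s]
        if not steps or steps[0] != root.tag:
            continue  # the path does not start at this document's root
        target = root if len(steps) == 1 else root.find("/".join(steps[1:]))
        if target is not None:
            target.set("{%s}error" % ERR_NS, message)
    return root


doc = ET.fromstring("<foo><bar>qwerty</bar></foo>")
annotate(doc, [("/foo/bar", "not a number")])
print(ET.tostring(doc, encoding="unicode"))
```

So a validator could emit the general report, and a small step like this one
could produce the annotated form that the stylesheet prefers.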

The "write" transformer is mainly a thought experiment; it is like tee in
unix. It does the same thing as the WritableSourceTransformer but the source
name is in the sitemap rather than in the input document. I placed the
"method" attribute in it to indicate that if we want to make things like xml
db:s writable sources, we have to find a way to describe what kind of update
method we want to use.

The "read" transformer doesn't care about its input and just puts the
content of its src attribute in the pipeline.

In many cases there is no already written source or writable source for our
storage and we have to work a little bit harder, e.g. by writing our own
mappings between xml and relational db:s or xml and java beans. There
are some tools that can create at least part of the mapping from some
schema, e.g. Castor and XML-DBMS http://www.rpbourret.com/xmldbms/.
</design-issues>

Wizards
-------

Here we define multi page forms (or wizards) as building one xml document
from several form pages.

I&T suggest that the structure of the instance and the views should be as
decoupled as possible:

>       [------------instance-----------]
>HTML: [-----view1-----][-----view2----]
>WML:  [--view1--][--view2--][--view3--]

and also that validation could be done at still other substructures that
are not necessarily connected to the views (cf. their paragraph about views
and phases).

IMO one can simplify the problem considerably by deciding that the views
always are non-overlapping sub trees of the instance, and that we have one
schema for each view. I know that this is not a completely generic solution,
but after all we probably did some modeling when we designed our instance.
The sub trees probably describe conceptually different areas, so hopefully
we can reuse this thinking by coupling the views to these different areas.
If on the other hand the instance is based on a lousy model, why not build a
new one for our wizard? Then we can always use xslt to transform our view
model to the instance model.

Time for some code:

<map:resource name="form">
  <map:transform src="cocoon:/{query}Form.xsl?continue={query}Post.html"/>
  <map:serialize/>
</map:resource>

<map:resource name="filled-form">
  <map:transform type="read" src="session:/{wizard}/{query}"/>
  <map:call name="form">
    <map:parameter name="query" value="{query}"/>
  </map:call>
</map:resource>

<map:resource name="store">
  <map:generate type="xrequest"/>
  <map:transform type="schematron" src="{query}.sch"/>
  <map:match type="pipe-state" test="fail">
    <map:call name="form">
      <map:parameter name="query" value="{query}"/>
    </map:call>
  </map:match>
  <map:transform type="write" src="session:/{wizard}/{query}"/>
</map:resource>

Here I mainly restate our earlier code in terms of parameterized resources,
so that I don't have to repeat myself all the time. The "wizard" parameter
is for the root element of the xml document, and "query" is for the sub
documents and views. We store everything in an xml document in a session
attribute while filling in the queries in the wizard session.

As an example we do a simple survey for cocoon users:

<map:match pattern="cocoonSurvey/**">
  <map:parameter name="wizard" value="cocoonSurvey"/>

  <map:match pattern="**/start.html">
    <map:act type="write">
      <map:parameter name="from" src="dbxml:/cocoonSurvey[@id='default']"/>
      <map:parameter name="to" src="session:/cocoonSurvey"/>
    </map:act>
    <map:call name="filled-form">
      <map:parameter name="query" value="personal"/>
    </map:call>
  </map:match>

  <map:match pattern="**/personalPost.html">
    <map:call name="store">
      <map:parameter name="query" value="personal"/>
    </map:call>
    <map:call name="filled-form">
      <map:parameter name="query" value="system"/>
    </map:call>
  </map:match>

  <map:match pattern="**/systemPost.html">
    <map:call name="store">
      <map:parameter name="query" value="system"/>
    </map:call>
    <map:select type="xpath">
      <map:when test="/system/platform[.='linux']">
        <map:call name="filled-form">
          <map:parameter name="query" value="linuxDetails"/>
        </map:call>
      </map:when>
      <map:when test="/system/platform[.='windows']">
        <map:call name="filled-form">
          <map:parameter name="query" value="windowsDetails"/>
        </map:call>
      </map:when>
      <!-- ... -->
    </map:select>
  </map:match>

  <map:match pattern="**/linuxDetailsPost.html">
  <!-- ... -->
  </map:match>

  <map:match pattern="**/windowsDetailsPost.html">
  <!-- ... -->
  </map:match>

  <map:match pattern="**/lastPost.html">
    <map:call name="store">
      <map:parameter name="query" value="last"/>
    </map:call>
    <map:act type="write">
      <map:parameter name="from" src="session:/cocoonSurvey"/>
      <map:parameter name="to" src="dbxml:/cocoonSurvey[@id='1234']"/>
    </map:act>
    <map:transform type="read" src="success.html"/>
    <map:serialize/>
  </map:match>
</map:match>

In this example we fill the xml structure that we put in the session with
default data from a db in the beginning, and store the result in a db in the
end. We use a pipe-aware selector to choose between several paths in the
wizard, depending on the last answer (we could achieve even larger
flexibility by making choices based on xpath expressions applied on the
session data). Note that the flow control is based on the same continuation
trick as we used in the "one page form" example. The value of the "continue"
parameter is used for the submit button. Also note that we can use the back
and forward buttons of our browser as much as we want. As soon as you push
the submit button and your data is valid, the data for that page is stored,
and as soon as you push the refresh button on a form page, its current
content will be filled in.

A problem is that if one first says that one is a linux user and submits
the linux data, and then goes back and says that one is a windows user and
fills in the windows data, both the windows and the linux data continue to
be stored. To handle this problem further mechanisms are needed.

<design-issues>
From a component point of view there is not much new in the wizard compared
to the one page form. We use a writable session source, which does not exist
yet but is mainly a repackaging of functionality already in the
SunShineTransformer; we could have used that instead, and the same applies
to the "write" action. We also use a pipe-content-aware selector besides the
pipe-state-aware selector.

As a consequence of always storing a page before showing a new one, the
pages will get misleading names: both the linuxDetails and the
windowsDetails form pages will have the url systemPost.html. This can be
handled by having obscure names like large numbers so that no one notices,
or by using redirect, which is considered bad. Are there other alternatives
in the http protocol? Something like an internal redirect?
</design-issues>

Continuations
-------------

Why do I keep using the term "continuation" for the trick of sending the
address of the next page as an argument to the current generated page? Isn't
a continuation an object that contains the whole current state of the
program? Actually both descriptions are true, it all depends on what's in
your programming language.

The sitemap together with the continue parameter is a programming language,
although a quite small one. It contains selection: select and match, and it
contains global variables: session and request attributes, writable
resources and so on, and by using the continuation parameter we introduce a
goto construction. Thus: a small programming language. In such a small
language a continuation is just a program pointer, in the sitemap case an
url. The control structures of structured programming (sequence, selection
and repetition) can be translated to goto and selection (and the other way
around). So it would be rather easy to translate the structured programming
concepts mentioned to a sitemap like the one above.

If we extend our language with (possibly recursive) functions, we need to
take care of a call stack of program pointers also. This stack of uri:s
could be stored in a continuation object or in a hidden form field. If we
introduce local variables in our language we need to put these in the
continuation object as well. Local variables introduce the possibility for
"what if" scenarios, i.e. that you can have several independent instances of
the same form page. For some kinds of webapps, e.g. shopping carts and
checkout sequences, this kind of behavior is IMHO harmful (cf. discussion
between Ovidiu and me:
http://marc.theaimsgroup.com/?l=xml-cocoon-dev&m=101856443021128&w=2,
http://marc.theaimsgroup.com/?l=xml-cocoon-dev&m=101858926504890&w=2,
http://marc.theaimsgroup.com/?l=xml-cocoon-dev&m=101861458122598&w=2,
http://marc.theaimsgroup.com/?l=xml-cocoon-dev&m=101870096719351&w=2). I
would propose that in a large class of webapps you don't need more than the
continue parameter described above. Another aspect is of course that the
sitemap with continuation parameter sucks as a web application programming
language. So you would anyway need a better language for describing
complicated flow, but I still wonder if something as powerful as
continuations with local variables is needed.
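
To make the "continuation is just a program pointer" reading concrete, here
is a toy model in Python. Everything in it is an assumption made for
illustration: the browser round trips are collapsed into a loop, the session
is a plain dict playing the role of the global variables, and the handler
names are invented. Each handler stores its posted data and returns the url
of the next page, which is the goto.

```python
def run(pages, start, answers):
    """Follow "continue" urls through a page table until one returns None.

    pages maps a url to a function(session, posted) -> next url or None.
    """
    session, url, visited = {}, start, []
    while url is not None:
        visited.append(url)
        url = pages[url](session, answers.get(url, {}))
    return session, visited


def system_post(session, posted):
    session["system"] = posted
    # Selection plus goto: pick the next program pointer from the data.
    if posted.get("platform") == "linux":
        return "linuxDetailsPost.html"
    return "windowsDetailsPost.html"


def details_post(name):
    def handler(session, posted):
        session[name] = posted
        return None  # end of this toy wizard
    return handler


PAGES = {
    "systemPost.html": system_post,
    "linuxDetailsPost.html": details_post("linuxDetails"),
    "windowsDetailsPost.html": details_post("windowsDetails"),
}

session, visited = run(PAGES, "systemPost.html",
                       {"systemPost.html": {"platform": "linux"},
                        "linuxDetailsPost.html": {"distro": "debian"}})
print(visited)
```

Note that there is no call stack and there are no local variables here; the
whole state is the url plus the session, which is the point argued above.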

Using the hammer anti-pattern
-----------------------------

After all the recent discussion about what you are not supposed to do in the
sitemap I cannot help feeling like provoking a little ;)

Principle: "Everything can and should be done in the sitemap" ;)

As we saw in the examples above we always have to combine the handling of
the input of one form with the construction of the next form. This is
because we need to have a match statement that surrounds the two. It would
be more natural to combine the form generation with its input handler:

<map:resource name="form-handling">
  <map:call name="filled-form">
    <map:parameter name="query" value="{query}"/>
  </map:call>
  <fm:label name="{query}Post.html"/>
  <map:call name="store">
    <map:parameter name="query" value="{query}"/>
  </map:call>
</map:resource>

Here some kind of environment is supposed to detect that a serialize is
followed by a generate, and make a continuation available. "fm:label" is a
way to give a name to the continuation if one doesn't want an automatic one.

Here is our example again:

<map:match pattern="cocoonSurvey/**">
  <map:parameter name="wizard" value="cocoonSurvey"/>

  <fm:sequence uri-prefix="cocoonSurvey">

    <fm:label name="start.html"/>
    <map:act type="write">
      <map:parameter name="from" src="dbxml:/cocoonSurvey[@id='default']"/>
      <map:parameter name="to" src="session:/cocoonSurvey"/>
    </map:act>

    <map:call name="form-handling">
      <map:parameter name="query" value="personal"/>
    </map:call>

    <map:call name="form-handling">
      <map:parameter name="query" value="system"/>
    </map:call>

    <fm:select type="xpath">
      <fm:when test="/system/platform[.='linux']">
        <map:call name="form-handling">
          <map:parameter name="query" value="linuxDetails"/>
        </map:call>
      </fm:when>
      <fm:when test="/system/platform[.='windows']">
        <map:call name="form-handling">
          <map:parameter name="query" value="windowsDetails"/>
        </map:call>
      </fm:when>
      <!-- ... -->
    </fm:select>

    <!-- ... -->

    <map:act type="write">
      <map:parameter name="from" src="session:/cocoonSurvey"/>
      <map:parameter name="to" src="dbxml:/cocoonSurvey[@id='1234']"/>
    </map:act>

    <map:transform type="read" src="success.html"/>
    <map:serialize/>

  </fm:sequence>
</map:match>

The new construction "fm:sequence" handles the url like "map:mount". It also
automates continuation handling by analyzing its content pipeline components
and making an url to each generator available in a continuation parameter
for the components earlier in the pipe up to the next generator.

Implementation
--------------

If we exclude the last section, what do we need to implement if we would
like to have what I describe above? Not much actually: we need the
"xrequest" generator, a validation transformer and pipe aware selectors. It
could be nice to have an "xForm2xsl.xsl" stylesheet also, but it is not
necessary; one can write form stylesheets manually. The read and write
transformers and the new writable sources are not necessary at all, as there
are already transformers that do the same work, though maybe not as smoothly
in the proposed framework.

Torsten has already implemented most of the functionality of an "xrequest"
generator, but as an action; it would be fairly easy to repackage it as a
generator. I have an implementation that I can donate "as is", although it
would need some more polishing.  Torsten and Ivelin have also implemented
some different variants of validation transformers, and I contributed a
prototype implementation of pipe-aware selection a while ago. Both would
need some small adjustments to work together as described above.

The great unsolved problems
---------------------------

There are probably tons of unsolved problems, but two particularly tricky
ones are:
* Order restrictions: e.g. a user cannot go back after having committed a
certain page.
* There might be several paths from the first to the last page in the
wizard; only the data submitted in pages along the path that was finally
chosen should be stored in the end. An instance of this problem was
described at the end of the "Wizards" section.

I have no solution to these problems, but I think that they become easier to
solve if there is a simple and explicit connection between forms and
submitted data.


That's more than enough ;)

What do you think?

/Daniel Fagerstrom


