The Cocoon sitemap concept that is currently implemented is more than
two years old. Not much, but older than some W3C recommendations for the
XML model.

The sitemap was developed in an incremental way over these two years.
It started as a draft and things were added/moved/patched as they were
used, implemented and challenged with real-life operation.

I think it's now time to go back and summarize all this.

DISCLAIMER: the words below are *NOT* meant to be a description of the
*actual* behavior of the sitemap implementation, but a description of
what I consider good design. I'd like to discuss this document before
attacking the implementation issues and, more importantly, the eventual
backward-incompatible changes.

IMPORTANT: backward compatibility will be judged more important than
design elegance at this time, although I'll make a great effort to
maximize both.

                                  - o -

A sitemap is a description of the processing associated with handling a
partition of the request space.

<note>Note that 'request space' is more than 'URI space' since the same
URIs could belong to different sitemaps depending on some other request
or environment parameters.</note>

A sitemap contains:

 - components description
 - resource views
 - pipeline resources
 - action sets
 - pipelines

<note>I'll concentrate on the 'pipelines' and 'views' sections which are
the most semantically complex</note>

If a sitemap contains more than one pipeline, the request is given to
the first one, then, if not processed, given to the second and so forth
until the end.

If the end is reached with no processing, a '404' HTTP error is thrown.
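
The fall-through rule above can be sketched in a few lines of Python.
This is only an illustration of the intended semantics, not the actual
implementation: the names `Pipeline`, `process`, `docs` and `images` are
made up, and a pipeline is modeled as a function that returns None when
it does not process the request.

```python
from typing import Callable, Optional, Sequence, Tuple

# Hypothetical model: a pipeline is a callable that returns the response
# body, or None when the request falls outside what it handles.
Pipeline = Callable[[str], Optional[str]]


def process(pipelines: Sequence[Pipeline], request: str) -> Tuple[int, str]:
    """Offer the request to each pipeline in document order."""
    for pipeline in pipelines:
        body = pipeline(request)
        if body is not None:
            return 200, body  # the first pipeline that processes it wins
    return 404, "not found"  # end reached with no processing


def docs(request: str) -> Optional[str]:
    return "docs page" if request.startswith("docs/") else None


def images(request: str) -> Optional[str]:
    return "image bytes" if request.startswith("images/") else None
```

Calling `process([docs, images], "images/logo.png")` skips the first
pipeline and is answered by the second; a request that neither accepts
ends in the 404 case.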

          ----

A pipeline is a description of the processing stages required to handle
the requests associated with the covered request space.

A pipeline is made of an ordered collection of pipeline components and
one 'error-handler'.

There are 3 groups of components:

 - direct -> generator, transformer, serializer, aggregator, reader
 - indirect -> action
 - support -> matcher, selector, mount, redirector

Each request processed by a sitemap must result in exactly one generator
and exactly one serializer performing the operation.

The 'error-handler' should be considered as a parallel pipeline
triggered when some error is generated during the request processing.
This is a special pipeline where no generator is required since there is
an implicit error-generator.

          ----

A pipeline may contain *all* components at the first level.

<note who="sm">having 'mount' and 'redirect-to' as first level
components doesn't really make sense, but it doesn't hurt either, so why
should we limit it?</note>

The order of the components defines their processing precedence.

Pipeline processing starts from the top-most component and stops when
one of the 'ending' components is found:

 - serializer, reader

If an error is triggered during pipeline processing, the output is
detached from the pipeline and attached to the error-handling
subpipeline.
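
A minimal sketch of these two rules (stop at the first ending component,
switch to the error-handler on failure), again in Python. Everything here
is an assumption made for illustration: a component is modeled as a
(kind, step) pair, the whole output is buffered as a string, and the
error message stands in for the implicit error-generator. Because it
buffers, the sketch sidesteps the case of a partially sent response.

```python
from typing import Callable, List, Tuple

ENDING = {"serializer", "reader"}  # the 'ending' components listed above

# Hypothetical model: a component is (kind, step), where step transforms
# the data flowing through the pipeline and may raise on failure.
Component = Tuple[str, Callable[[str], str]]


def _run(components: List[Component], data: str) -> str:
    for kind, step in components:
        data = step(data)
        if kind in ENDING:
            break  # anything after a serializer or reader is never reached
    return data


def run(components: List[Component], error_handler: List[Component]) -> str:
    """Run components top-down; on error, switch to the error-handler."""
    try:
        return _run(components, "")
    except Exception as exc:
        # Detach from the failed pipeline; the error itself plays the role
        # of the implicit error-generator feeding the error-handler.
        return _run(error_handler, "error: %s" % exc)
```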

<question who="SM">what happens when the client has already received
part of the response (say during aggregation?)</question>


                                    - o -


Empty components
----------------

Generators:
Transformers:
Serializers:
Readers:
Mounters:
Redirectors:

<note who="sm">will have to fill these</note>

                                    - o -


Nesting components
------------------

+ Aggregators are special generators that merge several content parts
into one. Each part can be seen as a pipeline fragment. Allowed
components are:

 - generator
 - transformer
 - selector

all other components are not allowed. Example of use is:

 <aggregate>
  <part>
    ...
  </part>
  <part>
    ...
  </part>
 </aggregate>

<question>should we allow matchers in aggregator's parts?</question>

+ Selectors are conditional components that route the processing among
two or more choices depending on some selecting logic. They can include
all components, unless they are inside an aggregator, in which case they
can include only generators and transformers. Example of use is:

 <select>
  <when test="...">
    ...
  </when>
  <when test="...">
    ...
  </when>
  <otherwise>
   ...
  </otherwise>
 </select>
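
The nesting restrictions for aggregator parts and for selectors inside
them could be checked mechanically. The following Python sketch assumes
the un-prefixed element names used in the examples of this document
(generate, transform, select, ...); `check_aggregate` is a made-up helper,
not part of any existing code.

```python
import xml.etree.ElementTree as ET
from typing import List

PART_ALLOWED = {"generate", "transform", "select"}
SELECT_IN_PART_ALLOWED = {"generate", "transform"}


def check_aggregate(aggregate: ET.Element) -> List[str]:
    """Return the names of elements that may not appear in an <aggregate>."""
    errors = []
    for part in aggregate.findall("part"):
        for child in part:
            if child.tag not in PART_ALLOWED:
                errors.append(child.tag)
            elif child.tag == "select":
                for branch in child:  # each <when> or <otherwise>
                    errors += [c.tag for c in branch
                               if c.tag not in SELECT_IN_PART_ALLOWED]
    return errors
```

An empty result means the aggregate respects the rules; a serializer in
a part, or a reader inside a selector branch, would be reported by name.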

+ Matchers are conditional components that 'intercept' the request and
route it to the pipeline components they include. Moreover, they are
able to pass environment tokens to the included components. Matchers may
contain all components. Example of use:

 <match pattern="...">
   ...
 </match>

+ Actions are sideways-operating components that work on the request
environment but don't participate directly in the creation of a
pipeline, nor perform pipeline routing. Example of use:

 <act .../>
 
 <act>
   ...
 </act>

<question>why is nesting required for actions? what's the
benefit?</question>


                                    - o -


Token Expansion
---------------

Matchers and Actions can pass tokens to their internal components and
component definitions can use those tokens using expansion rules.

The expansion syntax follows XSLT attribute-value templates and uses the
curly braces syntax, for example:

 <match pattern="something/*">
  <mount src="file:///home/www/something/{1}"/>
 </match>

Also, in the case of nested matchers, an XPath-like syntax is used:

 <match pattern="something/*">
  <match type="browser" pattern="name('Mozilla ?\\?*')">
   <mount src="file:///home/www/mozilla-{1}-{2}/{../1}"/> 
  </match>
 </match>

where {1} indicates the first token of the inner-most matcher and {../1}
the first token of the matcher enclosing it. A deeper example:

 <match type="load" pattern="[0.0|2.0|inf]-[low|medium|high]">
  <match pattern="something/*">
   <match type="browser" pattern="name('Mozilla ?\\?*')">
    <mount src="file:///home/www/{../../1}/mozilla-{1}-{2}/{../1}"/> 
   </match>
  </match>
 </match>

The curly braces must indicate the 'name' of the token that we wish to
expand but this depends on the actual component implementation.
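
To make the scoping concrete, here is a small Python sketch of the
expansion rule. It assumes each matcher contributes one dictionary of
named tokens and that every '../' climbs one matcher level; the function
name `expand` and the token values used with it are invented for
illustration, since the real tokens depend on the component.

```python
import re
from typing import Dict, List

# '{' + zero or more '../' + token name + '}'
_TOKEN = re.compile(r"\{((?:\.\./)*)([^{}/]+)\}")


def expand(template: str, scopes: List[Dict[str, str]]) -> str:
    """Expand {name} and {../name} references in a template.

    scopes[-1] holds the tokens of the inner-most matcher, scopes[-2]
    those of its parent, and so on outwards.
    """
    def replace(match):
        depth = len(match.group(1)) // 3  # each '../' is three characters
        return scopes[-1 - depth][match.group(2)]

    return _TOKEN.sub(replace, template)
```

For the deeper example, if the three matchers (outer to inner) passed
the hypothetical tokens {"1": "low"}, {"1": "docs"} and
{"1": "5", "2": "0"}, the mount source would expand to
file:///home/www/low/mozilla-5-0/docs.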


                                    - o -


Resource Views
--------------

Views are pipeline exit points that cut across the request space. They
are similar to the concept of 'aspect' in AOP and represent a way to
group pipeline behaviors that cross-cut resources differently from what
the URI space suggests.

For example, a 'content view' of a resource could return the XML content
before transformations, or a 'schema view' could return a schema that
can be used to validate the content returned in the 'default view'. The
'semantic view' might return a resource description document, the
'hyperlink view' might return the list of hyperlinks that start from
that resource, etc...

Views describe terminating pipeline fragments and must not contain a
generator, since it's the original pipeline that acts as the generator.

They can be pictured as parallel pipelines explicitly triggered by user
request (unlike the semantically equivalent pipeline error-handler which
is triggered by processing errors)

Each view indicates its name and the location where it starts. The
pipeline fragment 'before' this location acts as a 'generator' for the
view.

Locations can be either ordinal locations (first|last) or named
locations. In the latter case, the 'label' attribute is used to indicate
the location.

In the example sitemap:

 <map:views>
  <map:view name="view1" from-position="content">
   <map:serialize type="s2"/> 
  </map:view>

  <map:view name="view2" from-position="last">
   <map:transform type="t3" src="..."/> 
   <map:serialize type="s3"/> 
  </map:view>
 </map:views>

 <map:pipelines>
  <map:pipeline>
   <map:generate type="g1" src="..."/>
   <map:transform type="t1" src="..." map:label="content"/>
   <map:transform type="t2" src="..."/>
   <map:serialize type="s1"/>
  </map:pipeline>
 </map:pipelines>

for any URI, the views will perform:

 default (no view):   g1 -> t1 -> t2 -> s1
 view1:               g1 -> t1 -> s2
 view2:               g1 -> t1 -> t2 -> t3 -> s3

NOTE: the ordinal positions mean:

 + first ----> stop right after the generator
 + last -----> stop right before the serializer

and do not actually mean to stop at the first or last component of the
pipeline.
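
The way a view picks its starting point can be sketched as follows. This
is a simplified Python model, assuming a single linear pipeline with the
generator first and the serializer last; `PIPELINE` mirrors the example
sitemap and `view_chain` is a made-up name.

```python
from typing import List, Optional, Tuple

# Hypothetical model of one linear pipeline: (type, label) per component,
# generator first and serializer last, as in the example sitemap.
PIPELINE: List[Tuple[str, Optional[str]]] = [
    ("g1", None),
    ("t1", "content"),
    ("t2", None),
    ("s1", None),
]


def view_chain(pipeline, from_position: str, view_tail: List[str]) -> List[str]:
    """Return the component types that run when a view is requested."""
    if from_position == "first":
        cut = 1  # stop right after the generator
    elif from_position == "last":
        cut = len(pipeline) - 1  # stop right before the serializer
    else:
        # Named location: keep everything up to and including the label.
        labels = [label for _, label in pipeline]
        cut = labels.index(from_position) + 1
    return [ctype for ctype, _ in pipeline[:cut]] + view_tail
```

With the example's view1 (from 'content', followed by s2) this yields
g1, t1, s2; a view starting from 'first' would keep only g1 before its
own components.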


                                    - o -

Structure Validation
--------------------

The sitemap should be checked for structure inconsistencies or mistakes
at load-time.

Checks include:

 1) all handled requests must pass through a (generator|serializer) pair
or a reader.

 2) readers must never find the other direct components during a
pipeline processing.

 3) mount and redirect-to must never be found after a direct component.
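
As a rough illustration, the three checks could be applied to each
possible processing path through a pipeline. The Python sketch below
makes several assumptions of its own: a path is given as the ordered list
of element names a request would meet, an aggregator counts as a
generator, and a path that ends in mount or redirect-to is considered
delegated and therefore exempt from check 1. `check_path` is a made-up
name.

```python
from typing import List

DIRECT = {"generate", "transform", "serialize", "aggregate", "read"}
DELEGATING = {"mount", "redirect-to"}


def check_path(path: List[str]) -> List[str]:
    """Return the numbers of the checks violated by one processing path."""
    errors = []
    generators = path.count("generate") + path.count("aggregate")
    has_pair = generators == 1 and path.count("serialize") == 1
    has_reader = "read" in path
    # Assumption: a mounted or redirected request is handled elsewhere.
    delegates = any(c in DELEGATING for c in path)
    if not (has_pair or has_reader or delegates):
        errors.append("1")
    if has_reader and any(c in DIRECT and c != "read" for c in path):
        errors.append("2")
    seen_direct = False
    for c in path:
        if c in DIRECT:
            seen_direct = True
        elif c in DELEGATING and seen_direct:
            errors.append("3")
    return errors
```

A real load-time check would first have to enumerate the paths that
matchers and selectors make possible; that part is left out here.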

                                    - o -

complex example of a valid pipeline (without error-handling):

 <pipeline>
  <act/>
  <select>
   <when>
    <match>
     <generate/>
     <act/>
     <serialize/>
    </match>
   </when>
   <when>
    <act>
     <mount/>
    </act>
   </when>
   <otherwise>
    <read/>
   </otherwise>
  </select>
  <act>
   <generate/>
  </act>
  <match>
   <transform/>
  </match>
  <serialize/>
 </pipeline>

-- 
Stefano Mazzocchi      One must still have chaos in oneself to be
                          able to give birth to a dancing star.
<[EMAIL PROTECTED]>                             Friedrich Nietzsche
--------------------------------------------------------------------

