The Cocoon sitemap concept that is currently implemented is more than two years old. Not much, but older than some W3C recommendations for the XML model.
The sitemap was developed incrementally over these two years. It started as a draft, and things were added, moved and patched as they were used, implemented and challenged by real-life operation. I think it's now time to go back and summarize all this.

DISCLAIMER: the words below *DO* *NOT* want to be a description of the *actual* behavior of the sitemap implementation, but a description of what I find good design. I'd like to discuss this document before attacking the implementation issues and, more importantly, the eventual back-incompatible changes.

IMPORTANT: back compatibility will be judged more important than design elegance at this time, although I'll make a great effort to maximize both.

                              - o -

A sitemap is a description of the processing associated with handling a partition of the request space.

<note>Note that 'request space' is more than 'URI space' since the same URIs could belong to different sitemaps depending on some other request or environment parameters.</note>

A sitemap contains:

 - components description
 - resource views
 - pipeline resources
 - action sets
 - pipelines

<note>I'll concentrate on the 'pipelines' and 'views' sections which are the most semantically complex</note>

If a sitemap contains more than one pipeline, the request is given to the first one, then, if not processed, given to the second and so forth until the end. If the end is reached with no processing, a '404' HTTP error is thrown.

                              ----

A pipeline is a description of the processing stages required to handle the requests associated with the covered request space.

A pipeline is made of an ordered collection of pipeline components and one 'error-handler'.

There are 3 groups of components:

 - direct   -> generator, transformer, serializer, aggregator, reader
 - indirect -> action
 - support  -> matcher, selector, mount, redirector

Each request processed by a sitemap must involve, at minimum, one (and only one) generator and one (and only one) serializer performing the operation.
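
The fall-through rule for multiple pipelines described above (first pipeline, then the second, then a '404' if nobody processed the request) can be sketched in a few lines. This is only an illustration in Python, not Cocoon code: `Pipeline`, `NotFound`, `process_request` and the two sample pipelines are hypothetical names, and a pipeline is assumed to signal "not processed" by returning `None`.

```python
from typing import Callable, Optional, Sequence

# Hypothetical model: a pipeline is a callable that returns a response
# when it processes the request, or None when it does not.
Pipeline = Callable[[str], Optional[str]]


class NotFound(Exception):
    """Stands in for the '404' HTTP error."""
    status = 404


def process_request(pipelines: Sequence[Pipeline], request: str) -> str:
    """Offer the request to each pipeline in order; the first one that
    processes it wins. If none does, raise the '404' stand-in."""
    for pipeline in pipelines:
        response = pipeline(request)
        if response is not None:
            return response
    raise NotFound(request)


# Two sample pipelines (assumptions, for illustration only).
def docs_pipeline(request: str) -> Optional[str]:
    return "docs page" if request.startswith("docs/") else None


def images_pipeline(request: str) -> Optional[str]:
    return "image" if request.endswith(".png") else None
```

With these definitions, `process_request([docs_pipeline, images_pipeline], "logo.png")` is skipped by the first pipeline and handled by the second, while a request that neither handles raises `NotFound`.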
The 'error-handler' should be considered a parallel pipeline triggered when an error is generated during request processing. This is a special pipeline where no generator is required since there is an implicit error-generator.

                              ----

A pipeline may contain *all* components at the first level.

<note who="sm">having 'mount' and 'redirect-to' as first-level components doesn't really make sense, but it doesn't hurt either, so why should we limit it?</note>

The order of the components defines their processing precedence. Pipeline processing starts from the top-most component and stops when one of the 'ending' components is found:

 - serializer, reader

If an error is triggered during pipeline processing, the output is detached from the pipeline and attached to the error-handling subpipeline.

<question who="SM">what happens when the client has already received part of the response (say during aggregation?)</question>

                              - o -

Empty components
----------------

Generators:

Transformers:

Serializers:

Readers:

Mounters:

Redirectors:

<note who="sm">will have to fill these</note>

                              - o -

Nesting components
------------------

+ Aggregators are special generators that merge several content parts into one. Each part can be seen as a pipeline fragment. Allowed components are:

 - generator
 - transformer
 - selector

All other components are not allowed. Example of use:

 <aggregate>
  <part>
   ...
  </part>
  <part>
   ...
  </part>
 </aggregate>

<question>should we allow matchers in aggregator's parts?</question>

+ Selectors are conditional components that route the processing to one of two or more choices depending on some selection logic. They can include all components, unless they are inside an aggregator, in which case they can include only generators and transformers. Example of use:

 <select>
  <when test="...">
   ...
  </when>
  <when test="...">
   ...
  </when>
  <otherwise>
   ...
  </otherwise>
 </select>

+ Matchers are conditional components that 'intercept' the request and route it to the pipeline components they include. Moreover, they are able to pass environment tokens to the included components. Matchers may contain all components. Example of use:

 <match pattern="...">
  ...
 </match>

+ Actions are sideways-operating components that work on the request environment but don't participate directly in the creation of a pipeline, nor perform pipeline routing. Example of use:

 <act .../>

 <act>
  ...
 </act>

<question>why is nesting required for actions? what's the benefit?</question>

                              - o -

Token Expansion
---------------

Matchers and Actions can pass tokens to their internal components, and component definitions can use those tokens through expansion rules.

The expansion syntax follows XSLT attribute-value templates and uses curly braces, for example:

 <match pattern="something/*">
  <mount src="file:///home/www/something/{1}"/>
 </match>

Also, in the case of nested matchers, an XPath-like syntax is used:

 <match pattern="something/*">
  <match type="browser" pattern="name('Mozilla ?\\?*')">
   <mount src="file:///home/www/mozilla-{1}-{2}/{../1}"/>
  </match>
 </match>

where {1} indicates the first token of the inner-most matcher and {../1} the first token of the enclosing matcher (here the outer-most one).

A deeper example:

 <match type="load" pattern="[0.0|2.0|inf]-[low|medium|high]">
  <match pattern="something/*">
   <match type="browser" pattern="name('Mozilla ?\\?*')">
    <mount src="file:///home/www/{../../1}/mozilla-{1}-{2}/{../1}"/>
   </match>
  </match>
 </match>

The curly braces must indicate the 'name' of the token that we wish to expand, but what counts as a name depends on the actual component implementation.

                              - o -

Resource Views
--------------

Views are pipeline exit points that cut across the request space. They are similar to the concept of 'aspect' in AOP and represent a way to group pipeline behaviors that cross-cut resources differently from what the URI space suggests.
For example, a 'content view' of a resource could return the XML content before transformations, or a 'schema view' could return a schema that can be used to validate the content returned in the 'default view'. The 'semantic view' might return a resource description document, the 'hyperlink view' might return the list of hyperlinks that start from that resource, etc.

Views describe terminating pipeline fragments and must not contain a generator, since the original pipeline acts as the generator. They can be pictured as parallel pipelines explicitly triggered by user request (unlike the semantically equivalent pipeline error-handler, which is triggered by processing errors).

Each view indicates its name and the location where it starts. The pipeline fragment 'before' this location acts as a 'generator' for the view.

Locations can be ordinal locations (first|last) or named locations. In the latter case, the 'label' attribute is used to indicate the location.

In the example sitemap:

 <map:views>
  <map:view name="view1" from-position="content">
   <map:serialize type="s2"/>
  </map:view>
  <map:view name="view2" from-position="last">
   <map:transform type="t3" src="..."/>
   <map:serialize type="s3"/>
  </map:view>
 </map:views>

 <map:pipelines>
  <map:pipeline>
   <map:generate type="g1" src="..."/>
   <map:transform type="t1" src="..." map:label="content"/>
   <map:transform type="t2" src="..."/>
   <map:serialize type="s1"/>
  </map:pipeline>
 </map:pipelines>

for any URI, the views will produce:

 default (no view): g1 -> t1 -> t2 -> s1
 view1:             g1 -> t1 -> s2
 view2:             g1 -> t1 -> t2 -> t3 -> s3

NOTE: the ordinal positions mean:

 + first ----> stop right after the generator
 + last -----> stop right before the serializer

and do not actually mean to stop at the first or last component of the pipeline.

                              - o -

Structure Validation
--------------------

The sitemap should be checked for structural inconsistencies or mistakes at load-time.
Checks include:

 1) all handled requests must pass through a (generator|serializer) pair or a reader.
 2) readers must never find the other direct components during pipeline processing.
 3) mount and redirect-to must never be found after a direct component.

                              - o -

Complex example of a valid pipeline (without error-handling):

 <pipeline>
  <act/>
  <select>
   <when>
    <match>
     <generate/>
     <act/>
     <serialize/>
    </match>
   </when>
   <when>
    <act>
     <mount/>
    </act>
   </when>
   <otherwise>
    <read/>
   </otherwise>
  </select>
  <act>
   <generate/>
  </act>
  <match>
   <transform/>
  </match>
  <serialize/>
 </pipeline>

--
Stefano Mazzocchi <[EMAIL PROTECTED]>
One must still have chaos in oneself to be able to give birth to a dancing star. (Friedrich Nietzsche)