[DAISY] Updated: Cocoon Sitemap internals

daisy Thu, 06 Oct 2005 04:59:28 -0700

A document has been updated:

http://cocoon.zones.apache.org/daisy/documentation/732.html


Document ID: 732
Branch: main
Language: default
Name: Cocoon Sitemap internals (unchanged)
Document Type: Document (unchanged)
Updated on: 10/6/05 11:55:20 AM
Updated by: Helma van der Linden

A new version has been created, state: publish

Parts
=====
Content
-------
This part has been updated.
Mime type: text/xml (unchanged)
File name:  (unchanged)
Size: 10958 bytes (previous version: 15623 bytes)
Content diff:
(24 equal lines skipped)
    <p>In Cocoon 2.2 the sitemap is internally represented by a tree that 
contains a
    node for each matcher, generator, transformer, serializer and other 
components
    used in the sitemap. This process is executed at Cocoon startup and each 
time
--- the sitemap is changed and needs to be reloaded.<br/>
--- The actual process is done by the TreeProcessor. It builds an sitemap object
--- tree and creates a ServiceManager. This is done for each sitemap and 
subsitemap.
--- </p>
+++ the sitemap is changed and needs to be reloaded. This is done for each 
sitemap
+++ and subsitemap.</p>
    
    <p class="note">Here "tree" means an Abstract Syntax Tree as is commonly 
meant
    in parsers: a tree of "executable objects", which is built by parsing the
(58 equal lines skipped)
    
    <p>We will now go over the process again but this time in more detail.</p>
    
+++ <h2>Background information</h2>
+++ 
+++ <p>The main idea of the TreeProcessor is that each kind of instruction (e.g.
+++ &lt;map:act&gt;, &lt;map:generate&gt;, etc.) is described by two classes:</p>
+++ 
+++ <ul>
+++ <li>a ProcessingNode, the runtime object that will execute the 
instruction,</li>
+++ <li>a ProcessingNodeBuilder, responsible for creating the ProcessingNode 
with
+++ the appropriate data and/or childnodes, extracted from attributes, child
+++ elements, etc.</li>
+++ </ul>
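As an illustration only, the node/builder pairing described above can be sketched in Java as follows. The types here (Element, Node, NodeBuilder, GenerateNode, GenerateNodeBuilder) are simplified, hypothetical stand-ins and not the actual Cocoon or Avalon interfaces; the sketch only shows that the builder reads the attributes at build time and that the resulting node carries just the data it needs at run time.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;

public class NodeBuilderSketch {

    /** Hypothetical stand-in for one parsed sitemap element (not Avalon's Configuration). */
    static final class Element {
        final String name;
        final Map<String, String> attributes;

        Element(String name, Map<String, String> attributes) {
            this.name = name;
            this.attributes = Collections.unmodifiableMap(new HashMap<>(attributes));
        }
    }

    /** Run-time object: executes one instruction. */
    interface Node {
        boolean invoke(StringBuilder trace);
    }

    /** Build-time object: turns a parsed element into a run-time node. */
    interface NodeBuilder {
        Node buildNode(Element element);
    }

    /** Run-time node for a generate-like instruction; it only keeps the data it needs. */
    static final class GenerateNode implements Node {
        private final String src;

        GenerateNode(String src) {
            this.src = src;
        }

        @Override
        public boolean invoke(StringBuilder trace) {
            trace.append("generate:").append(src).append(';');
            return false; // a generator alone does not finish a pipeline
        }
    }

    /** Builder that extracts the "src" attribute and creates the node. */
    static final class GenerateNodeBuilder implements NodeBuilder {
        @Override
        public Node buildNode(Element element) {
            return new GenerateNode(element.attributes.get("src"));
        }
    }

    /** Builds one node from one element and invokes it once. */
    static String demo() {
        Map<String, String> attrs = new HashMap<>();
        attrs.put("src", "page.xml");
        Node node = new GenerateNodeBuilder().buildNode(new Element("generate", attrs));
        StringBuilder trace = new StringBuilder();
        node.invoke(trace);
        return trace.toString();
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```

Keeping the builder separate from the node means that parsing and validation code is not needed once the tree is built.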
+++ 
+++ <p>Implementing the sitemap language then translates into writing the
+++ appropriate ProcessingNodeBuilder classes for all statements of the 
language.
+++ Although we only have one language, the design was made with different
+++ languages in mind, which allows for easy extensibility.</p>
+++ 
+++ <p>The whole configuration document is actually a ComponentSelector for
+++ TreeBuilder implementations. The SitemapLanguage class is the 
implementation of
+++ TreeBuilder for the sitemap language. A TreeBuilder builds a processing node
+++ tree based on a file (e.g. sitemap.xmap) that is read into an Avalon 
configuration
+++ (this was chosen for its ease of use compared to raw DOM).</p>
+++ 
+++ <h3>Roles, selectors and &lt;map:components&gt;</h3>
+++ 
+++ <p>The &lt;map:components&gt; section of a sitemap is used to configure a
+++ ComponentManager (child of either the parent sitemap's manager or the main
+++ manager), and the &lt;roles&gt; section of the TreeProcessor configuration
+++ defines a RoleSelector that is used by this manager. For the sitemap, it 
defines
+++ the shorthands that will map &lt;map:generators&gt;, &lt;map:selectors&gt;,
+++ etc., to a special "ComponentsSelector" (yes, the name could be better).</p>
+++ 
+++ <p>This ComponentsSelector handles the &lt;map:components&gt; syntax ("src" 
and
+++ not "class", etc.), and holds the "default" attribute, view labels and mime
+++ types for each hint (these are not known by the components themselves).</p>
+++ 
    <h2>Phase 1: Build the sitemap tree</h2>
    
--- <p>The TreeProcessor is set to get the Processor role in the cocoon.roles 
file.
--- <br/>
--- During the configuration of the TreeProcessor an ExtendedComponentSelector
--- (builderSelector) is set up using the configuration file
--- "treeprocessor-builtins.xml".</p>
+++ <p>While calling TreeProcessor.process(environment), i.e. the method that 
takes
+++ the environment, applies the sitemap on it and produces the output, the
+++ following things happen:</p>
    
--- <p>&gt; <br/>
--- &gt; While calling TreeProcessor.process(environment), i.e. the method that
+++ <ul>
+++ <li>The method setupRootNode is called (if necessary) and the 
builderSelector is
+++ used to get a TreeBuilder (builder). The build method on the builder is 
called
+++ with the sitemap as argument and a tree of ProcessingNodes corresponding to 
the
+++ sitemap is returned.</li>
+++ <li>The sitemap is then executed by calling the invoke method for the root 
node.
+++ </li>
+++ </ul>
+++ 
+++ <p>Within the DefaultTreeBuilder (during execution of the build method) a
+++ RoleManager is set up based on the "roles" section of
+++ "treeprocessor-builtins.xml" and an ExtendedComponentSelector is set up 
based on
+++ the "nodes" section. The "nodes" section associates the sitemap concepts to 
the
+++ appropriate ProcessingNodeBuilders. It also configures a 
ProcessingNodeBuilder
+++ so that it knows what type of children it is allowed to have and which ones 
are
+++ forbidden.</p>
+++ 
+++ <p>The build process starts (in the method createTree) by creating the
+++ ProcessingNodeBuilder (rootBuilder) that corresponds to the root element in 
the
+++ sitemap, associating the rootBuilder with the current TreeBuilder and calling the
+++ rootBuilder.buildNode method with the configuration tree created from the
+++ sitemap.</p>
+++ 
+++ <p>The FooNodeBuilder.buildNode method creates and returns a FooNode object
    <br/>
--- &gt; takes the environment, applies the sitemap on it and produces the  
output,
--- <br/>
--- &gt; the following things happen:<br/>
--- &gt; <br/>
--- &gt; * The method setupRootNode is called (if necesary) and the<br/>
--- &gt; builderSelector is used to get a TreeBuilder (builder). The build 
method
--- <br/>
--- &gt; on the builder is called with the sitemap as argument and a tree of 
<br/>
--- &gt; ProcessingNodes corresponding to the sitemap is returned.<br/>
--- &gt; <br/>
--- &gt; * The sitemap is then executed by calling the invoke method for the 
root
--- <br/>
--- &gt; node.<br/>
--- &gt; <br/>
--- &gt; Building the tree<br/>
--- &gt; -----------------<br/>
--- &gt; <br/>
--- &gt; In Cocoon using "treeprocessor-builtins.xml" SitemapLanguage that  
extends
--- <br/>
--- &gt; DefaultTreeBuilder is used as TreeBuilder. Within the<br/>
--- &gt; DefaultTreeBuilder (during execution of the build method) a RoleManager
--- <br/>
--- &gt; is set up based on the "roles" section of "treeprocessor-builtins.xml"
--- <br/>
--- &gt; and a ExtendedComponentSelector is set up based on the "nodes" section.
--- <br/>
--- &gt; The "nodes" section associates the sitemap concepts to the appropriate
--- <br/>
--- &gt; ProcessingNodeBuilders. It also configures a ProcessingNodeBuilder so
--- <br/>
--- &gt; that it knows what type of children it is allowed to have and which 
ones
--- <br/>
--- &gt; that are forbidden.<br/>
--- &gt; <br/>
--- &gt; The build process starts (in the method createTree) by creating the 
<br/>
--- &gt; ProcessingNodeBuilder (rootBuilder) that corresponds to the root 
element
--- <br/>
--- &gt; in the sitemap, associate the rootBuilder to the current TreeBuilder 
and
--- <br/>
--- &gt; call the rootBuilder.buildNode method with the configuration tree  
created
--- <br/>
--- &gt; from the sitemap.<br/>
--- &gt; <br/>
--- &gt; The FooNodeBuilder.buildNode method creates and returns a FooNode 
object
--- <br/>
    &gt; and recursively creates the child nodes of the object by creating 
and<br/>
    &gt; executing the corresponding builder objects.<br/>
    &gt; <br/>
(7 equal lines skipped)
    &gt; stored in the context object (other things happen as well). When a 
<br/>
    &gt; SerializeNode is invoked, the current Pipeline is processed and the 
<br/>
    &gt; output is stored in the environment.<br/>
--- &gt; <br/>
--- &gt; ----------------------------------<br/>
--- &gt; <br/>
--- &gt; &lt;sidenote&gt;<br/>
--- &gt; I builded a Cocoon inspired signal processing framework about a year 
ago
--- <br/>
--- &gt; and tried to reuse Sylvain's framework. While most of it is very<br/>
--- &gt; general, there are some Cocoon specific details in the Context and 
<br/>
--- &gt; Environment interfaces, so I ended up in building something similar but
--- <br/>
--- &gt; simpler instead.<br/>
--- &gt; &lt;/sidenote&gt;<br/>
--- &gt; <br/>
--- &gt; HTH<br/>
--- &gt; <br/>
--- &gt; /Daniel<br/>
--- &gt; <br/>
    &gt;</p>
    
--- <p>Nice explanation, Daniel! I'm happy to see that other people understand
--- <br/>
--- this.</p>
--- 
--- <p>However, I'd like to add some background to this to explain why it does
--- <br/>
--- work this way, some additional details and what we could eventually <br/>
--- refactor to ease the migration to Fortress.</p>
--- 
--- <p>I started the TreeProcessor for two reasons.</p>
--- 
--- <p>The first reason was that the sitemap engine at that time was compiled 
<br/>
--- into a Java class like XSP. But the sitemap logicsheet was very complex 
<br/>
--- and recompiling a large sitemap took ages (more than 20 seconds on the <br/>
--- samples sitemap), leading to painful try/fail cycles. We needed <br/>
--- something faster.</p>
--- 
--- <p>The second reason was that at that time (autumn 2001), a number of RTs 
<br/>
--- were written related to what we called "flowmaps" and later led to <br/>
--- flowscript. These RTs were describing new ways to build a pipeline to <br/>
--- take flow into account, but no real code was written to test these <br/>
--- ideas, because deeply changing the way the sitemap code was generated <br/>
--- was very painful: finding its way into the 2000-lines XSLT was not easy.</p>
--- 
--- <p>So I decided to consider another approach, based on an evaluation tree 
<br/>
--- (hence TreeProcessor), each node in the tree corresponding to a xxxmap <br/>
--- instruction (sitemap or flowmap).</p>
--- 
--- <p>An additional motivation for me was that it would require me to heavily
--- <br/>
--- use the Avalon concepts and therefore increase my knowledge in this <br/>
--- area. This was mostly written at home, and my wife deserves many thanks, 
<br/>
--- because this thing took my brain day and night for more than 2 months 
;-)</p>
--- 
--- <p>The main idea of the TreeProcessor is that each kind of instruction <br/>
--- (e.g. &lt;map:act&gt;, &lt;map:generate&gt;, etc) is described by two 
classes :
--- <br/>
--- - a ProcessingNode, the runtime object that will execute the 
instruction,<br/>
--- - a ProcessingNodeBuilder, responsible for creating the ProcessingNode <br/>
--- with the appropriate data and/or childnodes, extracted from attributes, 
<br/>
--- child elements, etc.</p>
--- 
--- <p>Implementing the sitemap language then translates into writing the <br/>
--- appropriate ProcessingNodeBuilder classes for all statements of the <br/>
--- language. But since we were discussing flowmaps and other pipeline <br/>
--- construction approaches, I wanted this to be easily extensible, and even 
<br/>
--- allow the simultaneous use of different languages in the system <br/>
--- (sitemap/flowmap). This is why &lt;map:mount&gt; supports an additional 
<br/>
--- undocumented and never used "language" attribute (see MountNodeBuilder)</p>
--- 
--- <p>So the TreeProcessor configuration contains the definition of <br/>
--- TreeBuilder implementations for various "languages", the sitemap being <br/>
--- the only one we have today. The whole configuration document is actually 
<br/>
--- a ComponentSelector for TreeBuilder implementations. The SitemapLanguage 
<br/>
--- class is the implementation of TreeBuilder for the sitemap language. A <br/>
--- TreeBuilder builds a processing node tree based on a file (e.g. <br/>
--- sitemap.xmap) that is read in an Avalon configuration (this was chosen <br/>
--- for its ease of use compared to raw DOM).</p>
--- 
--- <p>&lt;fortress-migration&gt;<br/>
--- Obviously, this initial selector can be removed and the sitemap language 
<br/>
--- be the only one available, as we now have the flowscript and it's very <br/>
--- unlikely that we will redesign a new pipeline language in the near (or <br/>
--- even distant) future.<br/>
--- &lt;/fortress-migration&gt;</p>
--- 
--- <p>Roles, selectors and &lt;map:components&gt;<br/>
--- -------------------------------------</p>
--- 
--- <p>The &lt;map:components&gt; section of a sitemap is used to configure a 
<br/>
--- ComponentManager (child of either the parent sitemap's manager or the <br/>
--- main manager), and the &lt;roles&gt; section of the TreeProcessor <br/>
--- configuration defines a RoleSelector that is used by this manager. For <br/>
--- the sitemap, it defines the shorthands that will map &lt;map:generators&gt;,
--- <br/>
--- &lt;map:selectors&gt;, etc, to a special "ComponentsSelector" (yeah, the 
name
--- <br/>
--- could be better).</p>
--- 
    <p>This ComponentsSelector handles the &lt;map:components&gt; syntax ("src" 
and
    <br/>
    not "class", etc), and holds the "default" attribute, view labels and <br/>
    mime types for each hint (these are not known by the components 
themselves).</p>
    
--- <p>&lt;fortress-migration&gt;<br/>
--- AFAIU, Fortress allows defaults for a collection of components <br/>
--- implementing the same role, but I don't know how we can handle the <br/>
--- additional "label" and "mime-type", which are not handled by the <br/>
--- component itself.</p>
--- 
--- <p>Can we imagine a "fake" selector that route calls to select() to the 
<br/>
--- manager and handle these additional information on its own?<br/>
--- &lt;/fortress-migration&gt;</p>
--- 
    <p>Building the processing tree<br/>
    ----------------------------</p>
    
(33 equal lines skipped)
    collected in a list that the TreeProcessor traverses when needed <br/>
    (sitemap change or system disposal).</p>
    
--- <p>Great care has been taken to cleanly separate build-time and run-time 
<br/>
--- code and data, to ensure the smallest memory occupation and the fastest 
<br/>
--- possible execution. This led this intepreted engine to be a bit faster <br/>
--- at runtime than the compiled one (build time is more than 20 times 
faster).</p>
+++ <p>Great care has been taken to cleanly separate build-time and run-time 
code
+++ and data, to ensure the smallest memory occupation and the fastest possible
+++ execution. This led this interpreted engine to be a bit faster at runtime 
than
+++ the compiled one (build time is more than 20 times faster).</p>
    
--- <p>&lt;fortress-migration&gt;<br/>
--- An optimisation that is done and may be relevant to migration to <br/>
--- Fortress is that ThreadSafe components are looked up as part of the tree 
<br/>
--- building and never looked up again later (see e.g. MatchNode). AFAIU, <br/>
--- lifestyle interface no more exist with Fortress, so this optimisation <br/>
--- may be difficult to do, if not impossible.<br/>
--- &lt;/fortress-migration&gt;</p>
+++ <h2>Phase 3: Create the pipeline</h2>
    
--- <p>Building a pipeline<br/>
--- -------------------</p>
+++ <p>When a request has to be processed, the TreeProcessor calls invoke() on 
the
+++ root node of the evaluation tree. This method has two parameters: <br/>
+++ the environment defining the request, and an InvokeContext that mainly 
holds the
+++ pipeline that is being built and the stack of sitemap variables.</p>
    
--- <p>When a request has to be processed, the TreeProcessor calls invoke() on
--- <br/>
--- the root node of the evaluation tree. This method has two parameters: <br/>
--- the environment defining the request, and an InvokeContext that mainly <br/>
--- holds the pipeline that is being built and the stack of sitemap 
variables.</p>
--- 
--- <p>The invoke method executes all processing nodes (depth first) until one
--- <br/>
--- them returns "true", meaning that a pipeline was successfully built. <br/>
+++ <p>The invoke method executes all processing nodes (depth first) until one 
of them
+++ returns "true", meaning that a pipeline was successfully built. <br/>
    Examples of nodes that return true are serializers, readers and 
redirects.</p>
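The traversal rule described above (depth first, stopping at the first node that returns "true") can be illustrated with a small Java sketch. This is not the Cocoon implementation: Node, step, terminal and match are hypothetical names, and a plain list of strings stands in for the pipeline held by the InvokeContext.

```java
import java.util.ArrayList;
import java.util.List;

public class InvokeSketch {

    /** Hypothetical stand-in for a processing node; the list plays the role of the pipeline. */
    interface Node {
        boolean invoke(List<String> pipeline);
    }

    /** Adds a component to the pipeline and lets the traversal continue. */
    static Node step(String name) {
        return pipeline -> {
            pipeline.add(name);
            return false;
        };
    }

    /** Ends the pipeline, like a serializer or reader node. */
    static Node terminal(String name) {
        return pipeline -> {
            pipeline.add(name);
            return true;
        };
    }

    /** Matcher-like node: visits its children in order only when it matches. */
    static Node match(boolean matches, Node... children) {
        return pipeline -> {
            if (!matches) {
                return false; // skip the whole subtree
            }
            for (Node child : children) {
                if (child.invoke(pipeline)) {
                    return true; // a pipeline was built, stop the traversal
                }
            }
            return false;
        };
    }

    /** Builds a tiny tree and returns the pipeline assembled by invoking its root. */
    static List<String> demo() {
        Node root = match(true,
            match(false, step("generate-a"), terminal("serialize-a")),
            match(true, step("generate-b"), terminal("serialize-b")),
            match(true, terminal("unreached")));
        List<String> pipeline = new ArrayList<>();
        root.invoke(pipeline);
        return pipeline;
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```

In this sketch the first subtree is skipped because it does not match, the second one completes the pipeline, and the third one is never visited.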
    
--- <p>If the environment is external, the pipeline is executed as soon as it 
<br/>
--- is ended (i.e. in the reader or serializer node). But if the environment 
<br/>
--- is internal (i.e. a "cocoon:" source), it is not, meaning the pipeline <br/>
--- is returned to the SitemapSource, ready for later execution if requested 
<br/>
+++ <p>If the environment is external, the pipeline is executed as soon as it is
+++ ended (i.e. in the reader or serializer node). But if the environment <br/>
+++ is internal (i.e. a "cocoon:" source), it is not, meaning the pipeline is
+++ returned to the SitemapSource, ready for later execution if requested <br/>
    so (e.g. by a Source.getInputStream()).</p>
    
--- <p>Phew... I finally explained the whole thing in depth. I'm no more the 
<br/>
--- only one to know ;-)<br/>
--- I'll also put this into the wiki.</p>
--- 
    </body>
    </html>


Fields
======
no changes

Links
=====
no changes

Custom Fields
=============
no changes

Collections
===========
no changes
