A document has been updated:
http://cocoon.zones.apache.org/daisy/documentation/732.html
Document ID: 732
Branch: main
Language: default
Name: Cocoon Sitemap internals (unchanged)
Document Type: Document (unchanged)
Updated on: 10/6/05 11:55:20 AM
Updated by: Helma van der Linden
A new version has been created, state: publish
Parts
=====
Content
-------
This part has been updated.
Mime type: text/xml (unchanged)
File name: (unchanged)
Size: 10958 bytes (previous version: 15623 bytes)
Content diff:
(24 equal lines skipped)
<p>In Cocoon 2.2 the sitemap is internally represented by a tree that
contains a
node for each matcher, generator, transformer, serializer and other
components
used in the sitemap. This process is executed at Cocoon startup and each
time
--- the sitemap is changed and needs to be reloaded.<br/>
--- The actual process is done by the TreeProcessor. It builds an sitemap object
--- tree and creates a ServiceManager. This is done for each sitemap and
subsitemap.
--- </p>
+++ the sitemap is changed and needs to be reloaded. This is done for each
sitemap
+++ and subsitemap.</p>
<p class="note">Here "tree" means an Abstract Syntax Tree as is commonly
meant
in parsers: a tree of "executable objects", which is built by parsing the
(58 equal lines skipped)
<p>We will now go over the process again but this time in more detail.</p>
+++ <h2>Background information</h2>
+++
+++ <p>The main idea of the TreeProcessor is that each kind of instruction (e.g.
+++ <map:act>, <map:generate>, etc.) is described by two classes:</p>
+++
+++ <ul>
+++ <li>a ProcessingNode, the runtime object that will execute the
instruction,</li>
+++ <li>a ProcessingNodeBuilder, responsible for creating the ProcessingNode
with
+++ the appropriate data and/or childnodes, extracted from attributes, child
+++ elements, etc.</li>
+++ </ul>
+++
+++ <p>Implementing the sitemap language then translates into writing the
+++ appropriate ProcessingNodeBuilder classes for all statements of the
language.
+++ Although we only have one language, the design was made with different
+++ languages in mind, which allows for easy extensibility.</p>
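Editorial illustration (not part of the recorded diff): the node/builder pairing described above can be sketched roughly as follows. This is a minimal sketch, not the Cocoon source. The `Element` class, the `BUILDERS` registry and the simplified method signatures are assumptions made for the example; the real interfaces work with an Avalon configuration, an environment and an InvokeContext.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class NodeBuilderSketch {

    /** Hypothetical stand-in for one parsed sitemap element. */
    static final class Element {
        final String name;
        final Map<String, String> attributes;
        final List<Element> children;

        Element(String name, Map<String, String> attributes, List<Element> children) {
            this.name = name;
            this.attributes = attributes;
            this.children = children;
        }
    }

    /** Runtime object that executes one instruction (simplified signature). */
    interface ProcessingNode {
        boolean invoke(List<String> trace);
    }

    /** Build-time object that creates the runtime node from the parsed element. */
    interface ProcessingNodeBuilder {
        ProcessingNode buildNode(Element element);
    }

    /** Hypothetical registry mapping element names to builders. */
    static final Map<String, ProcessingNodeBuilder> BUILDERS = new HashMap<>();

    /** Looks up the builder for an element and lets it create the node. */
    static ProcessingNode build(Element element) {
        ProcessingNodeBuilder builder = BUILDERS.get(element.name);
        if (builder == null) {
            throw new IllegalArgumentException("No builder for <" + element.name + ">");
        }
        return builder.buildNode(element);
    }

    static {
        // The builder extracts the attribute once; the node only keeps what it needs at runtime.
        BUILDERS.put("generate", element -> {
            String src = element.attributes.get("src");
            return trace -> { trace.add("generate " + src); return false; };
        });
        BUILDERS.put("serialize", element -> trace -> { trace.add("serialize"); return true; });
        // A container builder recursively builds its child nodes.
        BUILDERS.put("pipeline", element -> {
            List<ProcessingNode> children = new ArrayList<>();
            for (Element child : element.children) {
                children.add(build(child));
            }
            return trace -> {
                for (ProcessingNode child : children) {
                    if (child.invoke(trace)) {
                        return true;
                    }
                }
                return false;
            };
        });
    }

    /** Builds a tiny tree and records which nodes were invoked. */
    static List<String> demo() {
        Element generate = new Element("generate", Map.of("src", "page.xml"), List.of());
        Element serialize = new Element("serialize", Map.of(), List.of());
        Element pipeline = new Element("pipeline", Map.of(), List.of(generate, serialize));
        List<String> trace = new ArrayList<>();
        build(pipeline).invoke(trace);
        return trace;
    }

    public static void main(String[] args) {
        System.out.println(demo()); // prints [generate page.xml, serialize]
    }
}
```

In this sketch, supporting a new instruction means registering one more builder, which mirrors the extensibility argument made in the text.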
+++
+++ <p>The whole configuration document is actually a ComponentSelector for
+++ TreeBuilder implementations. The SitemapLanguage class is the
implementation of
+++ TreeBuilder for the sitemap language. A TreeBuilder builds a processing node
+++ tree based on a file (e.g. sitemap.xmap) that is read into an Avalon
Configuration
+++ (this was chosen for its ease of use compared to raw DOM).</p>
+++
+++ <h3>Roles, selectors and <map:components></h3>
+++
+++ <p>The <map:components> section of a sitemap is used to configure a
+++ ComponentManager (child of either the parent sitemap's manager or the main
+++ manager), and the <roles> section of the TreeProcessor configuration
+++ defines a RoleSelector that is used by this manager. For the sitemap, it
defines
+++ the shorthands that will map <map:generators>, <map:selectors>,
+++ etc., to a special "ComponentsSelector" (yes, the name could be better).</p>
+++
+++ <p>This ComponentsSelector handles the <map:components> syntax ("src"
and
+++ not "class", etc.), and holds the "default" attribute, view labels and mime
+++ types for each hint (these are not known by the components themselves).</p>
+++
<h2>Phase 1: Build the sitemap tree</h2>
--- <p>The TreeProcessor is set to get the Processor role in the cocoon.roles
file.
--- <br/>
--- During the configuration of the TreeProcessor an ExtendedComponentSelector
--- (builderSelector) is set up using the configuration file
--- "treeprocessor-builtins.xml".</p>
+++ <p>While calling TreeProcessor.process(environment), i.e. the method that
takes
+++ the environment, applies the sitemap on it and produces the output, the
+++ following things happen:</p>
--- <p>> <br/>
--- > While calling TreeProcessor.process(environment), i.e. the method that
+++ <ul>
+++ <li>The method setupRootNode is called (if necessary) and the
builderSelector is
+++ used to get a TreeBuilder (builder). The build method on the builder is
called
+++ with the sitemap as argument and a tree of ProcessingNodes corresponding to
the
+++ sitemap is returned.</li>
+++ <li>The sitemap is then executed by calling the invoke method for the root
node.
+++ </li>
+++ </ul>
+++
+++ <p>Within the DefaultTreeBuilder (during execution of the build method) a
+++ RoleManager is set up based on the "roles" section of
+++ "treeprocessor-builtins.xml" and an ExtendedComponentSelector is set up
based on
+++ the "nodes" section. The "nodes" section associates the sitemap concepts to
the
+++ appropriate ProcessingNodeBuilders. It also configures a
ProcessingNodeBuilder
+++ so that it knows what type of children it is allowed to have and which ones
are
+++ forbidden.</p>
+++
+++ <p>The build process starts (in the method createTree) by creating the
+++ ProcessingNodeBuilder (rootBuilder) that corresponds to the root element in
the
+++ sitemap, associating the rootBuilder with the current TreeBuilder and calling the
+++ rootBuilder.buildNode method with the configuration tree created from the
+++ sitemap.</p>
+++
+++ <p>The FooNodeBuilder.buildNode method creates and returns a FooNode object
<br/>
--- > takes the environment, applies the sitemap on it and produces the
output,
--- <br/>
--- > the following things happen:<br/>
--- > <br/>
--- > * The method setupRootNode is called (if necesary) and the<br/>
--- > builderSelector is used to get a TreeBuilder (builder). The build
method
--- <br/>
--- > on the builder is called with the sitemap as argument and a tree of
<br/>
--- > ProcessingNodes corresponding to the sitemap is returned.<br/>
--- > <br/>
--- > * The sitemap is then executed by calling the invoke method for the
root
--- <br/>
--- > node.<br/>
--- > <br/>
--- > Building the tree<br/>
--- > -----------------<br/>
--- > <br/>
--- > In Cocoon using "treeprocessor-builtins.xml" SitemapLanguage that
extends
--- <br/>
--- > DefaultTreeBuilder is used as TreeBuilder. Within the<br/>
--- > DefaultTreeBuilder (during execution of the build method) a RoleManager
--- <br/>
--- > is set up based on the "roles" section of "treeprocessor-builtins.xml"
--- <br/>
--- > and a ExtendedComponentSelector is set up based on the "nodes" section.
--- <br/>
--- > The "nodes" section associates the sitemap concepts to the appropriate
--- <br/>
--- > ProcessingNodeBuilders. It also configures a ProcessingNodeBuilder so
--- <br/>
--- > that it knows what type of children it is allowed to have and which
ones
--- <br/>
--- > that are forbidden.<br/>
--- > <br/>
--- > The build process starts (in the method createTree) by creating the
<br/>
--- > ProcessingNodeBuilder (rootBuilder) that corresponds to the root
element
--- <br/>
--- > in the sitemap, associate the rootBuilder to the current TreeBuilder
and
--- <br/>
--- > call the rootBuilder.buildNode method with the configuration tree
created
--- <br/>
--- > from the sitemap.<br/>
--- > <br/>
--- > The FooNodeBuilder.buildNode method creates and returns a FooNode
object
--- <br/>
> and recursively creates the child nodes of the object by creating
and<br/>
> executing the corresponding builder objects.<br/>
> <br/>
(7 equal lines skipped)
> stored in the context object (other things happen as well). When a
<br/>
> SerializeNode is invoked, the current Pipeline is processed and the
<br/>
> output is stored in the environment.<br/>
--- > <br/>
--- > ----------------------------------<br/>
--- > <br/>
--- > <sidenote><br/>
--- > I builded a Cocoon inspired signal processing framework about a year
ago
--- <br/>
--- > and tried to reuse Sylvain's framework. While most of it is very<br/>
--- > general, there are some Cocoon specific details in the Context and
<br/>
--- > Environment interfaces, so I ended up in building something similar but
--- <br/>
--- > simpler instead.<br/>
--- > </sidenote><br/>
--- > <br/>
--- > HTH<br/>
--- > <br/>
--- > /Daniel<br/>
--- > <br/>
></p>
--- <p>Nice explanation, Daniel! I'm happy to see that other people understand
--- <br/>
--- this.</p>
---
--- <p>However, I'd like to add some background to this to explain why it does
--- <br/>
--- work this way, some additional details and what we could eventually <br/>
--- refactor to ease the migration to Fortress.</p>
---
--- <p>I started the TreeProcessor for two reasons.</p>
---
--- <p>The first reason was that the sitemap engine at that time was compiled
<br/>
--- into a Java class like XSP. But the sitemap logicsheet was very complex
<br/>
--- and recompiling a large sitemap took ages (more than 20 seconds on the <br/>
--- samples sitemap), leading to painful try/fail cycles. We needed <br/>
--- something faster.</p>
---
--- <p>The second reason was that at that time (autumn 2001), a number of RTs
<br/>
--- were written related to what we called "flowmaps" and later led to <br/>
--- flowscript. These RTs were describing new ways to build a pipeline to <br/>
--- take flow into account, but no real code was written to test these <br/>
--- ideas, because deeply changing the way the sitemap code was generated <br/>
--- was very painful: finding its way into the 2000-lines XSLT was not easy.</p>
---
--- <p>So I decided to consider another approach, based on an evaluation tree
<br/>
--- (hence TreeProcessor), each node in the tree corresponding to a xxxmap <br/>
--- instruction (sitemap or flowmap).</p>
---
--- <p>An additional motivation for me was that it would require me to heavily
--- <br/>
--- use the Avalon concepts and therefore increase my knowledge in this <br/>
--- area. This was mostly written at home, and my wife deserves many thanks,
<br/>
--- because this thing took my brain day and night for more than 2 months
;-)</p>
---
--- <p>The main idea of the TreeProcessor is that each kind of instruction <br/>
--- (e.g. <map:act>, <map:generate>, etc) is described by two
classes :
--- <br/>
--- - a ProcessingNode, the runtime object that will execute the
instruction,<br/>
--- - a ProcessingNodeBuilder, responsible for creating the ProcessingNode <br/>
--- with the appropriate data and/or childnodes, extracted from attributes,
<br/>
--- child elements, etc.</p>
---
--- <p>Implementing the sitemap language then translates into writing the <br/>
--- appropriate ProcessingNodeBuilder classes for all statements of the <br/>
--- language. But since we were discussing flowmaps and other pipeline <br/>
--- construction approaches, I wanted this to be easily extensible, and even
<br/>
--- allow the simultaneous use of different languages in the system <br/>
--- (sitemap/flowmap). This is why <map:mount> supports an additional
<br/>
--- undocumented and never used "language" attribute (see MountNodeBuilder)</p>
---
--- <p>So the TreeProcessor configuration contains the definition of <br/>
--- TreeBuilder implementations for various "languages", the sitemap being <br/>
--- the only one we have today. The whole configuration document is actually
<br/>
--- a ComponentSelector for TreeBuilder implementations. The SitemapLanguage
<br/>
--- class is the implementation of TreeBuilder for the sitemap language. A <br/>
--- TreeBuilder builds a processing node tree based on a file (e.g. <br/>
--- sitemap.xmap) that is read in an Avalon configuration (this was chosen <br/>
--- for its ease of use compared to raw DOM).</p>
---
--- <p><fortress-migration><br/>
--- Obviously, this initial selector can be removed and the sitemap language
<br/>
--- be the only one available, as we now have the flowscript and it's very <br/>
--- unlikely that we will redesign a new pipeline language in the near (or <br/>
--- even distant) future.<br/>
--- </fortress-migration></p>
---
--- <p>Roles, selectors and <map:components><br/>
--- -------------------------------------</p>
---
--- <p>The <map:components> section of a sitemap is used to configure a
<br/>
--- ComponentManager (child of either the parent sitemap's manager or the <br/>
--- main manager), and the <roles> section of the TreeProcessor <br/>
--- configuration defines a RoleSelector that is used by this manager. For <br/>
--- the sitemap, it defines the shorthands that will map <map:generators>,
--- <br/>
--- <map:selectors>, etc, to a special "ComponentsSelector" (yeah, the
name
--- <br/>
--- could be better).</p>
---
<p>This ComponentsSelector handles the <map:components> syntax ("src"
and
<br/>
not "class", etc), and holds the "default" attribute, view labels and <br/>
mime types for each hint (these are not known by the components
themselves).</p>
--- <p><fortress-migration><br/>
--- AFAIU, Fortress allows defaults for a collection of components <br/>
--- implementing the same role, but I don't know how we can handle the <br/>
--- additional "label" and "mime-type", which are not handled by the <br/>
--- component itself.</p>
---
--- <p>Can we imagine a "fake" selector that route calls to select() to the
<br/>
--- manager and handle these additional information on its own?<br/>
--- </fortress-migration></p>
---
<p>Building the processing tree<br/>
----------------------------</p>
(33 equal lines skipped)
collected in a list that the TreeProcessor traverses when needed <br/>
(sitemap change or system disposal).</p>
--- <p>Great care has been taken to cleanly separate build-time and run-time
<br/>
--- code and data, to ensure the smallest memory occupation and the fastest
<br/>
--- possible execution. This led this intepreted engine to be a bit faster <br/>
--- at runtime than the compiled one (build time is more than 20 times
faster).</p>
+++ <p>Great care has been taken to cleanly separate build-time and run-time
code
+++ and data, to ensure the smallest memory occupation and the fastest possible
+++ execution. This makes the interpreted engine a bit faster at runtime
than
+++ the compiled one (build time is more than 20 times faster).</p>
--- <p><fortress-migration><br/>
--- An optimisation that is done and may be relevant to migration to <br/>
--- Fortress is that ThreadSafe components are looked up as part of the tree
<br/>
--- building and never looked up again later (see e.g. MatchNode). AFAIU, <br/>
--- lifestyle interface no more exist with Fortress, so this optimisation <br/>
--- may be difficult to do, if not impossible.<br/>
--- </fortress-migration></p>
+++ <h2>Phase 3: Create the pipeline</h2>
--- <p>Building a pipeline<br/>
--- -------------------</p>
+++ <p>When a request has to be processed, the TreeProcessor calls invoke() on
the
+++ root node of the evaluation tree. This method has two parameters: <br/>
+++ the environment defining the request, and an InvokeContext that mainly
holds the
+++ pipeline that is being built and the stack of sitemap variables.</p>
--- <p>When a request has to be processed, the TreeProcessor calls invoke() on
--- <br/>
--- the root node of the evaluation tree. This method has two parameters: <br/>
--- the environment defining the request, and an InvokeContext that mainly <br/>
--- holds the pipeline that is being built and the stack of sitemap
variables.</p>
---
--- <p>The invoke method executes all processing nodes (depth first) until one
--- <br/>
--- them returns "true", meaning that a pipeline was successfully built. <br/>
+++ <p>The invoke method executes all processing nodes (depth first) until one
of them
+++ returns "true", meaning that a pipeline was successfully built. <br/>
Examples of nodes that return true are serializers, readers and
redirect.</p>
--- <p>If the environment is external, the pipeline is executed as soon as it
<br/>
--- is ended (i.e. in the reader or serializer node). But if the environment
<br/>
--- is internal (i.e. a "cocoon:" source), it is not, meaning the pipeline <br/>
--- is returned to the SitemapSource, ready for later execution if requested
<br/>
+++ <p>If the environment is external, the pipeline is executed as soon as it is
+++ ended (i.e. in the reader or serializer node). But if the environment <br/>
+++ is internal (i.e. a "cocoon:" source), it is not, meaning the pipeline is
+++ returned to the SitemapSource, ready for later execution if requested <br/>
so (e.g. by a Source.getInputStream()).</p>
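Editorial illustration (not part of the recorded diff): the invocation behaviour described above can be sketched as follows. This is a simplified model, not the Cocoon classes. `Environment` and `InvokeContext` here are hypothetical stand-ins that keep only an external flag, an output buffer and the list of pipeline steps.

```java
import java.util.ArrayList;
import java.util.List;

class InvokeSketch {

    /** Hypothetical stand-in for the request environment. */
    static final class Environment {
        final boolean external;
        final StringBuilder output = new StringBuilder();

        Environment(boolean external) {
            this.external = external;
        }
    }

    /** Simplified InvokeContext: holds only the pipeline that is being built. */
    static final class InvokeContext {
        final List<String> pipeline = new ArrayList<>();

        /** "Executes" the pipeline by writing its steps to the environment. */
        void execute(Environment env) {
            env.output.append(String.join(" -> ", pipeline));
        }
    }

    interface ProcessingNode {
        /** Returns true once a complete pipeline has been built. */
        boolean invoke(Environment env, InvokeContext context);
    }

    /** A generator or transformer: adds itself to the pipeline and lets processing continue. */
    static ProcessingNode step(String name) {
        return (env, context) -> {
            context.pipeline.add(name);
            return false;
        };
    }

    /** A serializer: ends the pipeline and executes it only for external environments. */
    static ProcessingNode serializer(String name) {
        return (env, context) -> {
            context.pipeline.add(name);
            if (env.external) {
                context.execute(env);
            }
            return true;
        };
    }

    /** A container: invokes its children in order and stops at the first one returning true. */
    static ProcessingNode container(ProcessingNode... children) {
        return (env, context) -> {
            for (ProcessingNode child : children) {
                if (child.invoke(env, context)) {
                    return true;
                }
            }
            return false;
        };
    }

    /** Runs a two-branch tree; the second branch is never reached. */
    static String run(boolean external) {
        ProcessingNode root = container(
                container(step("generate"), step("transform"), serializer("serialize")),
                container(step("never-reached"), serializer("never-reached")));
        Environment env = new Environment(external);
        InvokeContext context = new InvokeContext();
        boolean built = root.invoke(env, context);
        return built + ":" + context.pipeline.size() + ":" + env.output;
    }

    public static void main(String[] args) {
        System.out.println(run(true));  // prints true:3:generate -> transform -> serialize
        System.out.println(run(false)); // prints true:3:
    }
}
```

For the internal case the pipeline is built but produces no output, which corresponds to handing the finished pipeline back to the caller for later execution.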
--- <p>Phew... I finally explained the whole thing in depth. I'm no more the
<br/>
--- only one to know ;-)<br/>
--- I'll also put this into the wiki.</p>
---
</body>
</html>
Fields
======
no changes
Links
=====
no changes
Custom Fields
=============
no changes
Collections
===========
no changes