A document has been updated:
http://cocoon.zones.apache.org/daisy/documentation/732.html
Document ID: 732
Branch: main
Language: default
Name: Cocoon Sitemap internals (unchanged)
Document Type: Document (unchanged)
Updated on: 10/6/05 9:03:37 AM
Updated by: Helma van der Linden
A new version has been created, state: publish
Parts
=====
Content
-------
This part has been updated.
Mime type: text/xml (unchanged)
File name: (unchanged)
Size: 15623 bytes (previous version: 3528 bytes)
Content diff:
(5 equal lines skipped)
- links to apidocs of classes mentioned</p>
<p>This information pertains to both Cocoon 2.1 and Cocoon 2.2, although the
--- classnames mentioned are Cocoon 2.2.</p>
+++ classnames mentioned are Cocoon 2.2. The process will be described in
general
+++ first. This allows users to get a general grasp of how the process works.
After
+++ that the process is described in more detail including classnames.</p>
<h1>Sitemap processing</h1>
(30 equal lines skipped)
request. This results in a single path through the tree with only a
generator,
transformers and a serializer as components.</p>
--- <h2>Phase 3: Executing the pipeline</h2>
+++ <h2>Phase 3: Execute the pipeline</h2>
<p>Once the pipeline is built in the previous phase, its execution is
invoked by
calling generator.generate().</p>
(37 equal lines skipped)
<p><img src="daisy:737"/></p>
+++ <h1>In-depth explanation</h1>
+++
+++ <p>We will now go over the process again but this time in more detail.</p>
+++
+++ <h2>Phase 1: Build the sitemap tree</h2>
+++
+++ <p>The TreeProcessor is configured as the implementation of the Processor
+++ role in the cocoon.roles file.
+++ <br/>
+++ During the configuration of the TreeProcessor an ExtendedComponentSelector
+++ (builderSelector) is set up using the configuration file
+++ "treeprocessor-builtins.xml".</p>
+++
+++ <p>> <br/>
+++ > While calling TreeProcessor.process(environment), i.e. the method that
+++ <br/>
+++ > takes the environment, applies the sitemap on it and produces the
output,
+++ <br/>
+++ > the following things happen:<br/>
+++ > <br/>
+++ > * The method setupRootNode is called (if necessary) and the<br/>
+++ > builderSelector is used to get a TreeBuilder (builder). The build
method
+++ <br/>
+++ > on the builder is called with the sitemap as argument and a tree of
<br/>
+++ > ProcessingNodes corresponding to the sitemap is returned.<br/>
+++ > <br/>
+++ > * The sitemap is then executed by calling the invoke method for the
root
+++ <br/>
+++ > node.<br/>
+++ > <br/>
+++ > Building the tree<br/>
+++ > -----------------<br/>
+++ > <br/>
+++ > With "treeprocessor-builtins.xml", Cocoon uses SitemapLanguage, which
+++ <br/>
+++ > extends DefaultTreeBuilder, as the TreeBuilder. Within the<br/>
+++ > DefaultTreeBuilder (during execution of the build method) a RoleManager
+++ <br/>
+++ > is set up based on the "roles" section of "treeprocessor-builtins.xml"
+++ <br/>
+++ > and an ExtendedComponentSelector is set up based on the "nodes" section.
+++ <br/>
+++ > The "nodes" section associates the sitemap concepts to the appropriate
+++ <br/>
+++ > ProcessingNodeBuilders. It also configures a ProcessingNodeBuilder so
+++ <br/>
+++ > that it knows which types of children it is allowed to have and which
+++ <br/>
+++ > ones are forbidden.<br/>
+++ > <br/>
+++ > The build process starts (in the method createTree) by creating the
<br/>
+++ > ProcessingNodeBuilder (rootBuilder) that corresponds to the root
element
+++ <br/>
+++ > in the sitemap, associating the rootBuilder with the current TreeBuilder
+++ <br/>
+++ > and calling the rootBuilder.buildNode method with the configuration tree
+++ <br/>
+++ > created from the sitemap.<br/>
+++ > <br/>
+++ > The FooNodeBuilder.buildNode method creates and returns a FooNode
object
+++ <br/>
+++ > and recursively creates the child nodes of the object by creating
and<br/>
+++ > executing the corresponding builder objects.<br/>
+++ > <br/>
+++ > Executing the tree<br/>
+++ > ------------------<br/>
+++ > <br/>
+++ > While (recursively) executing the invoke(environment, context) method
for
+++ <br/>
+++ > the node objects in the tree, a Pipeline object is constructed that is
+++ <br/>
+++ > stored in the context object (other things happen as well). When a
<br/>
+++ > SerializeNode is invoked, the current Pipeline is processed and the
<br/>
+++ > output is stored in the environment.<br/>
+++ > <br/>
+++ > ----------------------------------<br/>
+++ > <br/>
+++ > <sidenote><br/>
+++ > I built a Cocoon-inspired signal processing framework about a year
ago
+++ <br/>
+++ > and tried to reuse Sylvain's framework. While most of it is very<br/>
+++ > general, there are some Cocoon specific details in the Context and
<br/>
+++ > Environment interfaces, so I ended up building something similar but
+++ <br/>
+++ > simpler instead.<br/>
+++ > </sidenote><br/>
+++ > <br/>
+++ > HTH<br/>
+++ > <br/>
+++ > /Daniel<br/>
+++ > <br/>
+++ ></p>
+++
+++ <p>Nice explanation, Daniel! I'm happy to see that other people understand
+++ <br/>
+++ this.</p>
+++
+++ <p>However, I'd like to add some background to explain why it works
+++ <br/>
+++ this way, give some additional details and point out what we could eventually <br/>
+++ refactor to ease the migration to Fortress.</p>
+++
+++ <p>I started the TreeProcessor for two reasons.</p>
+++
+++ <p>The first reason was that the sitemap engine at that time was compiled
<br/>
+++ into a Java class like XSP. But the sitemap logicsheet was very complex
<br/>
+++ and recompiling a large sitemap took ages (more than 20 seconds on the <br/>
+++ samples sitemap), leading to painful try/fail cycles. We needed <br/>
+++ something faster.</p>
+++
+++ <p>The second reason was that at that time (autumn 2001), a number of RTs
<br/>
+++ were written related to what we called "flowmaps" and later led to <br/>
+++ flowscript. These RTs were describing new ways to build a pipeline to <br/>
+++ take flow into account, but no real code was written to test these <br/>
+++ ideas, because deeply changing the way the sitemap code was generated <br/>
+++ was very painful: finding one's way around the 2000-line XSLT was not easy.</p>
+++
+++ <p>So I decided to consider another approach, based on an evaluation tree
<br/>
+++ (hence TreeProcessor), each node in the tree corresponding to a xxxmap <br/>
+++ instruction (sitemap or flowmap).</p>
+++
+++ <p>An additional motivation for me was that it would require me to heavily
+++ <br/>
+++ use the Avalon concepts and therefore increase my knowledge in this <br/>
+++ area. This was mostly written at home, and my wife deserves many thanks,
<br/>
+++ because this thing took my brain day and night for more than 2 months
;-)</p>
+++
+++ <p>The main idea of the TreeProcessor is that each kind of instruction <br/>
+++ (e.g. <map:act>, <map:generate>, etc) is described by two
classes :
+++ <br/>
+++ - a ProcessingNode, the runtime object that will execute the
instruction,<br/>
+++ - a ProcessingNodeBuilder, responsible for creating the ProcessingNode <br/>
+++ with the appropriate data and/or childnodes, extracted from attributes,
<br/>
+++ child elements, etc.</p>
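The node/builder split described above can be sketched as follows. This is a minimal illustration, not the Cocoon source: SketchNode, SketchNodeBuilder, SketchElement and the two Generate* classes are hypothetical stand-ins, and the parsed sitemap element is modelled as a plain attribute map instead of an Avalon Configuration.

```java
import java.util.Map;

// Hypothetical stand-in for the runtime side (not the real ProcessingNode).
interface SketchNode {
    // Returns true once this node has finished handling the request.
    boolean invoke(StringBuilder trace);
}

// Hypothetical stand-in for the build-time side (not the real ProcessingNodeBuilder).
interface SketchNodeBuilder {
    // Turns one parsed sitemap element into its runtime node.
    SketchNode buildNode(SketchElement element);
}

// Assumed minimal model of a parsed sitemap element: a name plus attributes.
final class SketchElement {
    final String name;
    final Map<String, String> attributes;

    SketchElement(String name, Map<String, String> attributes) {
        this.name = name;
        this.attributes = attributes;
    }
}

// Runtime half: holds only the data it needs to execute the instruction.
final class GenerateSketchNode implements SketchNode {
    private final String src;

    GenerateSketchNode(String src) {
        this.src = src;
    }

    @Override
    public boolean invoke(StringBuilder trace) {
        trace.append("generate:").append(src);
        return false; // a generator alone does not complete a pipeline
    }
}

// Build-time half: extracts data from the attributes and creates the node.
final class GenerateSketchNodeBuilder implements SketchNodeBuilder {
    @Override
    public SketchNode buildNode(SketchElement element) {
        return new GenerateSketchNode(element.attributes.get("src"));
    }
}
```

Once the node exists the builder is no longer needed, which is what makes it possible to discard build-time objects after the tree is complete.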
+++
+++ <p>Implementing the sitemap language then translates into writing the <br/>
+++ appropriate ProcessingNodeBuilder classes for all statements of the <br/>
+++ language. But since we were discussing flowmaps and other pipeline <br/>
+++ construction approaches, I wanted this to be easily extensible, and even
<br/>
+++ allow the simultaneous use of different languages in the system <br/>
+++ (sitemap/flowmap). This is why <map:mount> supports an additional
<br/>
+++ undocumented and never used "language" attribute (see MountNodeBuilder).</p>
+++
+++ <p>So the TreeProcessor configuration contains the definition of <br/>
+++ TreeBuilder implementations for various "languages", the sitemap being <br/>
+++ the only one we have today. The whole configuration document is actually
<br/>
+++ a ComponentSelector for TreeBuilder implementations. The SitemapLanguage
<br/>
+++ class is the implementation of TreeBuilder for the sitemap language. A <br/>
+++ TreeBuilder builds a processing node tree based on a file (e.g. <br/>
+++ sitemap.xmap) that is read in an Avalon configuration (this was chosen <br/>
+++ for its ease of use compared to raw DOM).</p>
+++
+++ <p><fortress-migration><br/>
+++ Obviously, this initial selector can be removed and the sitemap language
<br/>
+++ be the only one available, as we now have the flowscript and it's very <br/>
+++ unlikely that we will redesign a new pipeline language in the near (or <br/>
+++ even distant) future.<br/>
+++ </fortress-migration></p>
+++
+++ <p>Roles, selectors and <map:components><br/>
+++ -------------------------------------</p>
+++
+++ <p>The <map:components> section of a sitemap is used to configure a
<br/>
+++ ComponentManager (child of either the parent sitemap's manager or the <br/>
+++ main manager), and the <roles> section of the TreeProcessor <br/>
+++ configuration defines a RoleSelector that is used by this manager. For <br/>
+++ the sitemap, it defines the shorthands that will map <map:generators>,
+++ <br/>
+++ <map:selectors>, etc, to a special "ComponentsSelector" (yeah, the
name
+++ <br/>
+++ could be better).</p>
+++
+++ <p>This ComponentsSelector handles the <map:components> syntax ("src"
and
+++ <br/>
+++ not "class", etc), and holds the "default" attribute, view labels and <br/>
+++ mime types for each hint (these are not known by the components
themselves).</p>
+++
+++ <p><fortress-migration><br/>
+++ AFAIU, Fortress allows defaults for a collection of components <br/>
+++ implementing the same role, but I don't know how we can handle the <br/>
+++ additional "label" and "mime-type", which are not handled by the <br/>
+++ component itself.</p>
+++
+++ <p>Can we imagine a "fake" selector that routes calls to select() to the
<br/>
+++ manager and handles this additional information on its own?<br/>
+++ </fortress-migration></p>
+++
+++ <p>Building the processing tree<br/>
+++ ----------------------------</p>
+++
+++ <p>The second section in a language configuration, <nodes>, defines a
+++ <br/>
+++ ComponentSelector for ProcessingNodeBuilders. For each element <br/>
+++ encountered in the sitemap source file, the corresponding node builder <br/>
+++ is fetched from this selector with the local name of the element as the
<br/>
+++ selection hint, i.e. <map:act> will lead to
selector.select("act").</p>
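A rough sketch of that lookup, under stated assumptions: BuilderSelectorSketch is a hypothetical map-backed stand-in for the real ComponentSelector, and it returns builder names as strings so that the example stays self-contained.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for the ComponentSelector of node builders.
final class BuilderSelectorSketch {
    private final Map<String, String> buildersByHint = new HashMap<>();

    // Corresponds loosely to one entry of the "nodes" configuration section.
    void register(String hint, String builderName) {
        buildersByHint.put(hint, builderName);
    }

    // Looks up the builder registered for a selection hint.
    String select(String hint) {
        String builderName = buildersByHint.get(hint);
        if (builderName == null) {
            throw new IllegalArgumentException("No node builder for hint '" + hint + "'");
        }
        return builderName;
    }

    // Strips the namespace prefix: "map:act" becomes "act".
    static String localName(String qualifiedName) {
        int colon = qualifiedName.indexOf(':');
        return colon < 0 ? qualifiedName : qualifiedName.substring(colon + 1);
    }

    // The local name of the sitemap element is used as the selection hint.
    String builderFor(String qualifiedName) {
        return select(localName(qualifiedName));
    }
}
```

An element with no registered builder fails fast here; how the real selector reports that case is not covered by this sketch.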
+++
+++ <p>The content of each <node> element is the specific Avalon
+++ configuration <br/>
+++ of the corresponding ProcessingNodeBuilder and mostly defines the allowed
<br/>
+++ child statements.</p>
+++
+++ <p>Now a sitemap is not a tree, but a graph because of resources and views
+++ <br/>
+++ that can be called from any point in the sitemap. To handle this, <br/>
+++ building the processing tree follows two phases:<br/>
+++ - the whole node tree is built, and nodes that other nodes can link (or
<br/>
+++ jump) to are registered in the common TreeBuilder by their respective <br/>
+++ node builders (see TreeBuilder.registerNode()).<br/>
+++ - then those node builders that implement <br/>
+++ LinkedProcessingNodeBuilder are asked to link their node, which they do by <br/>
+++ fetching the appropriate node registered in the first phase.</p>
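The two phases can be illustrated with a small sketch. All class names here (ResourceSketch, CallSketch, TwoPhaseBuilderSketch) are hypothetical, and the real TreeBuilder does considerably more; the point is only that a call may appear before the resource it targets, so the link is resolved after the whole tree exists.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical node that other nodes can jump to (e.g. a resource).
final class ResourceSketch {
    final String name;

    ResourceSketch(String name) {
        this.name = name;
    }
}

// Hypothetical node that refers to another node by name.
final class CallSketch {
    final String targetName;
    ResourceSketch target; // stays null until the link phase has run

    CallSketch(String targetName) {
        this.targetName = targetName;
    }
}

final class TwoPhaseBuilderSketch {
    private final Map<String, ResourceSketch> registered = new HashMap<>();
    private final List<CallSketch> pendingLinks = new ArrayList<>();

    // Phase 1: build a linkable node and register it under its name.
    ResourceSketch buildResource(String name) {
        ResourceSketch resource = new ResourceSketch(name);
        registered.put(name, resource);
        return resource;
    }

    // Phase 1: build a calling node; its target may not exist yet.
    CallSketch buildCall(String targetName) {
        CallSketch call = new CallSketch(targetName);
        pendingLinks.add(call);
        return call;
    }

    // Phase 2: resolve every pending call against the registered nodes.
    void linkAll() {
        for (CallSketch call : pendingLinks) {
            ResourceSketch target = registered.get(call.targetName);
            if (target == null) {
                throw new IllegalStateException("Unknown resource '" + call.targetName + "'");
            }
            call.target = target;
        }
        pendingLinks.clear();
    }
}
```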
+++
+++ <p>We then obtain an evaluation tree (in reality a graph) that is ready for
+++ <br/>
+++ use. All build-time related components are then released.</p>
+++
+++ <p>Note also that a ProcessingNode is considered a <br/>
+++ "non-managed component": with the help of the LifecycleHelper class, the
<br/>
+++ TreeBuilder honours any of the Avalon lifecycle interfaces that a node <br/>
+++ implements. This is required as many nodes require access to the <br/>
+++ component selectors defined by <map:components>. Disposable nodes are
+++ <br/>
+++ collected in a list that the TreeProcessor traverses when needed <br/>
+++ (sitemap change or system disposal).</p>
+++
+++ <p>Great care has been taken to cleanly separate build-time and run-time
<br/>
+++ code and data, to ensure the smallest memory occupation and the fastest
<br/>
+++ possible execution. This led this interpreted engine to be a bit faster <br/>
+++ at runtime than the compiled one (build time is more than 20 times
faster).</p>
+++
+++ <p><fortress-migration><br/>
+++ An optimisation that is done and may be relevant to migration to <br/>
+++ Fortress is that ThreadSafe components are looked up as part of the tree
<br/>
+++ building and never looked up again later (see e.g. MatchNode). AFAIU, <br/>
+++ lifestyle interfaces no longer exist in Fortress, so this optimisation <br/>
+++ may be difficult to do, if not impossible.<br/>
+++ </fortress-migration></p>
+++
+++ <p>Building a pipeline<br/>
+++ -------------------</p>
+++
+++ <p>When a request has to be processed, the TreeProcessor calls invoke() on
+++ <br/>
+++ the root node of the evaluation tree. This method has two parameters: <br/>
+++ the environment defining the request, and an InvokeContext that mainly <br/>
+++ holds the pipeline that is being built and the stack of sitemap
variables.</p>
+++
+++ <p>The invoke method executes all processing nodes (depth first) until one
+++ <br/>
+++ of them returns "true", meaning that a pipeline was successfully built. <br/>
+++ Examples of nodes that return true are serializers, readers and
redirects.</p>
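The depth-first traversal with early exit can be sketched like this. It is an illustration under assumptions: InvokeContextSketch, InvokableSketch, StepSketch and ContainerSketch are hypothetical, the environment parameter is left out, and matching is reduced to a fixed boolean.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical stand-in for InvokeContext: only holds the pipeline being built.
final class InvokeContextSketch {
    final List<String> pipeline = new ArrayList<>();
}

interface InvokableSketch {
    // Returns true when a complete pipeline has been built.
    boolean invoke(InvokeContextSketch context);
}

// Leaf node: adds one component; only terminal steps (e.g. a serializer) return true.
final class StepSketch implements InvokableSketch {
    private final String label;
    private final boolean endsPipeline;

    StepSketch(String label, boolean endsPipeline) {
        this.label = label;
        this.endsPipeline = endsPipeline;
    }

    @Override
    public boolean invoke(InvokeContextSketch context) {
        context.pipeline.add(label);
        return endsPipeline;
    }
}

// Container node (e.g. a matcher): visits its children in order, depth first.
final class ContainerSketch implements InvokableSketch {
    private final boolean matches;
    private final List<InvokableSketch> children;

    ContainerSketch(boolean matches, InvokableSketch... children) {
        this.matches = matches;
        this.children = Arrays.asList(children);
    }

    @Override
    public boolean invoke(InvokeContextSketch context) {
        if (!matches) {
            return false; // children are skipped, the next sibling is tried
        }
        for (InvokableSketch child : children) {
            if (child.invoke(context)) {
                return true; // stop as soon as one child completes the pipeline
            }
        }
        return false;
    }
}
```

In this sketch a branch whose container does not match contributes nothing to the pipeline, and siblings after the first successful branch are never visited.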
+++
+++ <p>If the environment is external, the pipeline is executed as soon as it
<br/>
+++ is complete (i.e. in the reader or serializer node). But if the environment
<br/>
+++ is internal (i.e. a "cocoon:" source), it is not, meaning the pipeline <br/>
+++ is returned to the SitemapSource, ready for later execution if so
<br/>
+++ requested (e.g. by a Source.getInputStream()).</p>
+++
+++ <p>Phew... I finally explained the whole thing in depth. I'm no longer the
<br/>
+++ only one who knows ;-)<br/>
+++ I'll also put this into the wiki.</p>
+++
</body>
</html>
Fields
======
no changes
Links
=====
no changes
Custom Fields
=============
no changes
Collections
===========
no changes