stefano     02/02/23 15:28:15

Modified:    src/documentation/xdocs book.xml index.xml license.xml tutorial.xml
Added:       src/documentation/xdocs index-old.xml introduction.xml
             src/documentation/xdocs/drafts newtoc.txt
Removed:     src/documentation/xdocs 3rdparty.xml emotional-landscapes.xml
             newtoc.txt patches.xml
Log:
refreshed docs a little:
 - simple index page
 - added 'download' on the side bar
 - refactored sidebar
 - removed 'Patch queue'
 - removed 'third-party'
 - added 'introduction' based on emotional-landscapes
 - and a few other things

Revision  Changes    Path
1.5       +25 -24    xml-cocoon2/src/documentation/xdocs/book.xml

Index: book.xml
===================================================================
RCS file: /home/cvs/xml-cocoon2/src/documentation/xdocs/book.xml,v
retrieving revision 1.4
retrieving revision 1.5
diff -u -r1.4 -r1.5
--- book.xml 19 Jan 2002 06:13:39 -0000 1.4
+++ book.xml 23 Feb 2002 23:28:15 -0000 1.5
@@ -3,40 +3,41 @@
 <book software="Apache Cocoon"
       title="Apache Cocoon Documentation"
-      copyright="@year@ The Apache Software Foundation"
+      copyright="1999-2002 The Apache Software Foundation"
       xmlns:xlink="http://www.w3.org/1999/xlink">
   <menu label="About">
     <menu-item label="Index" href="index.html"/>
     <menu-item label="License" href="license.html"/>
+    <external label="Download" href="http://xml.apache.org/cocoon/dist/"/>
   </menu>
-  <menu label="Cocoon">
+  <menu label="Documentation">
+    <menu-item label="Introduction" href="introduction.html"/>
     <menu-item label="Installing" href="installing/index.html"/>
-    <menu-item label="Overview" href="overview.html"/>
+    <menu-item label="User Guide" href="userdocs/index.html"/>
+    <menu-item label="Dev Guide" href="developing/index.html"/>
     <menu-item label="Tutorial" href="tutorial.html"/>
-    <menu-item label="cTwIG" href="ctwig/index.html"/>
-    <menu-item label="Users" href="userdocs/index.html"/>
-    <menu-item label="Developers" href="developing/index.html"/>
-    <menu-item label="List of Docs" href="doclist.html"/>
+    <menu-item label="FAQs" href="faq.html"/>
+    <menu-item label="ToC" href="doclist.html"/>
   </menu>
-  <menu label="Links">
-    <external label="XML Links" href="http://dmoz.org/Computers/Data_Formats/Markup_Languages/XML/"/>
+  <menu label="Status">
+    <menu-item label="Changes" href="changes.html"/>
+    <menu-item label="Todo" href="todo.html"/>
   </menu>
-  <menu label="Infos">
-    <menu-item label="Who we are" href="who.html"/>
+  <menu label="Community">
+    <menu-item label="Hall of Fame" href="who.html"/>
     <menu-item label="Contributing" href="contrib.html"/>
-    <menu-item label="3rd Party" href="3rdparty.html"/>
-    <menu-item label="Patch Queue" href="patches.html"/>
+    <menu-item label="Mail Lists" href="mail-lists.html"/>
+    <menu-item label="Mail Archives" href="mail-archives.html"/>
   </menu>
-  <menu label="Status">
-    <menu-item label="FAQ File" href="faq.html"/>
-    <menu-item label="Changes" href="changes.html"/>
-    <menu-item label="Todo" href="todo.html"/>
-<!-- <menu-item label="Planning notes" href="plan/index.html"/> -->
+  <menu label="Project">
+    <external label="Bug Database" href="http://nagoya.apache.org/bugzilla/index.html"/>
+    <external label="Code Repository" href="http://cvs.apache.org/viewcvs.cgi/xml-cocoon2/"/>
+    <external label="Dev Snapshots" href="http://xml.apache.org/from-cvs/xml-cocoon2/"/>
   </menu>
   <menu label="Hosting">
@@ -44,12 +45,12 @@
     <menu-item label="Cocoon Hosting" href="hosting.html"/>
   </menu>
-  <menu label="Project">
-    <external label="Bug Database" href="http://nagoya.apache.org/bugzilla/index.html"/>
-    <external label="Code Repository" href="http://cvs.apache.org/viewcvs.cgi/xml-cocoon2/"/>
-    <external label="Dev Snapshots" href="http://xml.apache.org/from-cvs/xml-cocoon2/"/>
-    <menu-item label="Mail Lists" href="mail-lists.html"/>
-    <menu-item label="Mail Archives" href="mail-archives.html"/>
+  <menu label="Links">
+    <external label="XML Links" href="http://dmoz.org/Computers/Data_Formats/Markup_Languages/XML/"/>
+  </menu>
+
+  <menu label="Cocoon 1.x">
+
<external label="Old Generation" href="http://xml.apache.org/cocoon1/"/> </menu> </book> 1.6 +26 -320 xml-cocoon2/src/documentation/xdocs/index.xml Index: index.xml =================================================================== RCS file: /home/cvs/xml-cocoon2/src/documentation/xdocs/index.xml,v retrieving revision 1.5 retrieving revision 1.6 diff -u -r1.5 -r1.6 --- index.xml 18 Feb 2002 09:33:46 -0000 1.5 +++ index.xml 23 Feb 2002 23:28:15 -0000 1.6 @@ -1,20 +1,17 @@ <?xml version="1.0" encoding="UTF-8"?> <!DOCTYPE document PUBLIC "-//APACHE//DTD Documentation V1.0//EN" "dtd/document-v10.dtd"> - <document> - <header> - <title>Apache Cocoon</title> - <subtitle>XML Publishing Framework</subtitle> - <authors> - <person name="Stefano Mazzocchi" email="[EMAIL PROTECTED]"/> - </authors> - </header> - - <body> - <s1 title="What is it?"> - <figure src="images/cocoon2.gif" alt="The new Cocoon Logo" - width="435" height="39"/> - <p> + <header> + <title>Apache Cocoon</title> + <subtitle>XML Publishing Framework</subtitle> + <authors> + <person name="Stefano Mazzocchi" email="[EMAIL PROTECTED]"/> + </authors> + </header> + <body> + <figure src="images/cocoon.gif" alt="Cocoon"/> + <s1 title="What is this?"> + <p> Apache Cocoon is an XML publishing framework that raises the usage of XML and XSLT technologies for server applications to a new level. Designed for performance and scalability around pipelined SAX @@ -22,8 +19,8 @@ of concerns between content, logic and style. A centralized configuration system and sophisticated caching top this all off and help you to create, deploy and maintain rock-solid XML server applications. - </p> - <p> + </p> + <p> Cocoon interacts with most data sources, including: filesystems, RDBMS, LDAP, native XML databases, and network-based data sources. It adapts content delivery to the capabilities of different devices like HTML, WML, @@ -31,309 +28,18 @@ from a powerful commandline interface. 
The chosen design of an abstracted environment gives you the freedom to implement your own concrete environment to suit your required functionality. - </p> - <p> - This documentation is not complete because documentation is never complete anyway. However the current - release is stable and tested thoroughly and you'll find lots of samples which show and explain - the power of Apache Cocoon @version@. We welcome you into this new world of XML wonders :-) - </p> - <p> - Technologies like Extensible Server Pages (XSP) and the Action framework gives - you all the power to add your own logic into the building process of your resources - and services you want Cocoon to be able to perform. - </p> - </s1> - <s1 title="Where is it?"> - <p> + </p> + </s1> + <s1 title="Where is it?"> + <p> If you want to download the latest release of Apache Cocoon just go to the - <link href="http://xml.apache.org/dist/cocoon">download area.</link> - </p> - <p> - Apache Cocoon @version@ is the latest release of the XML publishing framework. - If you are looking for Cocoon 1 go <link href="http://xml.apache.org/cocoon1/">here</link>. - </p> - </s1> - <s1 title="Introduction"> - <p>The Cocoon Project has gone a long way since its creation on - January 1999. It started as a simple servlet for static XSL styling and became - more and more powerful as new features were added. Unfortunately, design - decisions made early in the project influenced its evolution. Today, some of - those constraints that shaped the project were modified as XML standards have evolved and - solidified. For this reason, those design decisions need to be reconsidered - under this new light.</p> - - <p>While Cocoon started as a small step in the direction of a new - web publishing idea based on better design patterns and reviewed estimations - of management issues, the technology used was not mature enough for tools to - emerge. 
Today, most web engineers consider XML as the key for an improved web - model and web site managers see XML as a way to reduce costs and ease - production.</p> - - <p>In an era where services rather than software will be key for - economic success, a better and less expensive model for web publishing will - be a winner, especially if based on open standards.</p> - </s1> - - <s1 title="Passive APIs vs. Active APIs"> - <p>Web serving environments must be fast and scalable to be - useful. Cocoon 1 was born as a "proof of concept" rather than - production software and had significant design restrictions, based mainly on - the availability of freely redistributable tools. Other issues were lack of - detailed knowledge on the APIs available as well as underestimation of the - project success, being created as a way to learn XSL rather than a full - publishing system capable of taking care of all XML web publishing needs.</p> - - <p>For the above reasons, Cocoon 1 was based on the DOM level 1 - API which is a <em>passive</em> API and was intended mainly for client side - operation. This is mainly due to the fact that most DOM - implementations require the document to reside in memory. While this is - practical for small documents and thus good for the "proof of - concept" stage, it is now considered a main design constraint for Cocoon - scalability.</p> - - <p>Since the goal of Cocoon is the ability to process - simultaneously multiple 100Mb documents in JVM with a few Mbs of heap size, - careful memory use and tuning of internal components is a key issue. To reach - this goal, an improved API model was needed. This is now identified in the SAX - API which is, unlike DOM, event based (so <em>active</em>, in the sense that its - design is based on the <em>inversion of control</em> principle).</p> - - <p>The event model allows document generators to trigger events that get handled - by the various processing stages and finally get - serialized onto the response stream. 
This has a significant impact on both - performance (effective and user perceived) and memory needs:</p> - - <dl> - <dt>Incremental operation</dt> - <dd> - The response is created during document production. - Client's perceived performance is dramatically - improved since clients can start receiving data as soon as it is created, - not after all processing stages have been performed. In those cases where - incremental operation is not possible (for example, element sorting), - internal buffers store the events until the operation can be performed. - However, even in these cases performance can be increased with the use of - tuned memory structures. - </dd> - <dt>Lowered memory consumption</dt> - <dd> - Since most of the - server processing required in Cocoon is incremental, an incremental model - allows XML production events to be transformed directly into output events - and character written on streams, thus avoiding the need to store them in - memory. - </dd> - <dt>Easier scalability</dt> - <dd> - Reduced memory needs allow a greater number of - concurrent operations to take place simultaneously, thus allowing the - publishing system to scale as the load increases. - </dd> - <dt>More optimizable code model</dt> - <dd> - Modern virtual machines are based on the idea of <em>hotspots</em>, - code fragments that are used often and, if optimized, increase the process - execution speed by large amounts. - This new event model allows easier detection of hotspots since it is a - method driven operation, rather than a memory driven one. Hot methods can - be identified earlier and can be better optimized. - </dd> - <dt>Reduced garbage collection</dt> - <dd> - Even the most advanced - and lightweight DOM implementation require at least three to five times - (and sometimes much more than this) more memory than the original document - size. 
This not only reduces the scalability of the operation, but also - impacts overall performance by increasing the amount of memory garbage that - must be collected, tying up CPU cycles. Even if modern - virtual machines have reduced the overhead of garbage collection, - less garbage will always benefit performance and scalability. - </dd> - </dl> - - <p>The above points alone would be enough for the Cocoon - paradigm shift, even if this event based model impacts not only the general - architecture of the publishing system but also its internal processing - components such as XSLT processing and PDF formatting. These components will - require substantial work and maybe design reconsideration to be able to follow - a pure event-based model. The Cocoon Project will work closely with the other - component projects to be able to influence their operation in this direction.</p> -</s1> - -<s1 title="Reactors Reconsidered"> - <p>Another design choice that should be revised is the reactor - pattern that was introduced to allow components to be connected in more - flexible way. In fact, by contrast to the fixed pipe model used up to Cocoon - 1.3.1, the reactor approach allows components to be dynamically connected, - depending on reaction instructions introduced inside the documents.</p> - - <p>While this at first seemed a very advanced and highly - appealing model, it turned out to be a very dangerous approach. The first - concern is mainly technical: porting the reactor pattern under an event-based - model requires limitations and tradeoffs since the generated events must be - cached until a reaction instruction is encountered.</p> - - <p>But even if the technical difficulties could be solved, a key limitation - remains: there is no single point of management.</p> -</s1> - -<s1 title="Management Considerations"> - <p>The web was created to reduce information management costs by - distributing them back on information owners. 
While this model is great for - user communities (scientists, students, employees, or people in general) each - of them managing small amount of personal information, it becomes impractical - for highly centralized information systems where <em>distributed management</em> - is simply not practical.</p> - - <p>While in the HTML web model the page format and URL names - were the only necessary contracts between individuals to create a world wide - web, in more structured information systems the number of contracts increases - by a significant factor due to the need of coherence between the - hosted information: common style, common design issues, common languages, - server side logic integration, data validation, etc...</p> - - <p>It is only under this light that XML and its web model reveal - their power: the HTML web model had too little in the way of contracts to be - able to develop a structured and more coherent distributed information system, - a reason that is mainly imposed by the lack of good and algorithmically certain - information indexing and knowledge seeking systems. Lacks that tend to degrade - the quality of the truly distributed web in favor of more structured web sites - (that based their improved site structure on internal contracts).</p> - - <p>The simplification and engineering of web site management is - considered one of the most important Cocoon goals. 
This is done mainly by - technologically imposing a reduced number of contracts and placing them in a - hierarchical shape, suitable for replacing current high-structure web site - management models.</p> - - <p>The model that Cocoon adopts is the "pyramid model of - web contracts" which is outlined in the picture below</p> - - <figure src="images/pyramid-model.gif" - alt="The Cocoon Pyramid Model of Contracts" - width="313" height="159"/> - - <p>and is composed by four different working contexts (the rectangles)</p> - - <dl> - <dt>Management</dt> - <dd> - The people that decide what the site should - contain, how it should behave and how it should appear - </dd> - <dt>Content</dt> - <dd> - The people responsible for writing, owning and managing - the site content. This context may contain several sub-contexts - - one for each language used to express page content. - </dd> - <dt>Logic</dt> - <dd> - The people responsible for integration with dynamic - content generation technologies and database systems. - </dd> - <dt>Style</dt> - <dd> - The people responsible for information - presentation, look & feel, site graphics and its maintenance. - </dd> - </dl> - - <p>and five contracts (the lines)</p> - - <ul> - <li>management - content</li> - <li>management - logic</li> - <li>management - style</li> - <li>content - logic</li> - <li>content - style</li> - </ul> - - <p>Note that there is no <em>logic - style</em> contract. Cocoon aims to - provide both software and guidelines to allow you to remove such a - contract.</p> -</s1> - -<s1 title="Overlapping contexts and Chain Mapping"> - <p>The above model can be applied only if the different contexts - never overlap, otherwise there is no chance of having a single management - point. For example, if the W3C-recommended method to link stylesheets to XML - documents is used, the content and style contexts overlap and it's impossible - to change the styling behavior of the document without changing it. 
The same - is true for the processing instructions used by the Cocoon 1 reactor to drive - the page processing: each stage specifies the next stage to determine the result, - thus increasing management and debugging complexity. Another overlapping in - context contracts is the need for URL-encoded parameters to drive the page output. - These overlaps break the pyramid model and increase the management costs.</p> - - <p>Starting with Version 2.0, the reactor pattern has been abandoned in favor of - a pipeline mapping technique. This is based on the fact that the number of - different contracts is limited even for big sites and grows with a rate - that is normally much less than its size.</p> - - <p>Also, for performance reasons, Cocoon tries to compile - everything that is possibly compilable (pages/XSP into generators, stylesheets - into transformers, etc...) so, in this new model, the <em>processing chain</em> - that generates the page contains (in a direct executable form) all the - information/logic that handles the requested resource to generate its - response.</p> - - <p>This means that instead of using event-driven request-time DTD interpretation - (done in all Cocoon 1 processors), these are compiled into transformers - directly (XSLT stylesheet compilation) or compiled into generators using - logicsheets and XSP which will remove totally the need for request-time - interpretation solutions like DCP that has been removed.</p> - - <note>Some of these features were already present in latest Cocoon 1.x - releases but now the Cocoon architecture makes them central to its new - core.</note> -</s1> - -<s1 title="Sitemap"> - <p>In Cocoon terminology, a <em>sitemap</em> is the collection of pipeline - matching informations that allow the Cocoon engine to associate the requested - URI to the proper response-producing pipeline.</p> - - <p>The sitemap physically represents the central repository for web site - administration, where the URI space and its handling is 
maintained.</p> - - <p>Please, take a look at the <link href="userdocs/concepts/sitemap.html">sitemap documentation</link> - for more information on this.</p> - -</s1> -<s1 title="Caching"> - <p>The cache system of Cocoon has a very flexible and powerful design. - The algorithms and components used are not hard-wired to the core - of Cocoon. Instead they are dynamically configurable.</p> - <p>The cache system automatically checks for valid cached content and - delivers the valid content directly from the cache without any - pipeline processing.</p> - <p>The issue regarding static file caching that, no matter what, will - always be slower than direct web server caching, means that Cocoon tries - to be as <em>proxy friendly</em> as possible.</p> - <p>To be able to put most of the static part of the job back on the web - server (where it belongs), Cocoon provides a command line - operation, allowing the creation of <em>site makefiles</em> that will - automatically scan the web site and the source documents and will provide a - way to <em>regenerate</em> the static part of a web site (images and tables - included!) based on the same XML model used in the dynamic operation version.</p> - -<!-- Needs rewriting - <p>Cocoon will, in fact, be the integration between Cocoon 1 and Stylebook.</p> - - <p>It will be up to the web server administrator to use static - regeneration capabilities on a time basis, manually or triggered by some - particular event (e.g. database update signal) since Cocoon will only provide - servlet and command line capabilities. 
The nice integration is based on the - fact that there will be no behavioral difference if the files are dynamically - generated in Cocoon via the servlet operation and cached internally or - pre-generated and served directly by the web server, as long as URI contracts - are kept the same by the system administrator (via URL-rewriting or aliasing)</p> - - <p>Also, it will be possible to avoid on-the-fly page and stylesheet - compilation (which makes debugging harder) with command line pre-compilation - hooks that will work like normal compilers from a developer's point of view.</p> ---> -</s1> - </body> + <link href="http://xml.apache.org/cocoon/dist">download area.</link> + </p> + <p> + If you are looking for the past generation of Cocoon (not supported anymore but still available) + go to the <link href="http://xml.apache.org/cocoon1/">Cocoon 1.x area</link>. + </p> + </s1> + <figure src="images/cocoon-built.gif" alt="Built with Apache Cocoon"/> + </body> </document> 1.3 +2 -4 xml-cocoon2/src/documentation/xdocs/license.xml Index: license.xml =================================================================== RCS file: /home/cvs/xml-cocoon2/src/documentation/xdocs/license.xml,v retrieving revision 1.2 retrieving revision 1.3 diff -u -r1.2 -r1.3 --- license.xml 3 Feb 2002 00:35:42 -0000 1.2 +++ license.xml 23 Feb 2002 23:28:15 -0000 1.3 @@ -16,7 +16,7 @@ The Apache Software License, Version 1.1 ============================================================================ - Copyright (C) @year@ The Apache Software Foundation. All rights reserved. + Copyright (C) 1999-2002 The Apache Software Foundation. All rights reserved. 
Redistribution and use in source and binary forms, with or without modifica- tion, are permitted provided that the following conditions are met: @@ -57,9 +57,7 @@ This software consists of voluntary contributions made by many individuals on behalf of the Apache Software Foundation and was originally created by Stefano Mazzocchi <[EMAIL PROTECTED]>. For more information on the Apache - Software Foundation, please see <http://www.apache.org/>. - -]]></source> + Software Foundation, please see <http://www.apache.org/>.]]></source> <p>There are also licenses for additional products that are distributed with Apache Cocoon. Please find those documents in the <code>legal/</code> 1.3 +2 -6 xml-cocoon2/src/documentation/xdocs/tutorial.xml Index: tutorial.xml =================================================================== RCS file: /home/cvs/xml-cocoon2/src/documentation/xdocs/tutorial.xml,v retrieving revision 1.2 retrieving revision 1.3 diff -u -r1.2 -r1.3 --- tutorial.xml 15 Feb 2002 10:27:33 -0000 1.2 +++ tutorial.xml 23 Feb 2002 23:28:15 -0000 1.3 @@ -122,9 +122,7 @@ employees, but each employee can only have one department. 
We will be able to create, change, and delete both employees and departments.</p> <s3 title="The SQL"> -<source> - <![CDATA[ -CREATE TABLE department { +<source><![CDATA[CREATE TABLE department { department_id INT NOT NULL, department_name VARCHAR (64) NOT NULL }; @@ -142,9 +140,7 @@ PRIMARY KEY pkEmployee (employee_id); ALTER TABLE employee ADD - FOREIGN KEY department_id (department.department_id); - ]]> -</source> + FOREIGN KEY department_id (department.department_id);]]></source> </s3> <s3 title="Facilities"> <ol> 1.1 xml-cocoon2/src/documentation/xdocs/index-old.xml Index: index-old.xml =================================================================== <?xml version="1.0" encoding="UTF-8"?> <!DOCTYPE document PUBLIC "-//APACHE//DTD Documentation V1.0//EN" "dtd/document-v10.dtd"> <document> <header> <title>Apache Cocoon</title> <subtitle>XML Publishing Framework</subtitle> <authors> <person name="Stefano Mazzocchi" email="[EMAIL PROTECTED]"/> </authors> </header> <body> <s1 title="What is it?"> <figure src="images/cocoon.gif" alt="Cocoon"/> <p> Apache Cocoon is an XML publishing framework that raises the usage of XML and XSLT technologies for server applications to a new level. Designed for performance and scalability around pipelined SAX processing, Cocoon offers a flexible environment based on the separation of concerns between content, logic and style. A centralized configuration system and sophisticated caching top this all off and help you to create, deploy and maintain rock-solid XML server applications. </p> <p> Cocoon interacts with most data sources, including: filesystems, RDBMS, LDAP, native XML databases, and network-based data sources. It adapts content delivery to the capabilities of different devices like HTML, WML, PDF, SVG, RTF just to name a few. Cocoon currently runs as a Servlet or from a powerful commandline interface. 
The chosen design of an abstracted environment gives you the freedom to implement your own concrete environment to suit your required functionality. </p> <p> This documentation is not complete because documentation is never complete anyway. However the current release is stable and tested thoroughly and you'll find lots of samples which show and explain the power of Apache Cocoon @version@. We welcome you into this new world of XML wonders :-) </p> <p> Technologies like Extensible Server Pages (XSP) and the Action framework gives you all the power to add your own logic into the building process of your resources and services you want Cocoon to be able to perform. </p> </s1> <s1 title="Where is it?"> <p> If you want to download the latest release of Apache Cocoon just go to the <link href="http://xml.apache.org/dist/cocoon">download area.</link> </p> <p> Apache Cocoon @version@ is the latest release of the XML publishing framework. If you are looking for Cocoon 1 go <link href="http://xml.apache.org/cocoon1/">here</link>. </p> </s1> <s1 title="Introduction"> <p>The Cocoon Project has gone a long way since its creation on January 1999. It started as a simple servlet for static XSL styling and became more and more powerful as new features were added. Unfortunately, design decisions made early in the project influenced its evolution. Today, some of those constraints that shaped the project were modified as XML standards have evolved and solidified. For this reason, those design decisions need to be reconsidered under this new light.</p> <p>While Cocoon started as a small step in the direction of a new web publishing idea based on better design patterns and reviewed estimations of management issues, the technology used was not mature enough for tools to emerge. 
Today, most web engineers consider XML as the key for an improved web model and web site managers see XML as a way to reduce costs and ease production.</p> <p>In an era where services rather than software will be key for economic success, a better and less expensive model for web publishing will be a winner, especially if based on open standards.</p> </s1> <s1 title="Passive APIs vs. Active APIs"> <p>Web serving environments must be fast and scalable to be useful. Cocoon 1 was born as a "proof of concept" rather than production software and had significant design restrictions, based mainly on the availability of freely redistributable tools. Other issues were lack of detailed knowledge on the APIs available as well as underestimation of the project success, being created as a way to learn XSL rather than a full publishing system capable of taking care of all XML web publishing needs.</p> <p>For the above reasons, Cocoon 1 was based on the DOM level 1 API which is a <em>passive</em> API and was intended mainly for client side operation. This is mainly due to the fact that most DOM implementations require the document to reside in memory. While this is practical for small documents and thus good for the "proof of concept" stage, it is now considered a main design constraint for Cocoon scalability.</p> <p>Since the goal of Cocoon is the ability to process simultaneously multiple 100Mb documents in JVM with a few Mbs of heap size, careful memory use and tuning of internal components is a key issue. To reach this goal, an improved API model was needed. This is now identified in the SAX API which is, unlike DOM, event based (so <em>active</em>, in the sense that its design is based on the <em>inversion of control</em> principle).</p> <p>The event model allows document generators to trigger events that get handled by the various processing stages and finally get serialized onto the response stream. 
This has a significant impact on both performance (effective and user perceived) and memory needs:</p> <dl> <dt>Incremental operation</dt> <dd> The response is created during document production. Client's perceived performance is dramatically improved since clients can start receiving data as soon as it is created, not after all processing stages have been performed. In those cases where incremental operation is not possible (for example, element sorting), internal buffers store the events until the operation can be performed. However, even in these cases performance can be increased with the use of tuned memory structures. </dd> <dt>Lowered memory consumption</dt> <dd> Since most of the server processing required in Cocoon is incremental, an incremental model allows XML production events to be transformed directly into output events and character written on streams, thus avoiding the need to store them in memory. </dd> <dt>Easier scalability</dt> <dd> Reduced memory needs allow a greater number of concurrent operations to take place simultaneously, thus allowing the publishing system to scale as the load increases. </dd> <dt>More optimizable code model</dt> <dd> Modern virtual machines are based on the idea of <em>hotspots</em>, code fragments that are used often and, if optimized, increase the process execution speed by large amounts. This new event model allows easier detection of hotspots since it is a method driven operation, rather than a memory driven one. Hot methods can be identified earlier and can be better optimized. </dd> <dt>Reduced garbage collection</dt> <dd> Even the most advanced and lightweight DOM implementation require at least three to five times (and sometimes much more than this) more memory than the original document size. This not only reduces the scalability of the operation, but also impacts overall performance by increasing the amount of memory garbage that must be collected, tying up CPU cycles. 
Even if modern virtual machines have reduced the overhead of garbage collection, less garbage will always benefit performance and scalability. </dd> </dl> <p>The above points alone would be enough for the Cocoon paradigm shift, even if this event based model impacts not only the general architecture of the publishing system but also its internal processing components such as XSLT processing and PDF formatting. These components will require substantial work and maybe design reconsideration to be able to follow a pure event-based model. The Cocoon Project will work closely with the other component projects to be able to influence their operation in this direction.</p> </s1> <s1 title="Reactors Reconsidered"> <p>Another design choice that should be revised is the reactor pattern that was introduced to allow components to be connected in more flexible way. In fact, by contrast to the fixed pipe model used up to Cocoon 1.3.1, the reactor approach allows components to be dynamically connected, depending on reaction instructions introduced inside the documents.</p> <p>While this at first seemed a very advanced and highly appealing model, it turned out to be a very dangerous approach. The first concern is mainly technical: porting the reactor pattern under an event-based model requires limitations and tradeoffs since the generated events must be cached until a reaction instruction is encountered.</p> <p>But even if the technical difficulties could be solved, a key limitation remains: there is no single point of management.</p> </s1> <s1 title="Management Considerations"> <p>The web was created to reduce information management costs by distributing them back on information owners. 
While this model is great for user communities (scientists, students, employees, or people in general) whose members each manage a small amount of personal information, it breaks down for highly centralized information systems where <em>distributed management</em> is simply not practical.</p> <p>While in the HTML web model the page format and URL names were the only necessary contracts between individuals to create a world wide web, in more structured information systems the number of contracts increases by a significant factor because of the need for coherence across the hosted information: common style, common design issues, common languages, server-side logic integration, data validation, etc.</p> <p>It is only under this light that XML and its web model reveal their power: the HTML web model had too little in the way of contracts to be able to develop a structured and more coherent distributed information system, a limitation mainly imposed by the lack of good and algorithmically certain information indexing and knowledge-seeking systems. These shortcomings tend to degrade the quality of the truly distributed web in favor of more structured web sites (which base their improved site structure on internal contracts).</p> <p>The simplification and engineering of web site management is considered one of the most important Cocoon goals. 
This is done mainly by technologically imposing a reduced number of contracts and arranging them hierarchically, in a shape suitable for replacing current high-structure web site management models.</p> <p>The model that Cocoon adopts is the "pyramid model of web contracts", which is outlined in the picture below</p> <figure src="images/pyramid-model.gif" alt="The Cocoon Pyramid Model of Contracts" width="313" height="159"/> <p>and is composed of four different working contexts (the rectangles)</p> <dl> <dt>Management</dt> <dd> The people who decide what the site should contain, how it should behave and how it should appear </dd> <dt>Content</dt> <dd> The people responsible for writing, owning and managing the site content. This context may contain several sub-contexts - one for each language used to express page content. </dd> <dt>Logic</dt> <dd> The people responsible for integration with dynamic content generation technologies and database systems. </dd> <dt>Style</dt> <dd> The people responsible for information presentation, look and feel, site graphics and their maintenance. </dd> </dl> <p>and five contracts (the lines)</p> <ul> <li>management - content</li> <li>management - logic</li> <li>management - style</li> <li>content - logic</li> <li>content - style</li> </ul> <p>Note that there is no <em>logic - style</em> contract. Cocoon aims to provide both software and guidelines to allow you to remove such a contract.</p> </s1> <s1 title="Overlapping contexts and Chain Mapping"> <p>The above model can be applied only if the different contexts never overlap, otherwise there is no chance of having a single management point. For example, if the W3C-recommended method to link stylesheets to XML documents is used, the content and style contexts overlap and it's impossible to change the styling behavior of the document without changing the document itself. 
The same is true for the processing instructions used by the Cocoon 1 reactor to drive the page processing: each stage specifies the next stage to determine the result, thus increasing management and debugging complexity. Another overlap between contexts is the need for URL-encoded parameters to drive the page output. These overlaps break the pyramid model and increase the management costs.</p> <p>Starting with Version 2.0, the reactor pattern has been abandoned in favor of a pipeline mapping technique. This is based on the fact that the number of different contracts is limited even for big sites and normally grows at a much lower rate than the size of the site.</p> <p>Also, for performance reasons, Cocoon tries to compile everything that can be compiled (pages/XSP into generators, stylesheets into transformers, etc.) so, in this new model, the <em>processing chain</em> that generates the page contains (in a directly executable form) all the information/logic that handles the requested resource to generate its response.</p> <p>This means that instead of using event-driven request-time DTD interpretation (done in all Cocoon 1 processors), these are compiled directly into transformers (XSLT stylesheet compilation) or compiled into generators using logicsheets and XSP, which entirely removes the need for request-time interpretation solutions such as DCP, which has been removed.</p> <note>Some of these features were already present in the latest Cocoon 1.x releases but now the Cocoon architecture makes them central to its new core.</note> </s1> <s1 title="Sitemap"> <p>In Cocoon terminology, a <em>sitemap</em> is the collection of pipeline matching information that allows the Cocoon engine to associate the requested URI with the proper response-producing pipeline.</p> <p>The sitemap physically represents the central repository for web site administration, where the URI space and its handling are maintained.</p> <p>Please take a look at the <link
href="userdocs/concepts/sitemap.html">sitemap documentation</link> for more information on this.</p> </s1> <s1 title="Caching"> <p>The cache system of Cocoon has a very flexible and powerful design. The algorithms and components used are not hard-wired to the core of Cocoon. Instead they are dynamically configurable.</p> <p>The cache system automatically checks for valid cached content and delivers the valid content directly from the cache without any pipeline processing.</p> <p>Because static file caching in Cocoon will, no matter what, always be slower than direct web server caching, Cocoon tries to be as <em>proxy friendly</em> as possible.</p> <p>To be able to put most of the static part of the job back on the web server (where it belongs), Cocoon provides a command line operation, allowing the creation of <em>site makefiles</em> that will automatically scan the web site and the source documents and will provide a way to <em>regenerate</em> the static part of a web site (images and tables included!) based on the same XML model used in the dynamic operation version.</p> <!-- Needs rewriting <p>Cocoon will, in fact, be the integration between Cocoon 1 and Stylebook.</p> <p>It will be up to the web server administrator to use static regeneration capabilities on a time basis, manually or triggered by some particular event (e.g. database update signal) since Cocoon will only provide servlet and command line capabilities. 
The nice integration is based on the fact that there will be no behavioral difference if the files are dynamically generated in Cocoon via the servlet operation and cached internally or pre-generated and served directly by the web server, as long as URI contracts are kept the same by the system administrator (via URL-rewriting or aliasing)</p> <p>Also, it will be possible to avoid on-the-fly page and stylesheet compilation (which makes debugging harder) with command line pre-compilation hooks that will work like normal compilers from a developer's point of view.</p> --> </s1> </body> </document> 1.1 xml-cocoon2/src/documentation/xdocs/introduction.xml Index: introduction.xml =================================================================== <?xml version="1.0" encoding="UTF-8"?> <!DOCTYPE document PUBLIC "-//APACHE//DTD Documentation V1.0//EN" "dtd/document-v10.dtd"> <document> <header> <title>Introducing Cocoon</title> <authors> <person name="Stefano Mazzocchi" email="[EMAIL PROTECTED]"/> </authors> </header> <body> <s1 title="The XML Hype"> <p> Everybody talks about XML. XML here, XML there. All application servers support XML, everybody wants to do B2B using XML, web services using XML, even databases using XML. </p> <p> Should you care about it? Given the amount of hype, you can't afford to go around ignoring the subject: it would be like ignoring the world wide web 10 years ago, a clear mistake. But why is this so for XML? What is this "magic" that XML seems to have to solve your problems? Isn't this another hype to change once again the IT infrastructure that you spent so much time implementing and fixing in the last few years? Isn't it another way to spill money out of your pockets? </p> <p> If you ever asked yourself one of the above questions, this paper is for you. 
You won't find singing-and-dancing marketing crap, you won't find boring and useless feature lists, you won't find the usual acronym bombing or those good looking vaporware schemas that connect your databases to your coffee machines via CORBA or stuff like that. </p> <p> This document will explain to you what the Cocoon project is about and what we are doing to solve the problems that we encountered in our web engineering experiences, but from an executive perspective. Yes, because we have all had the problems of managing a web site: dealing with our colleagues, rushing to the graphics guru to get the little GIF with the new title, or calling the web administrator at night because the database is returning errors for no reason. </p> <p> It was frustrating to see the best and most clever information technology ever invented (the web) ruined by the lack of engineering practices, tortured by those "let's-reinvent-the-wheel-once-again" craftsmen who were great at doing their jobs as individuals but who couldn't scale and imposed a growth saturation on the whole project. </p> <p> There had to be a better way of doing things. </p> </s1> <s1 title="Personal Experiences"> <p> In 1998, Stefano Mazzocchi volunteered to create the documentation infrastructure for the java.apache.org project, which is composed of a bunch of different codebases, maintained by a bunch of different people, with different skills, different geographical locations and different degrees of will and time to dedicate to the documentation effort. </p> <p> But pretty soon he realized that no matter how great and well designed the system was, HTML was a problem: it was *not* designed for that kind of thing. Looking at the main page (<link href="http://java.apache.org/">http://java.apache.org/</link>) from the browser, you could clearly identify the areas of the screen: sidebar, topbar, news, status. 
But if you opened the HTML, boom: a nightmare of table tags and nesting and little tricks to make the HTML appear the same on every browser. </p> <p> So he looked around for alternative technologies, but *all* of them were trying to add more complexity at the GUI level (Microsoft FrontPage, Macromedia Dreamweaver, Adobe GoLive, etc.) hoping to "hide" the design problems of HTML under a thick layer of WYSIWYG looks. </p> <p> What you see is what you get. </p> <p> But what you see is all you've got. </p> <p> How can you tell your web server to "extract" the information from the sidebar? How can you get the news feeds out of a complex HTML page? </p> <p> Damn, it's easy for a human reader: just look at the page and it's very easy to distinguish between a sidebar, a banner, a news item and a stock quote. Why is it so hard for a machine? </p> </s1> <s1 title="The HTML Model"> <p> HTML is a language that tells your browser how to "draw" things on its window. An image here, a letter there, a color down here. Nothing more. The browser doesn't have the "higher level" notion of "sidebar": it lacks the ability to perform "semantic analysis" on the HTML content. </p> <p> Semantic analysis? Yeah, it's the kind of thing the human brain is simply great at doing, while computer programs simply suck at it, big time. </p> <p> So, with HTML, we went a step up and created a highly visual and appealing web of HTML content, but we went two steps back by removing all the higher level semantic information from the content itself. </p> <p> Ok, let's look at an example... most of you have seen an HTML page... 
if not, here is an example: </p> <source><![CDATA[ <html> <body> <p>Hi, I'm an HTML page</p> <p align="center">Written by Stefano</p> </body> </html> ]]></source> <p> which says to the browser: </p> <ul> <li>I'm an HTML page</li> <li>I have a body</li> <li>I have a paragraph</li> <li>I contain the sentence "Hi, I'm an HTML page"</li> <li>I contain the sentence "Written by Stefano"</li> </ul> <p> Suppose you are a Chinese reader who doesn't understand our alphabet, and try to answer the following question: </p> <p> who wrote the page? </p> <p> You can't perform semantic analysis, you are as blind as a web browser. The only thing you can do is draw it on the screen since this is what you were programmed to do. In other words, your semantic capacity is fixed to the drawing capabilities and a few other things (like linking), thus limited. </p> </s1> <s1 title="Semantic Markup"> <p> Suppose you receive this page: </p> <source><![CDATA[ <page> <author>sflkjoiuer</author> <content> <para>sofikdjflksj</para> </content> </page> ]]></source> <p> Can you tell me who wrote the page? Easy, you say: "sflkjoiuer" did. Good, but later you receive: </p> <source><![CDATA[ <dlkj> <ruijfl>sofikdjflksj</ruijfl> <wijlkjf> <oamkfkj>sflkjoiuer</oamkfkj> </wijlkjf> </dlkj> ]]></source> <p> Now, who wrote the page? You could guess by comparing the structure, but how do you know the two structures reflect the same semantic information? </p> <p> The above two pages are both XML documents. </p> <p> Are they going to help you? Are they going to simplify your work? Are they going to simplify your problems? </p> <p> At this point, clearly not so, rather the opposite. </p> <p> So, you could be wondering, why did we spend so much effort to write an XML publishing framework? This document was written exactly to clear your doubts on this, so let's keep going. </p> </s1> <s1 title="The XML Language"> <p> XML usually refers to the "eXtensible Markup Language" specification. 
A fairly small yet complex specification that indicates how to write languages. It's a syntax. To tell you the truth, nothing fancy at all. So </p> <source><![CDATA[ <hello></hello> ]]></source> <p> is correct, while </p> <source><![CDATA[ <hello></hi> ]]></source> <p> is not, but </p> <source><![CDATA[ <hello><hi/></hello> ]]></source> <p> is correct. There's more to it than this, but I'll skip the technical details here. </p> <p> XML is the ASCII for the new millennium: it's a step forward from ASCII or Unicode (the international extension to ASCII that includes all characters from all modern languages). It defines a "lingua franca" for textual languages. </p> <p> Ok, great, so now instead of having one uniform language with visual semantics (HTML) we have a Babel of languages each with its own semantics. How can this possibly help you? </p> </s1> <s1 title="XML Transformations"> <p> This was the point Stefano had reached more or less two years ago with java.apache.org: I could use XML and define my own semantics with <![CDATA[<sidebar>]]>, <![CDATA[<news>]]>, <![CDATA[<status>]]> and all that and I'm sure people would have found those XML documents much easier to write (since the XML syntax is very similar to the HTML one and very user friendly)... but I would have moved from "all browsers" to "no browser". </p> <p> And having documentation that nobody can browse is totally useless. </p> <p> The turning point was the creation of the XSL specification, which included a way to "transform" an XML page into something else. (It's more complex than this, but, again, I'll skip the technical details.) </p> <p> So now you have: </p> <source><![CDATA[ XML page ---(transformation)--> HTML page ^ | transformation rules ]]></source> <p> which allows you to write your pages in XML, create your "graphics" as transformation rules and generate HTML pages on the fly directly from your web server. </p> <p> Apache Cocoon 1.0 did exactly this. 
</p> </s1> <s1 title="The Model Evolves"> <p> If XML is a lingua franca, it means that XML software can work on almost anything without caring about what it is. So, if a cell phone requests the page, Cocoon just has to change transformation rules and send the WAP page to the phone. Or, if you want a nice PDF to print out your monthly report, you change the transformation rules and Cocoon creates the PDF for you, or the VRML, or the VoiceML, or your own proprietary B2B markup. </p> <p> Anything, without changing the basic architecture, which is based simply on the "angle bracket" XML syntax. </p> </s1> <s1 title="Separation of Concerns (SoC)"> <p> Cocoon was not the first product to perform server-side XML transformations, nor will it be the last (in a few years, these solutions will be the rule rather than the exception). So, what is the "plus" that the Cocoon project adds? </p> <p> We believe the single most important Cocoon innovation is SoC-based design. </p> <p> SoC is something that you've always been aware of: not everybody is equal, not everybody performs the same job with the same ability. </p> <p> It can be observed that grouping people with common skills into separate working groups increases productivity and reduces management costs, but only if the groups do not overlap and have clear "contracts" that define their operability and their concerns. </p> <p> For a web publishing system, the Cocoon project uses what we call the <em>pyramid of contracts</em>, which outlines four major concern areas and five contracts between them. Here is the picture: </p> <figure src="images/pyramid-model.gif" alt="The Cocoon Pyramid Model of Contracts" width="313" height="159"/> <p> Cocoon is <em>engineered</em> to give you a way to isolate these four concern areas using just those five contracts, removing the contract between style and logic that has been bugging web site development since the beginning of the web. </p> <p> Why? 
Because programmers and graphics people have very different skills and work habits... so, instead of creating GUIs to hide the things that can be harmful (like graphics to programmers or logic to designers), Cocoon allows you to separate these things into different files, allowing you to "seal" your working groups into separate virtual rooms connected to the other rooms only by those "pipes" (the contracts) that you give them from the management area. </p> <p> Let's look at an example: </p> <source><![CDATA[ <page> <content> <para>Today is <dynamic:today/></para> </content> </page> ]]></source> <p> is written by the content writers and you give them the "contract" that states that the tag <![CDATA[<dynamic:today/>]]> prints out the time of the day when included in the page. Content writers don't care (nor should they) about what language has been used for that, nor can they mess with the programming logic that generates the content since it's stored in another part of the system they don't have access to. </p> <p> So <![CDATA[<dynamic:today/>]]> is the "logic - content" contract. </p> <p> At the same time, the structure of the page is given as a contract to the graphic designers who have to come up with the transformation rules that transform this structure into a language that the browser can understand (HTML, for example). </p> <p> So, the page structure is the "content - style" contract. </p> <p> As long as these contracts don't change, the three areas can work in a completely parallel way without saturating the human resources used to manage them: costs decrease because time to market is reduced, and maintenance costs decrease because errors do not propagate out of the concern areas. </p> <p> For example, you can tell your designers to come up with a "Xmas look" for your web site, without even telling the other people: just switch to the Xmas transformation rules on Xmas morning and you're done. Just imagine how painful it would be to do this on your web site today. 
</p> <p> With the Cocoon architecture all this is a couple of line changes away. </p> </s1> <s1 title="Here we go"> <p> If you have read this far through all the sections, you should have grasped the value of the Cocoon Project and be able to see through most of the marketing hype that surrounds XML and friends. </p> <p> Just like you shouldn't care if somebody offers you software that is "ASCII compliant" or "ASCII based", you shouldn't care about "XML compliant" or "XML based": it doesn't mean anything. </p> <p> Cocoon uses XML as a core piece of its framework, but improves the model to give you the tools you need and is designed to be flexible enough to follow your needs as well as paradigm shifts that will happen in the future. </p> </s1> </body> </document> 1.1 xml-cocoon2/src/documentation/xdocs/drafts/newtoc.txt Index: newtoc.txt =================================================================== -------------------------------------------------------------------------- Cocoon - New Document Structure (initial author Gerhard Froehlich) -------------------------------------------------------------------------- 1. Introduction 1.1 What is it 1.2 Passive APIs vs. Active APIs 1.3 Reactors Reconsidered 1.4 Management Considerations 1.5 Overlapping contexts and Chain Mapping 1.6 Pre-compilation, Pre-generation and Caching * 2. Getting Started 2.1 What you should know/read 2.1.1 FAQ 2.1.2 XML references 2.1.3 Avalon * 2.2 Basics 2.2.1 Pipeline processing * 2.3 Installation Guide 2.3.1 Download 2.3.2 Installing 2.3.3 Jars 3. Samples 3.1 Hello World XML Sample 3.1.1 Content 3.1.2 Stylesheet 3.1.3 Sitemap * 3.2 Hello World XSP sample * 3.3 Aggregation * 4. In Detail 4.1 Sitemap Components 4.1.1 Generators 4.1.2 Transformers 4.1.3 Serializers 4.1.4 Matchers 4.1.5 Selectors 4.1.6 Actions 4.1.7 XSP 4.2 Core Components 4.2.1 Pipelines 4.2.2 MRUMemoryStore 4.2.3 StoreJanitor * 5. Developer's Corner 5.1 API 5.2 Extending 5.3 Flow 5.4 Using Databases 5.5 Parent CM * 6. 
Configuration 6.1 web.xml 6.2 cocoon.xconf 6.3 logkit.xconf * 7. Tuning 7.1 Component pools 7.2 Cache 7.3 Pipeline tuning ** 8. Installation references * 9. Project Resources 9.1 ToDo 9.2 Contributing 9.4 3rd Party 9.5 Code Repository 9.6 Dev Snapshots 9.7 Changes 9.8 Bug Database 9.9 Mail List 9.10 Mail Archive 9.11 Live Sites 9.12 Hosting * 10. Who we are