The discussions around Stefano's "Cocoon blocks version 1.1" showed the
need for pipelines to provide not only resources, but also services,
identified by their URI.
This document defines the concept of a "pipeline service", which, as we
will see, consists of using pipelines as sitemap components (generator,
transformer and serializer). It is kept separate from the blocks design
document since pipeline services can be used without blocks, even though
they will be most useful in that context.
What is a pipeline ?
--------------------
The concept of pipeline, a central part of the Cocoon architecture, is a
chain of components handling XML documents as SAX events. By "handling",
we mean 3 different things :
- generate : at the start of the chain, produce an initial document and
feed the next component in the chain with the result.
- transform : take the content produced by preceding components in the
chain (either a generator or another transformer), transform it and feed
the next component in the chain with the result.
- serialize : take the content produced by the preceding component in
the chain (either a generator or a transformer), and convert this XML
stream to a binary stream.
These 3 concepts are represented using only 2 interfaces, XMLProducer
and XMLConsumer :
- a generator is an XMLProducer,
- a transformer is an XMLConsumer _and_ an XMLProducer,
- a serializer is an XMLConsumer.
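To make the mapping between the three roles and the two interfaces concrete, here is a minimal Java sketch. The types below are simplified stand-ins invented for this illustration (events are plain strings, and every Sketch* name is hypothetical); the real Cocoon interfaces work on SAX events.

```java
// Simplified stand-ins, for illustration only: events are plain strings here,
// whereas the real XMLConsumer / XMLProducer deal with SAX events.
interface SketchConsumer {
    void accept(String event);
}

interface SketchProducer {
    void setConsumer(SketchConsumer next);
}

// A generator is a producer only: it starts the chain.
class SketchGenerator implements SketchProducer {
    private SketchConsumer next;
    public void setConsumer(SketchConsumer next) { this.next = next; }
    void generate() { next.accept("doc"); }
}

// A transformer is both a consumer and a producer.
class SketchUpperCaser implements SketchConsumer, SketchProducer {
    private SketchConsumer next;
    public void setConsumer(SketchConsumer next) { this.next = next; }
    public void accept(String event) { next.accept(event.toUpperCase()); }
}

// A serializer is a consumer only: it ends the chain with a character/byte stream.
class SketchSerializer implements SketchConsumer {
    final StringBuilder out = new StringBuilder();
    public void accept(String event) { out.append('<').append(event).append("/>"); }
}

class RolesSketch {
    // Wires generator -> transformer -> serializer and returns the serialized result.
    static String run() {
        SketchGenerator generator = new SketchGenerator();
        SketchUpperCaser transformer = new SketchUpperCaser();
        SketchSerializer serializer = new SketchSerializer();
        generator.setConsumer(transformer);
        transformer.setConsumer(serializer);
        generator.generate();
        return serializer.out.toString();
    }

    public static void main(String[] args) {
        System.out.println(run()); // prints <DOC/>
    }
}
```

Only the wiring matters here: each component knows nothing about its neighbours beyond the consumer it feeds, which is what makes it possible to splice one chain into another.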
The "cocoon:" protocol
----------------------
Up to now, we've considered pipelines as a "final" concept. This means
that a pipeline has to be considered as a whole : it handles a request
and responds with the result of its execution.
Well, in fact, we "nearly" considered it as final. Consider the
"cocoon:" protocol that is so useful. What happens if we write the
following :
  <map:match pattern="first-uri">
    <map:generate type="file" src="cocoon://other-uri"/>
    <map:transform src="foo.xslt"/>
    <map:serialize/>
  </map:match>
We're simply using another pipeline as the starting point of the current
one. We have used a pipeline as the generator of another one.
Most often, the "other-uri" builds a pipeline that is terminated by a
<map:serialize type="xml"/> because we want it to produce xml for the
calling generator. But this serializer is a fake : you can put any
serializer you like, it doesn't matter. What happens under the hood is
that the SAX events produced by the component immediately preceding the
serializer are used as the output of the generator in the calling pipeline.
So in the above example, when requesting "first-uri", we actually chain
the generators and transformers of "other-uri" to the transformers and
serializer of "first-uri".
Pipelines as generators
-----------------------
This leads to a first conclusion : using a pipeline as a generator means
using the SAX events produced by the last XMLProducer of that pipeline,
i.e. the last transformer or the generator if there are no transformers.
Since we've used a pipeline as a generator, let's introduce a new
generator for this purpose, instead of using the "file" one, which fools
us into thinking we use a full pipeline when it actually strips out the
serializer :
  <map:match pattern="first-uri">
    <map:generate type="pipeline" src="/other-uri"/>
    <map:transform src="foo.xslt"/>
    <map:serialize/>
  </map:match>
I don't see a need for a new sitemap element such as "map:call-pipeline"
or "map:generate-from-pipeline". What we want is to generate the initial
content of the current pipeline, and for this we just use a particular
implementation of a generator, as we already do for files, XSP, etc.
Pipelines as serializers
------------------------
We've seen how to use a pipeline as the generator of another one. Let's
now consider the other end of the chain : using a pipeline as a serializer.
Let's suppose we have defined a pipeline that gets an XML document in the
xdoc DTD and formats it to PDF. This can be for example :
  <map:match pattern="doc2pdf">
    <map:generate src="an_xdoc.xml"/>
    <map:transform src="doc2fo.xslt"/>
    <map:serialize type="fo2pdf"/>
  </map:match>
The interesting part here isn't the initial document, but the chaining
of a stylesheet that produces an xsl:fo version of its input and the FOP
serializer. This is the typical example of what is called a "service" in
the current block specification.
Now how do we reuse this in other pipelines ? Yes, we can define a
<map:resource>. But this resource will be available only in the current
sitemap, and not in other sitemaps or blocks.
What does "reusing" this actually mean ? It means producing an xdoc
document and _serializing_ it to PDF. We don't actually care whether there
is a serializer to PDF that directly accepts xdocs or whether there are one
or more transformations before serializing.
This leads to a second conclusion : using a pipeline as a serializer
means sending the SAX events of the calling pipeline to the first
XMLConsumer of the called pipeline.
How do we use this ? Well, just as for the generator, let's define a new
"pipeline" serializer :
  <map:generate src="another_xdoc.xml"/>
  <map:serialize type="pipeline" src="doc2pdf"/>
Note : the "src" attribute doesn't currently exist on <map:serialize>,
but it seems the most natural and consistent way to name the called
pipeline. Whether this translates to implementing SitemapModelComponent
or not is another story.
Pipelines as transformers
-------------------------
And here comes the last use of a pipeline : as a transformer. Let's
consider the following :
  <map:match pattern="a_page">
    <map:generate src="an_xdoc.xml"/>
    <map:transform type="i18n"/>
    <map:transform src="xdoc2html.xsl"/>
    <map:transform src="htmlskin.xsl"/>
    <map:serialize type="html"/>
  </map:match>
The 3 transformers define a transformation service that takes an xdoc as
input and produces some skinned HTML. To achieve reusability, we would
like to have an "xdoc2skinnedHtml" transformer. We can write this as
follows :
  <map:match pattern="a_page">
    <map:generate src="an_xdoc.xml"/>
    <map:transform type="pipeline" src="xdoc2skinnedHtml"/>
    <map:serialize type="html"/>
  </map:match>
and
  <map:match pattern="xdoc2skinnedHtml">
    <map:generate type="dont_care"/>
    <map:transform type="i18n"/>
    <map:transform src="xdoc2html.xsl"/>
    <map:transform src="htmlskin.xsl"/>
    <map:serialize type="dont_care"/>
  </map:match>
This leads to a third conclusion : using a pipeline as a transformer
means feeding the SAX events of the calling pipeline to the first
transformer of the called pipeline, and sending the output of the last
transformer of the called pipeline to the next XMLConsumer of the
calling pipeline.
Note : if there are no transformers in the called pipeline (i.e. it has
only a generator and a serializer), the "pipeline" transformer simply
copies its input to its output.
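The three conclusions reached so far amount to selecting a different slice of the called pipeline. The following Java sketch illustrates this under a deliberately naive assumption: a pipeline is modelled as a list of component names of the form [generator, transformer..., serializer]. The class and method names are hypothetical and not part of Cocoon.

```java
import java.util.ArrayList;
import java.util.List;

class PipelineSlices {
    // Assumption for this sketch: the list is [generator, transformers..., serializer],
    // so it always holds at least two elements.

    // Used as a generator: keep everything up to the last XMLProducer,
    // i.e. drop the trailing serializer.
    static List<String> asGenerator(List<String> pipeline) {
        return new ArrayList<>(pipeline.subList(0, pipeline.size() - 1));
    }

    // Used as a serializer: keep everything from the first XMLConsumer,
    // i.e. drop the leading generator.
    static List<String> asSerializer(List<String> pipeline) {
        return new ArrayList<>(pipeline.subList(1, pipeline.size()));
    }

    // Used as a transformer: keep only the transformers.
    // An empty result means the component just copies input to output.
    static List<String> asTransformer(List<String> pipeline) {
        return new ArrayList<>(pipeline.subList(1, pipeline.size() - 1));
    }

    public static void main(String[] args) {
        List<String> called = List.of(
            "dont_care", "i18n", "xdoc2html.xsl", "htmlskin.xsl", "dont_care");
        System.out.println(asTransformer(called)); // prints [i18n, xdoc2html.xsl, htmlskin.xsl]
    }
}
```

In all three cases the called pipeline is still defined in full; the calling component only decides which end, if any, is ignored.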
Relation to blocks
------------------
Up to now, we made no mention of blocks. The "src" attribute of the new
"pipeline" sitemap components is a URI that is interpreted like the part
that follows the first "/" in the "cocoon:" protocol :
- "/pipeline-uri" is resolved by calling the root sitemap,
- "pipeline-uri" is resolved by calling the current sitemap.
We can now introduce blocks :
- "block:foo:pipeline-uri" is resolved by calling the "foo" block.
So if we consider the transformer example above, and move the
"xdoc2skinnedHtml" pipeline to a "skin" block, our sitemap becomes :
  <map:match pattern="a_page">
    <map:generate src="an_xdoc.xml"/>
    <map:transform type="pipeline" src="block:skin:xdoc2skinnedHtml"/>
    <map:serialize/>
  </map:match>
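The three forms of the "src" attribute described in this section can be told apart with a few string tests. The Java sketch below is only an illustration of those rules: the class name, the method name and the returned labels are all hypothetical, and nothing here reflects an actual Cocoon API.

```java
class PipelineSrcSketch {
    private static final String BLOCK_PREFIX = "block:";

    // Returns a label naming the sitemap that would handle the request,
    // followed by the pipeline URI passed to that sitemap.
    static String resolve(String src) {
        if (src.startsWith(BLOCK_PREFIX)) {
            // "block:foo:pipeline-uri" is handled by the "foo" block.
            int sep = src.indexOf(':', BLOCK_PREFIX.length());
            if (sep < 0) {
                throw new IllegalArgumentException(
                    "expected block:<name>:<uri>, got " + src);
            }
            String block = src.substring(BLOCK_PREFIX.length(), sep);
            return "block[" + block + "] " + src.substring(sep + 1);
        }
        if (src.startsWith("/")) {
            // "/pipeline-uri" is handled by the root sitemap.
            return "root " + src.substring(1);
        }
        // "pipeline-uri" is handled by the current sitemap.
        return "current " + src;
    }

    public static void main(String[] args) {
        System.out.println(resolve("/other-uri"));                  // prints root other-uri
        System.out.println(resolve("doc2pdf"));                     // prints current doc2pdf
        System.out.println(resolve("block:skin:xdoc2skinnedHtml")); // prints block[skin] xdoc2skinnedHtml
    }
}
```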
Questions and answers
---------------------
Q: What about caching when we call a pipeline ?
A: This should integrate smoothly : the cache key and validity of the
"pipeline" generator, transformer and serializer are the composition of
cache keys and validities of the used components of the called pipeline.
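As a rough illustration of what "composition" could mean here, the Java sketch below joins the keys of the used components in order and treats the composite as valid only while every used component is still valid. Both rules, along with the class and method names, are assumptions made for this example and not the actual Cocoon caching API.

```java
import java.util.List;

class CompositeCacheSketch {
    // Assumed rule: the composite key is the ordered concatenation of the
    // keys of the components of the called pipeline that are actually used.
    static String compositeKey(List<String> componentKeys) {
        return String.join("|", componentKeys);
    }

    // Assumed rule: the composite is valid only if every used component is.
    static boolean compositeValid(List<Boolean> componentValidities) {
        for (boolean valid : componentValidities) {
            if (!valid) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        // Keys of the three transformers used when calling "xdoc2skinnedHtml".
        System.out.println(compositeKey(List.of("i18n", "xdoc2html.xsl", "htmlskin.xsl")));
        System.out.println(compositeValid(List.of(true, true, false))); // prints false
    }
}
```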
--o--
Q: Doesn't this deprecate the use of the "cocoon:" protocol ?
A: No. The only notation that may be deprecated is <map:generate
type="file" src="cocoon://xxx"/> that can now be written <map:generate
type="pipeline" src="/xxx"/>. Other uses of the "cocoon" protocol keep
their usefulness.
--o--
Q: I want to define a pipeline that will be used only as a
transformation service. Why must I write a <map:generate> and a
<map:serialize> in its definition ?
A: Because the sitemap, as a pipeline building language, must be able to
determine the start of a pipeline and its end, even if not all its
components are used. Like opening and closing braces in Java, the
generator begins the pipeline definition and the serializer ends it.
Ok. Thanks for reading so far. What are your thoughts about this ? If we
agree on it, I'll update the Cocoon blocks document so that block
services are shown as "pipeline" sitemap components.
Sylvain
--
Sylvain Wallez Anyware Technologies
http://www.apache.org/~sylvain http://www.anyware-tech.com
{ XML, Java, Cocoon, OpenSource }*{ Training, Consulting, Projects }