This thread is in preparation for our upcoming IRC session on converting
Forrest to use a subset of XHTML2 as its internal document format. There
appear to be at least two, if not three (or even more), opinions on this.
The purpose of this thread is not (at least initially) to debate each
opinion, but instead to provide background information to feed into the
IRC session.
If you have a suggestion for an approach then please add it to this
thread. However, please avoid commenting on other proposals that have
gone before (other than to say "as described by..." in cases of agreement).
The idea is for this to be an initial brainstorming thread, *not* a
discussion or planning thread. We'll do that later; let's just absorb one
another's ideas so we can extract the best of them all via IRC
discussion. We can then come back to this thread and wrap up with our
conclusions.
--------
Here's an outline of my approach:
--------
Assumptions
===========
First of all I assume that there is no point in working on anything to
do with the old skinning system. It is going to be removed in favour of
views and I don't want to have to refactor things twice.
I am using forrest:views to mean the various technologies that,
together, provide the new skinning system, that is, those items defined
in [1].
Defining the Core Pipeline
==========================
The pipeline when using views is discussed in [1] where we define the
pipeline to be either:
theme
|
\|/
src -> input plugin -> core (views) -> output plugin -> output
| /|\
\|/ |
forrest:contracts
As defined in [5] or:
theme
|
\|/
src -> input plugin -> core (views) -> output plugin -> output
/|\ | |
| \|/ \|/
+------------------+
|forrest:contracts |
|forrest:properties|
+------------------+
This latter pipeline was suggested because "the contracts as viewHelper
should come *from* the plugin" [2]. (Actually, I reversed the last arrow
from the original post because of this description.)
[It should be noted that since these mails were written we have agreed
to rename the part of forrest:views shown here in core as "structurer".
I will use the term structurer in the rest of this mail.]
Both of the above are aligned with our TR document [4] which defines the
stages along the central pipeline as:
Resolver -> Xifier -> Filter -> Windower -> Themer -> Serializer
Cool, lots of agreement there :-)
Fitting Forrest:Views into the Pipeline
=======================================
So, we seem to be in agreement on the core pipeline. However, there are
actually two opinions on how views fit in. I am going to really rock the
boat and add a third (even though one of the above is mine ;-)
Why do we need a third? Let's start off by looking at the definitions of
the various parts of this pipeline:
Structurer
----------
The structurer part of a view is defined as adding "a structure to the
generated page that clearly identifies all the content in the final
output" [6] and [7], and further as "The structuring of the assembled
page where all content is in place and structured with forrest:hooks to
provide hooks for theming." [8]
OK, so it is pretty clear that the *.fv files are part of the
structurer. And these belong in core, that is the language used is
defined by Forrest core itself. It is an internal format. Note this
means we can use, for example, the Cocoon Portal page layout language as
an input format for the structurer, or we can generate it as an output
from the structurer.
Note that the structurer does *not* define any content. Therefore core
should *not* have any knowledge of content.
Forrest Contracts
-----------------
Forrest:contracts are defined as "the templated content that should be
inserted into the final document. These may create a new request in
order to generate the content" [5] and as "Helpers (forrest:contracts)
mainly adapt and transform the presentation model (pm) for the view, but
also help with any limited business processing that is initiated from
the view (forrest:properties)" [8]
So contracts describe how to retrieve/extract bits of content (or
nuggets) to be inserted into the final document at locations defined in
the *.fv files (for the structurer).
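
To make this split concrete, here is a minimal Python sketch of a
structurer that knows only about hooks and contract names, while the
contracts supply the content. Everything in it is hypothetical: the
tuple-based VIEW merely stands in for a *.fv file (it is not the real
syntax), and CONTRACTS and structure() are invented for illustration.

```python
# Hypothetical contracts: each returns a "nugget" of content for a request.
CONTRACTS = {
    "title": lambda request: f"Title of {request}",
    "body": lambda request: f"Body of {request}",
}

# Hypothetical stand-in for a *.fv file: hooks give structure, and
# contract names mark where content should be inserted.
VIEW = ("hook", "page", [
    ("hook", "header", [("contract", "title")]),
    ("hook", "main", [("contract", "body")]),
])

def structure(node, request, contracts):
    """Resolve a view node into nested (hook name, children) content."""
    if node[0] == "contract":
        # The structurer knows where content goes, not what it is.
        return contracts[node[1]](request)
    _, name, children = node
    return (name, [structure(child, request, contracts) for child in children])

page = structure(VIEW, "index", CONTRACTS)
print(page)
```

Note that swapping in a different CONTRACTS table changes the content
without touching VIEW or structure(), which is the separation argued
for above.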
Output Plugins
--------------
An output plugin is defined as providing "a new output format. For
example, the s5 plugin extends Forrest to produce HTML slides from
Forrest documents." [3]
So an output plugin provides a version of a document that can be
rendered, for example, HTML or FO. It may also provide a theme to
describe how this should be displayed in the final rendering, e.g. CSS
(FO has no separate theme, but the plugin may provide configuration info
for the generated FO).
In my view there is nothing in this definition that describes *content*
and since forrest:contracts are about content they have no place in
output plugins.
However, they do have a place in input plugins since they *do*
define content. Some examples can be found in my recent work on the
Resume plugin where I have defined contracts to insert the various
portions of a resume into documents.
Finally, they fit!
------------------
So given the definitions/opinions above, I think the processing
pipeline, with views plugged in is:
theme
|
\|/
src -> input plugin -> core (views) -> output plugin -> output
| | /|\ /|\
| | | |
| | \ +------------------+ |
| +---------- |forrest:contracts | |
| / |forrest:properties| |
| +------------------+ |
| |
| |
+--------------------------------------------------------+
Notice that *all* of our contracts are coming from input plugins. Why is
this? The answer will become clear in the next section (I hope).
XHTML2 in Core
==============
So finally we come to the point. What does it mean for XHTML2 to be our
internal document format? First (not quite there yet) let's consider why
we have an internal format:
We want to convert many source formats into many output formats. We want
to do this with minimal effort. So we adopt an internal format and write
a series of output plugins to give us the different formats from that
single internal format. Now we write a load of input plugins to convert
the source formats into our internal format and voila, we have
many-to-many conversion.
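
The hub idea can be sketched in a few lines of Python. This is only an
illustration under loud assumptions: the "internal format" here is a
plain dict standing in for the XHTML2 subset, and every converter name
(from_text, from_pairs, to_html, to_plain, convert) is invented for the
example rather than taken from Forrest.

```python
# Hypothetical converters. The "internal format" is a plain dict,
# standing in for the XHTML2 subset discussed in this mail.
def from_text(source):
    title, _, body = source.partition("\n")
    return {"title": title, "body": body}

def from_pairs(source):
    return dict(source)

def to_html(doc):
    return f"<h1>{doc['title']}</h1><p>{doc['body']}</p>"

def to_plain(doc):
    return f"{doc['title'].upper()}\n{doc['body']}"

INPUT_PLUGINS = {"text": from_text, "pairs": from_pairs}
OUTPUT_PLUGINS = {"html": to_html, "plain": to_plain}

def convert(source, source_format, output_format):
    """Any input to any output, always via the internal format."""
    internal = INPUT_PLUGINS[source_format](source)
    return OUTPUT_PLUGINS[output_format](internal)

print(convert("Hello\nWorld", "text", "html"))
print(convert([("title", "Hello"), ("body", "World")], "pairs", "plain"))
```

Adding a new source format means writing one more input converter, and
it immediately works with every existing output converter.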
So, everything coming *in* to our core must be our internal format, and
everything coming *out* must be our internal format. There should be
*nothing* inside core of any other format.
An Example Input Plugin
-----------------------
It is the job of our input plugins to provide the internal format.
Consider an OpenOffice input plugin: it converts the OOo XML format to our
internal format. What forrest:contracts does it provide?
An OOo document consists of meta-data, content (made up of pages,
sections, paragraphs) and style information. So logical contracts would
be various meta-data contracts (authors, statistics, abstract,
keywords), content (all, page X etc.) and style (produces CSS). This way
a user can decide which parts of the original document are used.
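
As a rough sketch of what such a set of contracts might look like,
consider the following Python. It is purely illustrative: DOCUMENT is a
hand-written stand-in for an already parsed OOo file, and the contract
names and function signatures are my own assumptions, not an existing
plugin API.

```python
# Hypothetical parsed document; a real plugin would read the OOo XML.
DOCUMENT = {
    "meta": {"authors": ["A. Author"], "keywords": ["forrest", "xhtml2"]},
    "pages": [["First paragraph."], ["Second page paragraph."]],
    "style": {"p": {"margin": "1em"}},
}

def authors(doc):
    return ", ".join(doc["meta"]["authors"])

def keywords(doc):
    return ", ".join(doc["meta"]["keywords"])

def content_all(doc):
    # Flatten every page into one list of paragraphs.
    return [para for page in doc["pages"] for para in page]

def content_page(doc, number):
    # Pages are numbered from 1, as a reader would expect.
    return doc["pages"][number - 1]

def style_css(doc):
    rules = []
    for selector, props in doc["style"].items():
        body = "; ".join(f"{k}: {v}" for k, v in props.items())
        rules.append(f"{selector} {{ {body} }}")
    return "\n".join(rules)

# The user picks contracts by name to decide which parts are used.
CONTRACTS = {
    "authors": authors,
    "keywords": keywords,
    "content-all": content_all,
    "style": style_css,
}

for name in ("authors", "content-all"):
    print(name, "=>", CONTRACTS[name](DOCUMENT))
```

A view that only names "authors" and "content-all" would never see the
style information, which is the kind of selection described above.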
An Example Output Plugin
------------------------
It is the job of our output plugins to consume the internal format and
produce our output format. So they take a *fully structured* document
and convert it into the chosen output. Let's consider an HTML output
plugin. What does it provide?
It provides a single XSL that converts XHTML2 to HTML. It may also
provide an XSL to convert an internal style language into CSS (we
currently do not have an internal style language, so let's not go there
just yet, just planting a meme).
What about a PDF output plugin? It provides a single XSL to convert from
XHTML2 to FO.
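
To show how small such a transformation can be, here is a sketch of the
XHTML2 to HTML step. Note the assumptions: it uses Python and the
standard library's ElementTree instead of XSL purely for illustration,
it ignores namespaces and attributes, and it handles only the XHTML2
section and h elements. The to_html() function is hypothetical, not
part of any plugin.

```python
import xml.etree.ElementTree as ET

def to_html(node, depth=0):
    """Map a namespace-free XHTML2-like tree onto HTML elements."""
    if node.tag == "section":
        out = ET.Element("div")
        depth += 1  # headings inside this section sit one level deeper
    elif node.tag == "h":
        # XHTML2 has a single heading element; HTML needs h1..h6.
        out = ET.Element(f"h{min(max(depth, 1), 6)}")
    else:
        # Assumption: every other element maps onto itself unchanged.
        out = ET.Element(node.tag)
    out.text, out.tail = node.text, node.tail
    for child in node:
        out.append(to_html(child, depth))
    return out

SOURCE = (
    "<body><section><h>Intro</h>"
    "<section><h>Detail</h><p>Text</p></section>"
    "</section></body>"
)
html = ET.tostring(to_html(ET.fromstring(SOURCE)), encoding="unicode")
print(html)
```

The heading level is derived from section nesting, so the output plugin
needs nothing beyond the fully structured document it is handed.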
Concluding Where XHTML2 Fits
----------------------------
It fits in the forrest:contracts and in the internal processing within
core (structurer).
How do we Implement it?
=======================
Let's first consider what we have (in the XHTML2 plugin since this is the
approach I am outlining here):
- we have an XHTML2 based site
- we have the start of the XHTML to HTML stylesheet that will be the
major part of the HTML output plugin
- we have some templates converted to use XHTML2 - these will form the
start of an XHTML2 input plugin
- we have a structurer sitemap that is basically the two existing views
plugins thrown together
Combined, these elements will provide the content elements of a page.
They do not currently work with navigation etc., because the aggregation
of navigation has been removed: it belongs in the contracts, not in the
structurer (as discussed above).
Roadmap
-------
Now what do we need to do?
- enable the navigation contracts
- convert all contracts to XHTML2
- break out the HTML output plugin
- add theming support
- break out the XHTML2 input plugin
- refactor (or rewrite?) the structurer sitemap (with locationmap in mind)
The Future
==========
This last step (refactor structurer sitemaps) is really part of a larger
effort to address the first stage of our pipeline as defined above,
that is, the resolving of the source file.
I'll leave that for a whole new Forrest Tuesday.
References
==========
[1] http://marc.theaimsgroup.com/?t=112276643700001&r=1&w=2
[2] http://marc.theaimsgroup.com/?l=forrest-dev&m=112596689428172&w=2
[3]
http://forrest.apache.org/pluginDocs/plugins_0_80/pluginInfrastructure.html#outputPlugins
[4]
http://svn.apache.org/viewcvs.cgi/*checkout*/forrest/trunk/site-author/content/xdocs/TR/2005/WD-forrest10.html
[5] http://marc.theaimsgroup.com/?l=forrest-dev&m=112276632331269&w=2
[6] http://marc.theaimsgroup.com/?l=forrest-dev&m=112277657832032&w=2
[7] http://marc.theaimsgroup.com/?l=forrest-dev&m=112438965225785&w=2
[8] http://marc.theaimsgroup.com/?l=forrest-dev&m=112596689428172&w=2