EMMEL Thomas wrote:
Hi,

Lately I read in a document from the Forrest site (a PDF I cannot find again...) that the pipeline for native HTML input is
something like:
HTML -> JTidy and Cocoon -> html-to-document.xsl -> ...?? ...-> HTML output or PDF or ...

The document you refer to is probably [1].

Is this right so far? Mainly the JTidy and Cocoon part of the pipe: is that configurable, for example to keep JTidy from cleaning certain things out of my HTML?

JTidy is highly configurable (see the JTidy website). However, it can't remove chunks of your HTML; its job is to tidy up the existing HTML (make it well-formed, etc.). If you want to remove chunks of your HTML you need a custom transformation. This is documented in [1] (see "Customizing the html pipeline").
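
As a rough sketch of what such a custom transformation could look like (this is not taken from [1]; the div and its class name "navigation" are made up for illustration), an identity stylesheet with one empty template drops the unwanted chunk and passes everything else through:

```xml
<?xml version="1.0"?>
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- Identity template: copy every node and attribute unchanged. -->
  <xsl:template match="@*|node()">
    <xsl:copy>
      <xsl:apply-templates select="@*|node()"/>
    </xsl:copy>
  </xsl:template>

  <!-- Hypothetical example: drop any div marked class="navigation".
       local-name() is used so the match works whether or not the
       tidied HTML ends up in the XHTML namespace. -->
  <xsl:template match="*[local-name()='div'][@class='navigation']"/>

</xsl:stylesheet>
```

The empty template wins over the identity template because a pattern with predicates has a higher default priority than node(), so the matched element and its children simply produce no output.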

Cocoon is the application framework Forrest is built on. There is no "Cocoon" pipeline; Cocoon is the pipeline "engine".

Can I catch the output just before it goes to html-to-document.xsl for debugging?

Yes, override the match that does the transformations in your project sitemap and remove the line that does the html-to-document transformation.
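
Something along these lines in the project sitemap might do it. Treat it as a sketch only: the match pattern and the source path are my assumptions, so copy the real match from the Forrest sitemap you are overriding and just take out the transform line.

```xml
<?xml version="1.0"?>
<map:sitemap xmlns:map="http://apache.org/cocoon/sitemap/1.0">
  <map:pipelines>
    <map:pipeline>
      <!-- Assumed pattern; use the one from the match you override. -->
      <map:match pattern="**.xml">
        <!-- Cocoon's "html" generator is the step that runs JTidy.
             The src value here is an assumption about where the
             HTML sources live in the project. -->
        <map:generate type="html" src="{project:content.xdocs}{1}.html"/>
        <!-- The html-to-document transform line is left out here
             on purpose, so the tidied HTML is served as-is. -->
        <map:serialize type="xml"/>
      </map:match>
    </map:pipeline>
  </map:pipelines>
</map:sitemap>
```

Requesting a page's .xml URL should then return the tidied HTML instead of XDoc. Since this applies to every document the pattern matches, take the override out again once you are done debugging.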

My goal is to add an extra pipeline step in front, for example to add an XInclude to the HTML that could be used later in the process... I am looking for ways to automatically create section numbering in my documents, and other useful things like indexing and maybe a bibliography framework.

Section numbering should be done at the skinning stage, not at the transformation to XDoc. It is part of the rendering, not the content.
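
To illustrate the idea (a standalone sketch, not the actual skin stylesheet; the h2 heading is a placeholder and it assumes XDoc-style nested section elements with a title child), xsl:number can generate hierarchical numbers at rendering time:

```xml
<?xml version="1.0"?>
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- Render each section title with a hierarchical number in front.
       level="multiple" counts every enclosing section, so a title in
       the first subsection of the second section gets "2.1 ". -->
  <xsl:template match="section/title">
    <h2>
      <xsl:number count="section" level="multiple" format="1.1 "/>
      <xsl:apply-templates/>
    </h2>
  </xsl:template>

</xsl:stylesheet>
```

Because the number is computed from the position of each section in the source tree, the content itself stays free of numbering and the numbers update automatically when sections are moved.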

As for the bibliography, there is a plugin in the whiteboard that goes some way towards this. Documentation is non-existent (well, it's the code) and more work is needed, but it is a good starting point.

Ross

[1] http://forrest.apache.org/docs_0_80/howto/howto-custom-html-source.html