I see this question has been asked before, but I can't make it work.
I'm trying to convert HTML to XML and remove the Dreamweaver
and M$ cruft.  So 'bare' and 'clean' and 'drop-font-tags' are important.

I am using cocoon-2.1_20040514221251.tar.gz (Jetty) on Solaris 9.

In cocoon-2.1/build/webapp/samples/blocks/html:

<?xml version="1.0"?>

<map:sitemap xmlns:map="http://apache.org/cocoon/sitemap/1.0">

<!-- =========================== Components ============================== -->

<map:components>
<map:generators default="file">
<map:generator label="content" logger="sitemap.generator.html" name="html" src="org.apache.cocoon.generation.HTMLGenerator">
<jtidy-config>jtidy.properties</jtidy-config>
</map:generator>
</map:generators>
</map:components>


<!-- =========================== Pipelines ================================= -->

 <map:pipelines>
   <map:pipeline>
     <map:match pattern="scraper">
       <!-- a reader delivers the resource itself, so no serializer follows it -->
       <map:read mime-type="text/html" src="scrape.html"/>
     </map:match>

     <map:match pattern="scrape">
       <map:generate type="html" src="{request-param:htmlpage}">
         <map:parameter name="xpath" value="/html"/>
       </map:generate>
       <map:transform src="stylesheets/scrape.xsl"/>
       <map:serialize type="xml"/>
     </map:match>
   </map:pipeline>
 </map:pipelines>
</map:sitemap>

-- and --

cocoon-2.1/build/webapp/samples/blocks/html:more jtidy.properties

add-xml-decl=yes
bare=yes
char-encoding=utf8
clean=yes
doctype=omit
drop-empty-paras=yes
drop-font-tags=yes
input-xml=no
indent=yes
indent-spaces=4
output-xhtml=yes
wrap=0

--
I _think_ HTMLGenerator is reading the properties file - at least it does not
complain in sitemap.log, as it does when I remove the jtidy.properties file.
But I can't see any effect on the XHTML output: the result looks the same with
or without these settings. What am I doing incorrectly?
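
To check whether drop-font-tags has any effect without eyeballing the output, a
small script along these lines could count the leftover <font> elements in
whatever the "scrape" pattern returns. This is only a sketch in Python, not part
of the Cocoon setup: count_elements and the sample string are made up for
illustration, and the real input would be the saved response from the pipeline.

```python
import xml.etree.ElementTree as ET


def count_elements(xml_text, local_name):
    """Count elements with the given local name, ignoring namespaces."""
    root = ET.fromstring(xml_text)
    total = 0
    for elem in root.iter():
        tag = elem.tag
        if not isinstance(tag, str):
            continue  # skip anything that is not a normal element
        # ElementTree spells namespaced tags as "{uri}name"; keep only the name.
        if tag.rsplit("}", 1)[-1].lower() == local_name:
            total += 1
    return total


# Hypothetical stand-in for the XML saved from the "scrape" pipeline.
sample = (
    '<html xmlns="http://www.w3.org/1999/xhtml"><body>'
    '<p><font face="Arial">one</font></p>'
    '<p><font size="2">two</font></p>'
    '</body></html>'
)

print(count_elements(sample, "font"))
```

Running it once against output produced with jtidy.properties in place and once
without it would show whether the count changes at all.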


Ray Allis
