It occurs to me I didn't actually answer your principal question...

On Wed, Nov 14, 2012 at 11:26 PM, Christophe Dupriez <dupr...@destin.be> wrote:
[...]
> Excuse me for not sharing your enthusiasm for StAX: it is essential for big
> documents (I use it for that: RDF-to-XML transformations...), but WikiPages
> are not that long, and templates are hard enough to keep unconstrained.
> Anyway, the main problem today is to DEFINE the process to translate
> (normalize) XHTML into WikiMarkup. XSLT is certainly a way to experiment
> (and share results). Shall we start with something like bringing together
> test cases?
I think if you were to look at an XSLT approach, it's not quite so bad. Where in XHtmlElementToWikiTranslator.java you see

    else if( n.equals( "h2" ) )
    {
        m_out.print( "\n!!! " );
        print( e );
        m_out.println();
    }

the equivalent pattern in XSLT would be something akin to

    <xsl:template match="h2">
      <xsl:text>&#10;!!! </xsl:text>
      <xsl:apply-templates/>
      <xsl:text>&#10;</xsl:text>
    </xsl:template>

where we match via an XPath pattern and output the corresponding WikiMarkup. It would also be a lot more reliable.

The big question might seem to be whether or not the input XHTML is truly well-formed XML. By definition XHTML *must* be, but of course in the real world that might not be the case, and XSLT won't accept non-well-formed input. But I'm assuming that, since the input to XHtmlElementToWikiTranslator.java is a DOM Document, we're already past that hurdle.

So what we'd need to do to define the transformation is come up with a set of XPath patterns (particular markup constructs in XHTML) and the WikiMarkup each one should generate. If we were to go to that trouble, the XSLT solution would be almost a byproduct of that work.

Ichiro
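P.S. To make the "set of XPaths" idea concrete, here is a minimal sketch of what the stylesheet could grow into. Only the h2 mapping is taken from the existing translator; the h3/h4, bold, and italic mappings are my assumptions about the corresponding JSPWiki markup and would need to be checked against the Java code:

    <xsl:stylesheet version="1.0"
                    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

      <!-- WikiMarkup is plain text, not XML -->
      <xsl:output method="text"/>

      <!-- mapping taken from XHtmlElementToWikiTranslator.java -->
      <xsl:template match="h2">
        <xsl:text>&#10;!!! </xsl:text>
        <xsl:apply-templates/>
        <xsl:text>&#10;</xsl:text>
      </xsl:template>

      <!-- assumed mappings from here on -->
      <xsl:template match="h3">
        <xsl:text>&#10;!! </xsl:text>
        <xsl:apply-templates/>
        <xsl:text>&#10;</xsl:text>
      </xsl:template>

      <xsl:template match="h4">
        <xsl:text>&#10;! </xsl:text>
        <xsl:apply-templates/>
        <xsl:text>&#10;</xsl:text>
      </xsl:template>

      <xsl:template match="b|strong">
        <xsl:text>__</xsl:text>
        <xsl:apply-templates/>
        <xsl:text>__</xsl:text>
      </xsl:template>

      <xsl:template match="i|em">
        <xsl:text>''</xsl:text>
        <xsl:apply-templates/>
        <xsl:text>''</xsl:text>
      </xsl:template>

    </xsl:stylesheet>

One wrinkle: if the Document was parsed namespace-aware, the elements live in the XHTML namespace, so the match patterns would need a prefix (e.g. xhtml:h2, with xmlns:xhtml="http://www.w3.org/1999/xhtml" declared on the stylesheet). The built-in XSLT templates already copy text nodes through to the output, which is exactly what we want for plain-text WikiMarkup.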
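And feeding the existing DOM Document through that stylesheet is only a few lines of JAXP. A sketch, assuming we have an org.w3c.dom.Document in hand (if the translator actually passes around a JDOM Document we'd need a JDOMSource instead) and that the stylesheet arrives as a String:

    import java.io.StringReader;
    import java.io.StringWriter;

    import javax.xml.transform.Transformer;
    import javax.xml.transform.TransformerException;
    import javax.xml.transform.TransformerFactory;
    import javax.xml.transform.dom.DOMSource;
    import javax.xml.transform.stream.StreamResult;
    import javax.xml.transform.stream.StreamSource;

    import org.w3c.dom.Document;

    public class XhtmlToWikiMarkup
    {
        /**
         *  Applies the XHTML-to-WikiMarkup stylesheet to an
         *  already-parsed Document and returns the WikiMarkup.
         */
        public static String translate( Document xhtmlDoc, String xslt )
            throws TransformerException
        {
            Transformer t = TransformerFactory.newInstance()
                .newTransformer( new StreamSource( new StringReader( xslt ) ) );

            StringWriter out = new StringWriter();
            t.transform( new DOMSource( xhtmlDoc ), new StreamResult( out ) );
            return out.toString();
        }
    }

From there, Christophe's test cases fall out naturally: pair small XHTML fragments with the WikiMarkup we expect them to produce, and run them against the stylesheet and the current Java translator side by side.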