Andrew Stevens wrote:
> Tobia Conforto writes:
> > I cannot change this data source component, therefore I need a
> > transformer to examine every text node in the stream, split it at the
> > fake "<br>" tags, substitute them with <xhtml:br/> elements, and
> > replace every escaped HTML entity with the relevant Unicode character.
>
> We have something similar in our application; I arrange the early part
> of the pipeline so that the escaped HTML appears within a unique
> element e.g.
>
>   <some_escaped_html>Lorem ipsum <br> dolor</some_escaped_html>
>
> pass it through the html transformer
>
>   <map:transform type="html">
>     <map:parameter name="tags" value="some_escaped_html"/>
>   </map:transform>
>
> and follow that by a small xsl transformation to strip out the
> some_escaped_html elements and the html & body elements that JTidy
> inserts.
>
> Net result, the same SAX stream but with the HTML unescaped and
> cleaned up so it's well-formed again.

Thank you. After extensive testing, it turns out this is the best method.
It works for any kind of malformed HTML and is efficient enough, provided
I put <some_escaped_html> tags only where they are needed.

Tobia
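
For the archives, the "small xsl transformation" Andrew mentions could look roughly like the XSLT 1.0 sketch below. This is not the stylesheet either poster actually used. It assumes the html transformer leaves html and body as direct descendants of some_escaped_html, matches them by local name because it is not certain here whether JTidy's output carries the XHTML namespace, and also drops a head element in case JTidy adds one (that last part is a guess, not something stated in the thread).

```xml
<?xml version="1.0"?>
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- Identity template: by default, copy every node and attribute unchanged. -->
  <xsl:template match="@*|node()">
    <xsl:copy>
      <xsl:apply-templates select="@*|node()"/>
    </xsl:copy>
  </xsl:template>

  <!-- Unwrap the marker element and the html/body wrappers inside it:
       the element itself is dropped, its children are processed in place.
       Assumption: html is a direct child of the marker, body of html. -->
  <xsl:template match="some_escaped_html
                       | some_escaped_html/*[local-name()='html']
                       | some_escaped_html/*[local-name()='html']/*[local-name()='body']">
    <xsl:apply-templates select="node()"/>
  </xsl:template>

  <!-- Assumption: if JTidy also emits a head (e.g. with an empty title),
       discard it entirely so none of its text leaks into the stream. -->
  <xsl:template match="some_escaped_html/*[local-name()='html']/*[local-name()='head']"/>

</xsl:stylesheet>
```

In the sitemap this would sit directly after the html transformer step as an ordinary XSLT transform, e.g. <map:transform src="strip-escaped-html.xsl"/>, where the file name is only a placeholder.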
