This sounds like a job for JTidy. You can use JTidy to parse malformed
XML or HTML and it will do its best to tidy it up. There's a JTidy
example in dom4j/src/samples/JTidyDemo.java.
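
As a side note on why the markup gets escaped in the first place: xsl:value-of
produces a plain text node, so the serializer escapes any angle brackets in it,
whether or not the source text sat in a CDATA section. Below is a minimal sketch
of that behaviour using only the JDK's built-in JAXP classes (not dom4j or
JTidy). The class name EscapingDemo, the sample XML and the inline stylesheet
are made up for illustration. The sketch also tries XSLT 1.0's optional
disable-output-escaping attribute; a processor is allowed to ignore it, and it
typically only takes effect when the transformer serializes the output itself
(a StreamResult) rather than building a tree such as a DocumentResult, so treat
that part as an assumption to check against your own setup.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

public class EscapingDemo {
    // Hypothetical sample input: user content wrapped in a CDATA section.
    static final String XML =
        "<story><title><![CDATA[Montgomery <i>Partners</i> Survey]]></title></story>";

    // Builds a tiny stylesheet; optionally adds disable-output-escaping="yes".
    static String stylesheet(boolean disableEscaping) {
        String doe = disableEscaping ? " disable-output-escaping=\"yes\"" : "";
        return "<xsl:stylesheet version=\"1.0\""
             + " xmlns:xsl=\"http://www.w3.org/1999/XSL/Transform\">"
             + "<xsl:output method=\"xml\" omit-xml-declaration=\"yes\"/>"
             + "<xsl:template match=\"/\">"
             + "<p><xsl:value-of select=\"story/title\"" + doe + "/></p>"
             + "</xsl:template></xsl:stylesheet>";
    }

    // Runs the transformation and lets the transformer serialize to a string.
    static String transform(boolean disableEscaping) throws Exception {
        Transformer t = TransformerFactory.newInstance().newTransformer(
            new StreamSource(new StringReader(stylesheet(disableEscaping))));
        StringWriter out = new StringWriter();
        t.transform(new StreamSource(new StringReader(XML)), new StreamResult(out));
        return out.toString();
    }

    public static void main(String[] args) throws Exception {
        // Default: the text node is escaped on output.
        System.out.println(transform(false));
        // With disable-output-escaping, if the processor honours it,
        // the markup is written through unescaped.
        System.out.println(transform(true));
    }
}
```

Writing unescaped user markup straight into the output only gives well-formed
results if the content itself is well-formed, which is where tidying it first
comes in.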

James
----- Original Message -----
From: "Marc Elliott" <[EMAIL PROTECTED]>
To: <[EMAIL PROTECTED]>
Sent: Sunday, March 17, 2002 12:33 AM
Subject: [dom4j-user] Angle Brackets,Transformations and CDATA


I'm struggling a bit with something I thought I was clear on.

Basically, I have content that I can't be sure will be well-formed XML
because it's being created by retail-level users.  So I've wrapped it up in
a CDATA section.  Then, in the XSL file I'm transforming the XML with, I
select the questionable element with a value-of tag.  Yet I still end up
with angle brackets being translated to their escaped equivalents:

Montgomery &lt;i&gt;Partners&lt;/i&gt; Survey on Alternative

I don't really intend to use this within a JSP (not primarily, anyway), and
in any case, the xtags taglib appears to do the same thing, but I've been
testing it in this JSP code:

<%! String content = ""; %>
<%
String xslURL = "http://intranet.dev.hnw.com:8080/xsl/wealthnews_email_html.xsl";
String xmlURL = "http://intranet.dev.hnw.com:8080/xml/wealthnews.xml";

    // read the XML document

    URL url = new URL(xmlURL);
    SAXReader reader = new SAXReader();
    Document document = reader.read(url);

    // transform document

    TransformerFactory factory = TransformerFactory.newInstance();
    Transformer transformer = factory.newTransformer( new StreamSource(xslURL) );

    // style document using transformer
    DocumentSource source = new DocumentSource( document );
    DocumentResult result = new DocumentResult();

    transformer.setOutputProperty("indent","no");
    transformer.setOutputProperty("omit-xml-declaration","yes");

    transformer.transform( source, result );

    Document transformedDoc = result.getDocument();

    // document transformation to string

    StringWriter buffer = new StringWriter();
    // createCompactFormat() is a static factory; use the format it returns
    OutputFormat outFormat = OutputFormat.createCompactFormat();
    XMLWriter writer = new XMLWriter(buffer, outFormat);
    writer.write(transformedDoc);
    writer.close();  // flush the writer before reading the buffer
    content = buffer.toString();

%>

<%= content %>

Any suggestions?  I've tried it using the HTMLWriter as well -- which
interacts better with JSP output but still messes with the markup.


..................................................

Marc Elliott
Director of Information Architecture / HNW Inc.
Digital Solutions for High-Net-Worth Marketers
ph: 617-243-9199 x224
fx: 815-327-4167



_______________________________________________
dom4j-user mailing list
[EMAIL PROTECTED]
https://lists.sourceforge.net/lists/listinfo/dom4j-user

