I'm using JTidy to convert a string containing HTML into XHTML in a DOM tree. I can't get accented characters such as éèà converted to their XHTML counterparts. Which setting do I need to use?
Here's a code snippet from my XSP page:

    String strContent = request.getParameter("content");
    ByteArrayInputStream in = new ByteArrayInputStream(strContent.getBytes());
    String strOut = "";
    org.w3c.dom.Document doc = null;
    org.w3c.tidy.Configuration conf = new org.w3c.tidy.Configuration();
    try {
        Tidy tidy = new Tidy();
        // create output as XML
        tidy.setXmlOut(true);
        // output should be XHTML conforming
        tidy.setXHTML(true);
        tidy.setBreakBeforeBR(false);
        tidy.setRawOut(false);
        tidy.setCharEncoding(conf.UTF8);
        // output non-breaking spaces as entities
        tidy.setQuoteNbsp(true);
        // output naked ampersand as &amp;
        tidy.setQuoteAmpersand(true);
        // leave whitespace in attribute values unchanged
        tidy.setLiteralAttribs(true);
        // parse the stream to a DOM document
        doc = tidy.parseDOM(in, null);
    } catch (Exception e) {
        // any exception is silently ignored here
    }

Bert

*Friends Are Angels Who Lift Us To Our Feet When Our Wings Have Trouble Remembering How To Fly*
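
One detail in the snippet above may be relevant: getBytes() with no argument encodes the string in the platform default charset, while the Tidy instance is told to read UTF-8. Whether that mismatch is the cause here is not confirmed. The following is only a minimal sketch, using the standard library alone, of how the two encodings differ for accented characters. The class name EncodingSketch and its helper methods are hypothetical, and StandardCharsets assumes Java 7 or later (on older JVMs, getBytes("UTF-8") is the equivalent call).

```java
import java.io.ByteArrayInputStream;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;

public class EncodingSketch {

    // Hypothetical helper: encode the text as UTF-8 regardless of the
    // platform default charset.
    static byte[] utf8Bytes(String text) {
        return text.getBytes(StandardCharsets.UTF_8);
    }

    // Hypothetical helper: wrap the UTF-8 bytes in a stream, which is the
    // kind of input a parser configured for UTF-8 would expect.
    static InputStream utf8Stream(String text) {
        return new ByteArrayInputStream(utf8Bytes(text));
    }

    public static void main(String[] args) {
        // The accented characters are written as Unicode escapes so the
        // source file encoding does not matter.
        String sample = "<p>\u00e9\u00e8\u00e0</p>";

        byte[] utf8 = utf8Bytes(sample);
        byte[] latin1 = sample.getBytes(StandardCharsets.ISO_8859_1);

        // Each accented character takes two bytes in UTF-8 but one byte in
        // ISO-8859-1, so the two byte arrays differ in length and content.
        System.out.println("UTF-8 length:      " + utf8.length);
        System.out.println("ISO-8859-1 length: " + latin1.length);
    }
}
```

If the mismatch does turn out to matter, passing a stream like the one from utf8Stream(strContent) to parseDOM would at least make the input bytes agree with the UTF-8 setting.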