I'm using JTidy to convert a string containing some HTML to XHTML in a DOM 
tree.  I can't get foreign characters like éèà converted to their XHTML 
counterparts.  Which setting do I need to use?

Here's a code snip from my XSP page:

       String strContent = request.getParameter("content");
       ByteArrayInputStream in =
           new ByteArrayInputStream( strContent.getBytes() );
       String strOut = "";
       org.w3c.dom.Document doc = null;
       org.w3c.tidy.Configuration conf = new org.w3c.tidy.Configuration();
       try {
         Tidy tidy = new Tidy();

         //create output as XML
         tidy.setXmlOut(true);

         //output should be XHTML conforming
         tidy.setXHTML(true);

         tidy.setBreakBeforeBR(false);
         tidy.setRawOut(false);
         tidy.setCharEncoding( conf.UTF8 );

         //output 'non-breaking space' as an entity
         tidy.setQuoteNbsp(true);

         //output naked ampersand as &amp;
         tidy.setQuoteAmpersand(true);

         //leave whitespace in attribute values untouched
         tidy.setLiteralAttribs(true);

         //parse the stream to a DOM document
         doc =  tidy.parseDOM(in, null);
       } catch (Exception e) {
         //report the failure instead of swallowing it silently
         e.printStackTrace();
       }
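
One possible cause worth checking, offered as an assumption rather than a confirmed diagnosis: strContent.getBytes() encodes with the platform default charset, while the parser above is told to read UTF-8, so accented characters may be misread before any JTidy setting applies. The sketch below uses only the standard library (no JTidy calls) to show the byte-level mismatch and one way to write such characters as numeric character references. The class name EncodingCheck and the helpers utf8Stream and toNumericEntities are hypothetical names made up for illustration.

```java
import java.io.ByteArrayInputStream;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;

// Hypothetical helper class, not part of JTidy or Cocoon.
public class EncodingCheck {

    // Encode the text explicitly as UTF-8 so the bytes match a parser
    // that has been configured to read UTF-8 input.
    static InputStream utf8Stream(String text) {
        return new ByteArrayInputStream(text.getBytes(StandardCharsets.UTF_8));
    }

    // Replace every code point above 127 with a numeric character
    // reference such as &#233; and copy ASCII characters unchanged.
    static String toNumericEntities(String text) {
        StringBuilder out = new StringBuilder();
        for (int i = 0; i < text.length(); ) {
            int cp = text.codePointAt(i);
            if (cp > 127) {
                out.append("&#").append(cp).append(';');
            } else {
                out.appendCodePoint(cp);
            }
            i += Character.charCount(cp);
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // The three accented characters, written as escapes so the
        // source file's own encoding does not matter.
        String sample = "\u00e9\u00e8\u00e0";

        // The same text has different byte lengths in different charsets.
        System.out.println(sample.getBytes(StandardCharsets.ISO_8859_1).length); // prints 3
        System.out.println(sample.getBytes(StandardCharsets.UTF_8).length);      // prints 6

        System.out.println(toNumericEntities(sample)); // prints &#233;&#232;&#224;
    }
}
```

If this assumption holds, passing utf8Stream(strContent) to parseDOM in place of the stream built from getBytes() would keep the input bytes consistent with the UTF-8 setting. Whether the request parameter itself was decoded correctly is a separate question that this sketch does not cover.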

Bert


*Friends Are Angels Who Lift Us To Our Feet When Our Wings Have Trouble 
Remembering How To Fly*

