Hello all, I'm having some trouble with a numeric character reference. I have some well-formed, UTF-8 encoded HTML that I am pulling from a database table and would like to parse and manipulate with dom4j. Some of the HTML contains numeric character references, such as the one for ”, which represents a right (closing) quotation mark. After I create a Document object with SAXReader, each reference has been converted to a single character. For example, the reference for ” is converted to a character that shows up as 1C when viewed in a hex editor.
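The behaviour described above can be reproduced without dom4j. Here is a minimal sketch, assuming the JDK's built-in DOM parser stands in for the SAX parser that SAXReader delegates to. The class name CharRefDemo, the helper names and the sample input &#8221; (U+201D) are hypothetical and chosen only for illustration. It shows that the parser hands back the resolved character, and that one possible workaround is to turn non-ASCII characters back into numeric references when the text is written out.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

// Hypothetical demo class; uses only JDK classes, not dom4j.
final class CharRefDemo {

    // Parse a small XML string and return the text content of its root element.
    static String parseText(String xml) {
        try {
            DocumentBuilder builder =
                    DocumentBuilderFactory.newInstance().newDocumentBuilder();
            Document doc = builder.parse(new InputSource(new StringReader(xml)));
            return doc.getDocumentElement().getTextContent();
        } catch (Exception e) {
            throw new IllegalStateException("could not parse sample XML", e);
        }
    }

    // Rewrite every non-ASCII code point as a decimal numeric character reference.
    // Note: this does not escape '&', '<' or '>'; it only handles non-ASCII text.
    static String escapeNonAscii(String text) {
        StringBuilder out = new StringBuilder();
        for (int i = 0; i < text.length(); ) {
            int cp = text.codePointAt(i);
            if (cp > 0x7F) {
                out.append("&#").append(cp).append(';');
            } else {
                out.appendCodePoint(cp);
            }
            i += Character.charCount(cp);
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // The parser resolves the reference before the application sees the text.
        String text = parseText("<p>He said &#8221;</p>");
        System.out.println(Integer.toHexString(text.codePointAt(text.length() - 1))); // prints 201d

        // Re-escaping on output restores a numeric reference.
        System.out.println(escapeNonAscii(text)); // prints He said &#8221;
    }
}
```

The sketch escapes on the way out instead of trying to stop the parser from resolving references, because an XML parser normally expands character references before the application sees the text.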
So I guess I'd like to know whether there is a means of disabling the processing of numeric character references? I realize this may be a parser issue but was curious if anyone had run into a similar problem.

Thanks in advance for any help.

Kevin

_______________________________________________
dom4j-user mailing list
https://lists.sourceforge.net/lists/listinfo/dom4j-user