Hi Kevin,
This is something the SAX parser does, and I don't think you can disable this behaviour (although I saw tricks for this on the Xerces user mailing list a while ago, so you could try searching their archives).
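To illustrate, here is a minimal sketch using only the JDK's built-in SAX parser (not dom4j itself). It shows that the reference has already been turned into a single character by the time the application receives the text, and that one possible workaround is to turn non-ASCII characters back into numeric references on the way out. The class name CharRefDemo and the helpers parseText and toNumericRefs are made up for this example, and I am assuming the reference in question is the decimal form &#8221;.

```java
import java.io.StringReader;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

public class CharRefDemo {

    // Hypothetical helper: parses the XML and collects all character data
    // exactly as the SAX parser reports it to the application.
    static String parseText(String xml) throws Exception {
        StringBuilder text = new StringBuilder();
        SAXParserFactory.newInstance().newSAXParser().parse(
                new InputSource(new StringReader(xml)),
                new DefaultHandler() {
                    @Override
                    public void characters(char[] ch, int start, int length) {
                        text.append(ch, start, length);
                    }
                });
        return text.toString();
    }

    // Hypothetical helper: rewrites every code point above 127 as a decimal
    // numeric character reference. It does not escape '&' or '<', so it is
    // only a sketch of the idea, not a complete serializer.
    static String toNumericRefs(String s) {
        StringBuilder out = new StringBuilder();
        s.codePoints().forEach(cp -> {
            if (cp > 127) {
                out.append("&#").append(cp).append(';');
            } else {
                out.appendCodePoint(cp);
            }
        });
        return out.toString();
    }

    public static void main(String[] args) throws Exception {
        String parsed = parseText("<p>He said &#8221;</p>");
        // The parser has already replaced the reference with U+201D.
        System.out.println((int) parsed.charAt(parsed.length() - 1)); // prints 8221
        // Re-encode non-ASCII characters as references when writing out.
        System.out.println(toNumericRefs(parsed)); // prints He said &#8221;
    }
}
```

Re-encoding on output does not recover whether the source used a reference or the literal character, because the parser does not pass that distinction on. It only guarantees that the output contains references for everything outside ASCII.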
regards, Maarten
Kevin Varley wrote:
Hello all,
I'm having some trouble with a numeric character reference. I have some well-formed, UTF-8 encoded HTML that I am pulling from a database table and would like to parse and manipulate with dom4j. Some of the HTML contains numeric character references like ” to represent a right (closing) quotation mark. After creating a Document object with SAXReader, the references are converted to a single character. For example, ” is converted to a character that shows up as 1C when viewed in a hex editor.
So I guess I'd like to know whether there is a means of disabling the processing of numeric character references? I realize this may be a parser issue but was curious if anyone had run into a similar problem.
Thanks in advance for any help.
Kevin
------------------------------------------------------- SF.Net email is sponsored by Shop4tech.com-Lowest price on Blank Media 100pk Sonic DVD-R 4x for only $29 -100pk Sonic DVD+R for only $33 Save 50% off Retail on Ink & Toner - Free Shipping and Free Gift. http://www.shop4tech.com/z/Inkjet_Cartridges/9_108_r285 _______________________________________________ dom4j-user mailing list [EMAIL PROTECTED] https://lists.sourceforge.net/lists/listinfo/dom4j-user