Bugs item #1077487, was opened at 2004-12-02 12:04
Message generated for change (Comment added) made by maartenc
You can respond by visiting:
https://sourceforge.net/tracker/?func=detail&atid=116035&aid=1077487&group_id=16035
Category: None
Group: None
>Status: Closed
>Resolution: Invalid
Priority: 5
Submitted By: Nobody/Anonymous (nobody)
>Assigned to: Maarten Coene (maartenc)
Summary: DocumentHelper.parseText error with non ASCII characters
Initial Comment:
DocumentHelper.parseText cannot correctly parse a string
that contains non-ASCII characters.
----------------------------------------------------------------------
>Comment By: Maarten Coene (maartenc)
Date: 2004-12-14 21:39
Message:
Logged In: YES
user_id=178745
According to the XML spec, the allowed characters are:
#x9 | #xA | #xD | [#x20-#xD7FF] | [#xE000-#xFFFD] | [#x10000-#x10FFFF]
As you can see, � isn't an allowed character, so it's
perfectly OK for an XML parser to reject it.
The same thing happens if you use Xerces instead of
Crimson. You'll get this error:
org.dom4j.DocumentException: Error on line 1 of document : Character reference "�" is an invalid XML character. Nested exception: Character reference "�" is an invalid XML character.
    at org.dom4j.io.SAXReader.read(SAXReader.java:433)
    at ...
regards,
Maarten
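
The character ranges quoted above can be turned into a small check. This is only a sketch and not part of dom4j: the class name XmlCharCheck and both method names are made up for illustration. It tests literal code points against the allowed ranges and drops any that fall outside them. It does not look at character references written out in the markup, so those would still be rejected by the parser.

```java
public class XmlCharCheck {

    /** Returns true if the code point falls inside the ranges quoted from the XML spec. */
    public static boolean isXmlChar(int cp) {
        return cp == 0x9 || cp == 0xA || cp == 0xD
            || (cp >= 0x20 && cp <= 0xD7FF)
            || (cp >= 0xE000 && cp <= 0xFFFD)
            || (cp >= 0x10000 && cp <= 0x10FFFF);
    }

    /** Returns a copy of the input with every disallowed code point dropped. */
    public static String stripInvalidXmlChars(String input) {
        StringBuilder out = new StringBuilder(input.length());
        for (int i = 0; i < input.length(); ) {
            // codePointAt combines valid surrogate pairs into one code point;
            // an unpaired surrogate comes back as itself and is rejected below.
            int cp = input.codePointAt(i);
            if (isXmlChar(cp)) {
                out.appendCodePoint(cp);
            }
            i += Character.charCount(cp);
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // U+0001 is a control character outside the allowed ranges.
        String raw = "<testTAG>abc\u0001</testTAG>";
        System.out.println(isXmlChar(0x1));
        System.out.println(stripInvalidXmlChars(raw));
    }
}
```

Whether silently dropping such characters is acceptable depends on the application. Rejecting the input, as the parser does, is the stricter choice.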
----------------------------------------------------------------------
Comment By: tommess (tangzg)
Date: 2004-12-13 02:32
Message:
Logged In: YES
user_id=1176527
non ASCII characters : �
----------------------------------------------------------------------
Comment By: tommess (tangzg)
Date: 2004-12-12 13:57
Message:
Logged In: YES
user_id=1176527
Example:
-----------------------------------------------
Document doc = DocumentHelper.parseText("<testTAG>庙前街�</testTAG>");
Error output:
----------------------------------------------
org.dom4j.DocumentException: Error on line 1 of document : Illegal XML character �; Nested exception: Illegal XML character �;
    at org.dom4j.io.SAXReader.read(SAXReader.java:355)
    at org.dom4j.io.SAXReader.read(SAXReader.java:271)
    at org.dom4j.DocumentHelper.parseText(DocumentHelper.java:215)
    at org.dom4j.test.test.main(test.java:71)
Nested exception:
org.xml.sax.SAXParseException: Illegal XML character �;
    at org.apache.crimson.parser.Parser2.fatal(Parser2.java:3182)
    ...
----------------------------------------------------------------------
Comment By: Nobody/Anonymous (nobody)
Date: 2004-12-07 12:14
Message:
Logged In: NO
Could you give an example illustrating the problem?
Maarten
----------------------------------------------------------------------
_______________________________________________
dom4j-dev mailing list
[EMAIL PROTECTED]
https://lists.sourceforge.net/lists/listinfo/dom4j-dev