[ 
https://issues.apache.org/jira/browse/VYSPER-338?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhang JinYan updated VYSPER-338:
--------------------------------

    Attachment: XMLParser.patch

Patch for XMLParser to resolve xml unescape bug
                
> XMLParser.unescape throw exception
> ----------------------------------
>
>                 Key: VYSPER-338
>                 URL: https://issues.apache.org/jira/browse/VYSPER-338
>             Project: VYSPER
>          Issue Type: Bug
>          Components: core protocol
>    Affects Versions: 0.7
>         Environment: java1.6, windows7
>            Reporter: Zhang JinYan
>         Attachments: XMLParser.patch
>
>
> If a message stanza contains the text "辽宁" (escaped before 
> sending as "&#x8FBD;&#x5B81;"),
> the following exception is thrown:
> Caused by: org.xml.sax.SAXParseException: For input string: "8FBD;&#x5B81"
>       at org.apache.vysper.xml.sax.impl.XMLParser.fatalError(XMLParser.java:499)
>       at org.apache.vysper.xml.sax.impl.XMLParser.parse(XMLParser.java:124)
>       at org.apache.vysper.xml.sax.impl.DefaultNonBlockingXMLReader.parse(DefaultNonBlockingXMLReader.java:185)
>       at org.apache.vysper.xml.decoder.XMPPDecoder.doDecode(XMPPDecoder.java:117)
> The cause is in String org.apache.vysper.xml.sax.impl.XMLParser.unescape(String s):
>     private String unescape(String s) {
>         s = s.replace("&amp;", "&").replace("&gt;", ">").replace("&lt;", "<")
>                 .replace("&apos;", "'").replace("&quot;", "\"");
>         StringBuffer sb = new StringBuffer();
>         Matcher matcher = UNESCAPE_UNICODE_PATTERN.matcher(s);
>         int end = 0;
>         while (matcher.find()) {
>             boolean isHex = matcher.group(1).equals("x");
>             String unicodeCode = matcher.group(2);
>             int base = isHex ? 16 : 10;
>             int i = Integer.valueOf(unicodeCode, base).intValue();
>             char[] c = Character.toChars(i);
>             sb.append(s.substring(end, matcher.start()));
>             end = matcher.end();
>             sb.append(c);
>         }
>         sb.append(s.substring(end, s.length()));
>         return sb.toString();
>     }
> Replacing the XML predefined entities before unescaping the numeric character 
> references changes the context of the escaped strings.
> For example:
> Input:  "&amp;#x8FBD;&amp;#x5B81;"
> After replace: "&#x8FBD;&#x5B81;"
> unescape uses the regex: Pattern.compile("\\&\\#(x?)(.+);");
> Because (.+) is greedy, it matches up to the last ';':
> group(1) = x
> group(2) = 8FBD;&#x5B81
> Integer.valueOf(unicodeCode, base) then throws a NumberFormatException.
> I fixed this bug; see the attached patch.
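
The failure described above can be reproduced outside Vysper, and one possible way to avoid it is sketched below. This is not the content of XMLParser.patch, which is not reproduced in this message. The class name UnescapeSketch, the REFERENCE pattern, and the helper methods are hypothetical names chosen for illustration. The sketch assumes that decoding predefined entities and numeric character references in a single pass is acceptable, so that the "&" produced from "&amp;" is never re-read as the start of a numeric reference.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class UnescapeSketch {
    // The pattern quoted in the report: the greedy ".+" runs to the LAST ';'.
    static final Pattern GREEDY = Pattern.compile("\\&\\#(x?)(.+);");

    // Hypothetical replacement pattern (an assumption, not the actual patch):
    // group(1) = hex digits, group(2) = decimal digits, group(3) = entity name.
    static final Pattern REFERENCE =
            Pattern.compile("&(?:#x([0-9A-Fa-f]+)|#([0-9]+)|(amp|lt|gt|apos|quot));");

    /** Returns group(2) of the first match of the reported pattern, or null. */
    static String greedyGroup(String s) {
        Matcher m = GREEDY.matcher(s);
        return m.find() ? m.group(2) : null;
    }

    /** Decodes numeric references and predefined entities in one pass. */
    static String unescape(String s) {
        StringBuilder sb = new StringBuilder();
        Matcher m = REFERENCE.matcher(s);
        int end = 0;
        while (m.find()) {
            sb.append(s, end, m.start());
            if (m.group(1) != null) {
                // Out-of-range values still throw here; a real parser would
                // have to report them as errors in its own way.
                sb.appendCodePoint(Integer.parseInt(m.group(1), 16));
            } else if (m.group(2) != null) {
                sb.appendCodePoint(Integer.parseInt(m.group(2), 10));
            } else {
                sb.append(predefined(m.group(3)));
            }
            end = m.end();
        }
        sb.append(s, end, s.length());
        return sb.toString();
    }

    // Maps a predefined entity name to its character.
    private static char predefined(String name) {
        if (name.equals("amp")) return '&';
        if (name.equals("lt")) return '<';
        if (name.equals("gt")) return '>';
        if (name.equals("apos")) return '\'';
        return '"'; // "quot" is the only remaining alternative
    }

    public static void main(String[] args) {
        System.out.println(greedyGroup("&#x8FBD;&#x5B81;"));
        System.out.println(unescape("&amp;#x8FBD;&amp;#x5B81;"));
    }
}
```

With the reported pattern, greedyGroup returns the whole run between the first "&#x" and the last ";", which is the string that Integer.valueOf rejects. In the single-pass version each "&" in the input is consumed at most once, so "&amp;#x8FBD;" decodes to the literal text "&#x8FBD;" rather than to a character.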

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
