[ http://issues.apache.org/jira/browse/XERCESC-1226?page=all ]
     
David Bertoni closed XERCESC-1226:
----------------------------------


> Parser reports bogus content when parsing
> -----------------------------------------
>
>          Key: XERCESC-1226
>          URL: http://issues.apache.org/jira/browse/XERCESC-1226
>      Project: Xerces-C++
>         Type: Bug
>   Components: SAX/SAX2
>     Versions: Nightly build (please specify the date)
>  Environment: All platforms
>     Reporter: David Bertoni
>  Attachments: diff.txt, test1.xml
>
> When parsing the following document, the parser reports garbage characters
> in the character data.
> <?xml version="1.0"?>
> <subject>Research [&#x1D538;]rticle</subject>
> I traced the problem to this function in XMLReader, starting on line 612:
> inline bool XMLReader::isPlainContentChar(const XMLCh toCheck)
> {
>     return ((fgCharCharsTable[toCheck] & gPlainContentCharMask) != 0);
> }
> Apparently, for the character "]" (U+005D RIGHT SQUARE BRACKET), the flags in
> fgCharCharsTable indicate that it is not plain content.  As a result, the
> parser delivers broken character data, including unpaired low surrogates.
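
For reference, a minimal sketch of what "unpaired low surrogates" means for
the test document. U+1D538 lies outside the Basic Multilingual Plane, so in
UTF-16 it has to be delivered as a high surrogate followed by a low surrogate;
a low surrogate arriving on its own is broken data. The helper names here
(toSurrogatePair, surrogatesArePaired) are illustrative only and are not part
of Xerces-C++.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <utility>

// Split a supplementary code point (U+10000..U+10FFFF) into a UTF-16
// surrogate pair, returned as {high, low}.
inline std::pair<std::uint16_t, std::uint16_t> toSurrogatePair(std::uint32_t cp)
{
    assert(cp >= 0x10000u && cp <= 0x10FFFFu);
    const std::uint32_t v = cp - 0x10000u;
    return { static_cast<std::uint16_t>(0xD800u + (v >> 10)),
             static_cast<std::uint16_t>(0xDC00u + (v & 0x3FFu)) };
}

inline bool isHighSurrogate(std::uint16_t u) { return u >= 0xD800u && u <= 0xDBFFu; }
inline bool isLowSurrogate(std::uint16_t u)  { return u >= 0xDC00u && u <= 0xDFFFu; }

// Return true if every surrogate in the buffer is part of a well-formed
// high/low pair. A lone low surrogate (the symptom in the report) fails.
inline bool surrogatesArePaired(const std::uint16_t* s, std::size_t n)
{
    for (std::size_t i = 0; i < n; ++i) {
        if (isHighSurrogate(s[i])) {
            if (i + 1 >= n || !isLowSurrogate(s[i + 1])) return false;
            ++i;  // skip the low half that was just validated
        } else if (isLowSurrogate(s[i])) {
            return false;
        }
    }
    return true;
}
```

A check like surrogatesArePaired could be run over the buffers a SAX
characters() callback receives to spot this kind of corruption in a test.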
> When I used the debugger to make this function return "true" rather than
> "false", the parser delivered the correct character data.
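
The lookup described in the report follows a common pattern: one flag byte
per 16-bit code unit, tested against a mask. The sketch below is only an
illustration of that pattern under stated assumptions. CharFlags,
kPlainContentMask, plainRunLength and the table contents are hypothetical and
do not reproduce the actual Xerces-C++ table; the constructor argument simply
lets both settings for "]" be compared.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

using Char16 = std::uint16_t;                     // stand-in for XMLCh
constexpr std::uint8_t kPlainContentMask = 0x01;  // hypothetical flag bit

class CharFlags {
public:
    // Simplified table: tab and every unit from 0x20 up are flagged as plain
    // content, except '<' and '&'. A real table would exclude more units.
    explicit CharFlags(bool closeBracketIsPlain) : flags_(0x10000, 0)
    {
        for (std::size_t c = 0x20; c < flags_.size(); ++c)
            flags_[c] = kPlainContentMask;
        flags_[0x09] = kPlainContentMask;              // tab
        flags_[0x3C] = 0;                              // '<'
        flags_[0x26] = 0;                              // '&'
        if (!closeBracketIsPlain) flags_[0x5D] = 0;    // ']'
    }

    // Same shape as the function quoted in the report: table lookup plus mask.
    bool isPlainContentChar(Char16 c) const
    {
        return (flags_[c] & kPlainContentMask) != 0;
    }

private:
    std::vector<std::uint8_t> flags_;
};

// Count the leading code units a fast path could copy without special
// handling; the first unit not flagged as plain content ends the run.
inline std::size_t plainRunLength(const CharFlags& flags,
                                  const Char16* s, std::size_t n)
{
    assert(s != nullptr || n == 0);
    std::size_t i = 0;
    while (i < n && flags.isPlainContentChar(s[i])) ++i;
    return i;
}
```

With "]" not flagged as plain, the run over the units of "R[", the surrogate
pair for U+1D538, and "]r" stops at the bracket, so whatever code handles the
non-plain case is what runs immediately after the surrogate pair.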


