[
https://issues.apache.org/jira/browse/XERCESJ-1257?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12697205#action_12697205
]
Michael Glavassevich commented on XERCESJ-1257:
-----------------------------------------------
> Though, I would think testing on Wikipedia's XML exports would make a good
> standard nightly or pre-release test case for Xerces, for correctness and
> performance.
Perhaps, but no so great for stepping through with a debugger or viewing in a
hex editor to examine the bytes.
> buffer overflow in UTF8Reader for characters out of BMP
> -------------------------------------------------------
>
> Key: XERCESJ-1257
> URL: https://issues.apache.org/jira/browse/XERCESJ-1257
> Project: Xerces2-J
> Issue Type: Bug
> Components: JAXP (javax.xml.parsers)
> Affects Versions: 2.9.0
> Environment: Any
> Reporter: Robert Stojnic
> Assignee: Michael Glavassevich
> Priority: Minor
> Attachments: TestXerces.java, UTF8Reader.patch
>
>
> There is a ArrayOutOfBoundsException in org.apache.xerces.impl.io.UTF8Reader,
> in read(char[],int,int) for 4-byte utf-8 chars.
> Imagine a following scenario. read() has a buffer of size N, and it reads N-1
> ascii chars, and stores it in the output buffer. Let the Nth char be the
> first byte of a 4 byte utf-8 char. The other 3 bytes are fetched by invoking
> read() on the input stream. From these a surrogate pair of java chars is
> made, however, method does not check if both chars can fit into the output
> buffer ... In most cases, they would fit into the ouput buffer (e.g. if there
> are some other multi-byte chars in the fetched text), so the bug is very
> rare, but it still happens.
--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]