[
https://issues.apache.org/jira/browse/IO-780?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17711546#comment-17711546
]
Marcono1234 commented on IO-780:
--------------------------------
This is not an "underflow", at least not a {{CoderResult.UNDERFLOW}}. In the
example snippet from the issue description, the {{StringReader}} is exhausted
right after the unpaired high surrogate, so {{ReaderInputStream}} calls
{{encode}} with {{endOfInput=true}}. That call returns a malformed-input
{{CoderResult}}, which is erroneously ignored.
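
The difference between the two results can be checked against the JDK encoder
alone, without Commons IO. The following is only a minimal sketch, assuming the
default UTF-8 encoder of a current OpenJDK; {{LoneSurrogateDemo}} and its
{{encode}} helper are names made up for this illustration.

```java
import java.nio.ByteBuffer;
import java.nio.CharBuffer;
import java.nio.charset.CoderResult;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class, not part of Commons IO.
class LoneSurrogateDemo {

    // Encodes s with a fresh UTF-8 encoder (default error action: REPORT).
    static CoderResult encode(String s, boolean endOfInput) {
        CharBuffer in = CharBuffer.wrap(s);
        ByteBuffer out = ByteBuffer.allocate(16); // large enough, so OVERFLOW cannot occur here
        return StandardCharsets.UTF_8.newEncoder().encode(in, out, endOfInput);
    }

    public static void main(String[] args) {
        // More input may still follow, so the encoder waits for a low surrogate.
        System.out.println("endOfInput=false, underflow: " + encode("\uD800", false).isUnderflow());
        // No more input can follow, so the lone surrogate is reported as malformed.
        System.out.println("endOfInput=true, malformed: " + encode("\uD800", true).isMalformed());
    }
}
```

With {{endOfInput=false}} the caller is expected to supply more characters;
only with {{endOfInput=true}} does the result become an error that has to be
rethrown.
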
Also note that this is not limited to unpaired surrogates; a similar situation
can probably also occur when {{encode}} returns {{CoderResult.OVERFLOW}}, which
is likewise erroneously ignored for {{endOfInput=true}} (for a charset whose
encoder does not write anything on {{flush}}).
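
To illustrate the point, here is a rough sketch of an encode loop that inspects
every {{CoderResult}} before requesting the next one. It is not the
{{ReaderInputStream}} implementation and not a proposed patch;
{{CheckedEncodeDemo}}, {{encodeAll}} and the {{bufferSize}} parameter are
hypothetical, and UTF-8 is assumed.

```java
import java.io.ByteArrayOutputStream;
import java.nio.ByteBuffer;
import java.nio.CharBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.CharsetEncoder;
import java.nio.charset.CoderResult;
import java.nio.charset.StandardCharsets;

// Hypothetical helper, not the ReaderInputStream implementation.
class CheckedEncodeDemo {

    // Encodes all of s through a small buffer, checking each CoderResult.
    static byte[] encodeAll(String s, int bufferSize) throws CharacterCodingException {
        if (bufferSize < 4) {
            // A UTF-8 code point needs up to 4 bytes; a smaller buffer could
            // report OVERFLOW forever without making progress.
            throw new IllegalArgumentException("bufferSize must be at least 4");
        }
        CharsetEncoder encoder = StandardCharsets.UTF_8.newEncoder();
        CharBuffer in = CharBuffer.wrap(s);
        ByteBuffer out = ByteBuffer.allocate(bufferSize);
        ByteArrayOutputStream sink = new ByteArrayOutputStream();

        boolean flushing = false;
        while (true) {
            CoderResult result = flushing ? encoder.flush(out) : encoder.encode(in, out, true);
            if (result.isError()) {
                result.throwException(); // malformed or unmappable input
            }
            // Drain whatever was produced before asking for more.
            out.flip();
            sink.write(out.array(), out.arrayOffset() + out.position(), out.remaining());
            out.clear();
            if (result.isUnderflow()) {
                if (flushing) {
                    return sink.toByteArray(); // encode and flush both completed
                }
                flushing = true; // all input consumed, move on to flush
            }
            // On OVERFLOW the same step is repeated with the emptied buffer.
        }
    }
}
```

For example, {{encodeAll("hello world", 4)}} goes through several
{{OVERFLOW}} results even though {{endOfInput=true}} is passed every time,
while {{encodeAll("\uD800", 4)}} throws a {{CharacterCodingException}}.
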
> ReaderInputStream discards some encoding errors
> -----------------------------------------------
>
> Key: IO-780
> URL: https://issues.apache.org/jira/browse/IO-780
> Project: Commons IO
> Issue Type: Bug
> Components: Streams/Writers
> Affects Versions: 2.11.0
> Reporter: Marcono1234
> Priority: Major
>
> h3. Description
> {{org.apache.commons.io.input.ReaderInputStream}} discards encoder errors in
> some cases instead of properly rethrowing them.
> The underlying issue is that {{lastCoderResult}} is re-assigned before it has
> been checked for errors and overflow ([link to
> code|https://github.com/apache/commons-io/blob/b9e4f5e6e718ec8e4156e31bef733874700d7cbf/src/main/java/org/apache/commons/io/input/ReaderInputStream.java#L267]).
> This was originally mentioned in pull request
> [#293|https://github.com/apache/commons-io/pull/293].
> h3. Example
> The {{read()}} call in the following example should throw an exception, but
> currently it erroneously returns -1.
> {code}
> import java.io.StringReader;
> import java.nio.charset.CharsetEncoder;
> import java.nio.charset.StandardCharsets;
> import org.apache.commons.io.input.ReaderInputStream;
>
> // Encoder which throws on malformed or unmappable input
> CharsetEncoder encoder = StandardCharsets.UTF_8.newEncoder();
> ReaderInputStream in = new ReaderInputStream(new StringReader("\uD800"), encoder);
> // BUG: This should have thrown an exception because the input is malformed
> System.out.println("Read: " + in.read());
> {code}