[ 
https://issues.apache.org/jira/browse/DIRMINA-1181?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17885567#comment-17885567
 ] 

Thomas Wolf commented on DIRMINA-1181:
--------------------------------------

[~elecharny]: the limit is the index _beyond_ the last valid byte, so one 
should not read {{get(limit())}}. The code as checked in does. If 
{{limit() == capacity()}}, you'll even get an exception.
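
To illustrate the point about the limit, here is a minimal sketch that uses a plain {{java.nio.ByteBuffer}} rather than MINA's {{IoBuffer}} (an assumption, to keep it self-contained). The class {{LimitDemo}} and its helper {{throwsOnGet}} are hypothetical names, not part of MINA. It covers only the case where the limit equals the capacity:

```java
import java.nio.ByteBuffer;

class LimitDemo {
    /** Hypothetical helper: true if an absolute get(index) is rejected. */
    static boolean throwsOnGet(ByteBuffer buf, int index) {
        try {
            buf.get(index);
            return false;
        } catch (IndexOutOfBoundsException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        ByteBuffer buf = ByteBuffer.allocate(2);
        buf.put((byte) 0x01).put((byte) 0x00);
        buf.flip(); // position = 0, limit = 2, capacity = 2

        // The last valid byte sits at limit() - 1.
        System.out.println(throwsOnGet(buf, buf.limit() - 1));
        // limit() itself is one past the end of the readable data.
        System.out.println(throwsOnGet(buf, buf.limit()));
    }
}
```

The absolute {{get(int)}} of {{ByteBuffer}} is documented to reject any index that is not smaller than the limit, so valid indices run from 0 to {{limit() - 1}}.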

But as I've stated this several times now without having made myself clear, 
here's a simple JUnit test for the 2.0.X branch (put it in {{MacronTest}}; 
I don't see the fix in 2.2.X yet):
{code:java}
@Test
public void testNotZeroTerminatedUtf16String() throws CharacterCodingException {
    IoBuffer buf = IoBuffer.allocate(2);
    buf.put((byte) 0x01);
    buf.put((byte) 0x00);
    buf.position(0);
    String decoded = buf.getString(StandardCharsets.UTF_16BE.newDecoder());
    assertEquals("Ā", decoded);
}{code}
This will fail with
{code:java}
java.lang.IndexOutOfBoundsException
    at java.nio.Buffer.checkIndex(Buffer.java:545)
    at java.nio.HeapByteBuffer.get(HeapByteBuffer.java:142)
    at org.apache.mina.core.buffer.AbstractIoBuffer.get(AbstractIoBuffer.java:608)
    at org.apache.mina.core.buffer.AbstractIoBuffer.getString(AbstractIoBuffer.java:1607)
    at org.apache.mina.core.buffer.MacronTest.testNotZeroTerminatedUtf16String(MacronTest.java:88)
    ...{code}
With my proposal from above, the test succeeds.

> Exception thrown when attempting to decode certain UTF16 chars
> --------------------------------------------------------------
>
>                 Key: DIRMINA-1181
>                 URL: https://issues.apache.org/jira/browse/DIRMINA-1181
>             Project: MINA
>          Issue Type: Bug
>          Components: Core
>    Affects Versions: 2.1.6
>         Environment: Linux, Windows, Java 8, Java 17
>            Reporter: Pete Disdale
>            Priority: Major
>         Attachments: MacronTest-1.java, MacronTest.java
>
>
> When trying to decode a UTF16BE input stream containing characters of the 
> form \uxx00, for example \u0100 (capital A with macron), the method 
> *AbstractIoBuffer.getString(CharsetDecoder)* incorrectly interprets the 
> second byte as a null terminator (causing a 
> java.nio.charset.MalformedInputException to be thrown) despite this null byte 
> being mid-character (at an odd index). The attached file, MacronTest, 
> demonstrates the issue and when run produces the following output:
> buf = ABC
> Exception in thread "main" java.nio.charset.MalformedInputException: Input length = 1
>     at java.base/java.nio.charset.CoderResult.throwException(CoderResult.java:274)
>     at org.apache.mina.core.buffer.AbstractIoBuffer.getString(AbstractIoBuffer.java:1669)
>     at MacronTest.<init>(MacronTest.java:61)
>     at MacronTest.main(MacronTest.java:13)
> It looks like this issue is also in the 2.2.X branch (3.X/trunk not checked).
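
The description above turns on which 0x00 bytes may count as a terminator in a UTF-16 stream. The following is a rough sketch under stated assumptions: it uses a plain {{java.nio.ByteBuffer}} instead of {{IoBuffer}}, and the class {{Utf16TerminatorSketch}} with its methods {{findEnd}} and {{decode}} is hypothetical. It is neither MINA's implementation nor the fix proposed in this issue. It only shows one way to look for a terminator in whole two-byte code units, starting at the position and never reading at or beyond the limit:

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

class Utf16TerminatorSketch {
    /**
     * Hypothetical helper: index of the first aligned 0x0000 code unit between
     * position() and limit(), or the end of the last whole code unit if none.
     */
    static int findEnd(ByteBuffer buf) {
        int start = buf.position();
        // Only consider whole two-byte code units; never touch index limit().
        int end = start + ((buf.limit() - start) & ~1);
        for (int i = start; i < end; i += 2) {
            if (buf.get(i) == 0 && buf.get(i + 1) == 0) {
                return i; // both bytes of the code unit are zero
            }
        }
        return end;
    }

    /** Decodes the bytes up to the terminator (or the limit) as UTF-16BE. */
    static String decode(ByteBuffer buf) {
        int end = findEnd(buf);
        ByteBuffer view = buf.duplicate(); // leave the caller's buffer untouched
        view.limit(end);
        return StandardCharsets.UTF_16BE.decode(view).toString();
    }
}
```

With this approach the lone 0x00 that forms the low byte of \u0100 is not treated as a terminator, because a terminator must occupy both bytes of an aligned code unit.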


