[jira] [Commented] (DIRMINA-1181) Exception thrown when attempting to decode certain UTF16 chars

Pete Disdale (Jira) Thu, 26 Sep 2024 07:23:08 -0700


    [ 
https://issues.apache.org/jira/browse/DIRMINA-1181?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17885037#comment-17885037
 ]


Pete Disdale commented on DIRMINA-1181:
---------------------------------------

Hi Emmanuel,

Thanks for your prompt reply. I agree with you that the existing implementation 
is overcomplicated, but I suspect the alternative you suggest above might be too 
simple, in that it returns the *entire* IoBuffer as a String, including any 
trailing nulls (which the original code does not). The output using this code is:

!image-2024-09-26-14-43-42-488.png!

Which I believe is

{{buf = ABC[nul][nul][nul][nul][nul]}}
{{buf = MĀORI[nul][nul][nul]}}
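
The trailing-NUL effect can be reproduced with plain JDK classes. The following is a minimal sketch, not the MINA code: the class name TrailingNulDemo, the helper decodeWhole and the 16-byte buffer size are assumptions made for illustration (16 bytes is simply the size that matches the five and three trailing nulls shown above).

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

public class TrailingNulDemo {
    // Hypothetical helper: decodes every byte, zero padding included, as UTF-16BE.
    static String decodeWhole(byte[] bytes) {
        return new String(bytes, StandardCharsets.UTF_16BE);
    }

    public static void main(String[] args) {
        byte[] text = "ABC".getBytes(StandardCharsets.UTF_16BE);
        // Assumed buffer size: 6 bytes of text followed by 10 zero bytes.
        byte[] padded = Arrays.copyOf(text, 16);
        String s = decodeWhole(padded);
        System.out.println(s.length());      // 8: "ABC" plus five U+0000 chars
        System.out.println(s.indexOf('\0')); // 3: first NUL follows the text
    }
}
```

Each pair of zero bytes decodes to a real U+0000 character, so the padding becomes part of the returned String rather than being dropped.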

The original implementation appears to be looking for a NUL terminator 
(C-style, no idea why), but replacing the "new String(...)" with a getChar() 
loop seems to give the desired output:

!image-2024-09-26-14-53-24-875.png!

The code I used to produce this was something like:

 
{code:java}
        StringBuilder sb = new StringBuilder();
        char ch;
        while ((ch = buf.getChar()) != (char) 0) {
            sb.append(ch);
        }
        return sb.toString();
{code}
which produces the expected output but obviously has no array bounds checking, 
so it is far from a complete solution (and has not been checked for any other 
Charset encodings). It also makes no reference to a CharsetDecoder; there is no 
method getChar(CharsetDecoder) as there is with getString.
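
One possible way to address both gaps is to scan for the terminator in two-byte steps and hand the bytes before it to a CharsetDecoder. The sketch below is only an illustration under stated assumptions: it uses java.nio.ByteBuffer rather than MINA's IoBuffer, the names AlignedNulDecode and getUtf16String are hypothetical, and it only makes sense for charsets with 16-bit code units such as UTF-16BE. It is not the MINA implementation or a proposed patch.

```java
import java.nio.ByteBuffer;
import java.nio.CharBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.CharsetDecoder;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

public class AlignedNulDecode {
    // Hypothetical helper: decodes up to the first 16-bit NUL that starts at an
    // even offset from the current position, or up to the limit if none exists.
    static String getUtf16String(ByteBuffer buf, CharsetDecoder decoder)
            throws CharacterCodingException {
        int start = buf.position();
        int end = buf.limit();
        // Step two bytes at a time so a 0x00 inside a character (for example
        // the low byte of U+0100) is never mistaken for the terminator. The
        // loop condition keeps both absolute reads inside the limit.
        for (int i = start; i + 1 < buf.limit(); i += 2) {
            if (buf.get(i) == 0 && buf.get(i + 1) == 0) {
                end = i;
                break;
            }
        }
        ByteBuffer slice = buf.duplicate();
        slice.limit(end);
        CharBuffer chars = decoder.reset().decode(slice);
        // Skip the decoded bytes and the terminator, if one was found.
        buf.position(end < buf.limit() ? end + 2 : end);
        return chars.toString();
    }

    public static void main(String[] args) throws CharacterCodingException {
        byte[] text = "M\u0100ORI".getBytes(StandardCharsets.UTF_16BE);
        // Zero-padded copy standing in for a buffer with trailing nulls.
        ByteBuffer buf = ByteBuffer.wrap(Arrays.copyOf(text, text.length + 6));
        String s = getUtf16String(buf, StandardCharsets.UTF_16BE.newDecoder());
        System.out.println(s.length()); // 5
    }
}
```

A buffer that ends on an odd byte with no terminator is passed to the decoder as is, so the decoder reports it as malformed input rather than the helper guessing at the intent.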

Thanks again,

Pete

 

 

 

> Exception thrown when attempting to decode certain UTF16 chars
> --------------------------------------------------------------
>
>                 Key: DIRMINA-1181
>                 URL: https://issues.apache.org/jira/browse/DIRMINA-1181
>             Project: MINA
>          Issue Type: Bug
>          Components: Core
>    Affects Versions: 2.1.6
>         Environment: Linux, Windows, Java 8, Java 17
>            Reporter: Pete Disdale
>            Priority: Major
>         Attachments: MacronTest-1.java, MacronTest.java
>
>
> When trying to decode a UTF16BE input stream containing characters of the 
> form \uxx00, for example \u0100 (capital A with macron), the method 
> *AbstractIoBuffer.getString(CharsetDecoder)* incorrectly interprets the 
> second byte as a null terminator (causing a 
> java.nio.charset.MalformedInputException to be thrown) despite this null byte 
> being mid-character (at an odd index). The attached file, MacronTest, 
> demonstrates the issue and when run produces the following output:
> buf = ABC
> Exception in thread "main" java.nio.charset.MalformedInputException: Input 
> length = 1
>     at 
> java.base/java.nio.charset.CoderResult.throwException(CoderResult.java:274)
>     at 
> org.apache.mina.core.buffer.AbstractIoBuffer.getString(AbstractIoBuffer.java:1669)
>     at MacronTest.<init>(MacronTest.java:61)
>     at MacronTest.main(MacronTest.java:13)
> It looks like this issue is also in the 2.2.X branch (3.X/trunk not checked).



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]
