[ https://issues.apache.org/jira/browse/DIRMINA-1181?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17885037#comment-17885037 ]
Pete Disdale commented on DIRMINA-1181:
---------------------------------------

Hi Emmanuel,

Thanks for your prompt reply. I agree with you that the existing implementation is overcomplicated, but I suspect the alternative you suggest above might be too simple, in that it returns the *entire* IoBuffer as a String, including any trailing nulls (which the original code does not).

The output using this code is

!image-2024-09-26-14-43-42-488.png!

which I believe is

{{buf = ABC[nul][nul][nul][nul][nul]}}
{{buf = MĀORI[nul][nul][nul]}}

The original implementation appears to be looking for a NUL terminator (C-style, no idea why), but replacing the "new String(...)" with a getChar() loop seems to give the desired output:

!image-2024-09-26-14-53-24-875.png!

The code I used to produce this was something like:

{code:java}
// Read 16-bit chars until a NUL char is found (no check for remaining data).
StringBuilder sb = new StringBuilder();
char ch;
while ((ch = buf.getChar()) != (char) 0) {
    sb.append(ch);
}
return sb.toString();
{code}

which produces the expected output but obviously has no array bounds checking, so it is far from a complete solution (and has not been checked against any other Charset encodings). It also makes no reference to a CharsetDecoder; there is no method getChar(CharsetDecoder) as there is with getString.
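For illustration only, a bounds-checked variant of the getChar() loop could look like the sketch below. It uses a plain java.nio.ByteBuffer as a stand-in for IoBuffer so that it is self-contained; the class and method names (BoundedGetCharLoop, readUntilNul) are hypothetical and not part of MINA. Like the loop above, it assumes big-endian UTF-16 data and still bypasses any CharsetDecoder.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

public class BoundedGetCharLoop {

    // Reads 16-bit chars until a NUL char is found or fewer than two bytes
    // remain, so getChar() cannot run past the end of the buffer.
    public static String readUntilNul(ByteBuffer buf) {
        StringBuilder sb = new StringBuilder();
        while (buf.remaining() >= Character.BYTES) {
            char ch = buf.getChar();
            if (ch == '\0') {
                break; // terminator consumed, stop here
            }
            sb.append(ch);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // 16-byte zero-filled buffer: 10 bytes of text followed by NUL padding.
        ByteBuffer buf = ByteBuffer.allocate(16); // big-endian by default
        buf.put("M\u0100ORI".getBytes(StandardCharsets.UTF_16BE));
        buf.rewind(); // position back to 0, limit stays at 16
        System.out.println("buf = " + readUntilNul(buf));
    }
}
```

A buffer that holds no terminator at all is simply read to its end rather than raising a BufferUnderflowException.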
Thanks again,
Pete

> Exception thrown when attempting to decode certain UTF16 chars
> --------------------------------------------------------------
>
>                 Key: DIRMINA-1181
>                 URL: https://issues.apache.org/jira/browse/DIRMINA-1181
>             Project: MINA
>          Issue Type: Bug
>          Components: Core
>    Affects Versions: 2.1.6
>         Environment: Linux, Windows, Java 8, Java 17
>            Reporter: Pete Disdale
>            Priority: Major
>         Attachments: MacronTest-1.java, MacronTest.java
>
>
> When trying to decode a UTF16BE input stream containing characters of the
> form \uxx00, for example \u0100 (capital A with macron), the method
> *AbstractIoBuffer.getString(CharsetDecoder)* incorrectly interprets the
> second byte as a null terminator (causing a
> java.nio.charset.MalformedInputException to be thrown) despite this null byte
> being mid-character (at an odd index). The attached file, MacronTest,
> demonstrates the issue and, when run, produces the following output:
>
> buf = ABC
> Exception in thread "main" java.nio.charset.MalformedInputException: Input length = 1
>     at java.base/java.nio.charset.CoderResult.throwException(CoderResult.java:274)
>     at org.apache.mina.core.buffer.AbstractIoBuffer.getString(AbstractIoBuffer.java:1669)
>     at MacronTest.<init>(MacronTest.java:61)
>     at MacronTest.main(MacronTest.java:13)
>
> It looks like this issue is also in the 2.2.X branch (3.X/trunk not checked).
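The failure described in the issue can be reproduced outside MINA with plain java.nio. The sketch below is not MINA's code: it assumes, for illustration, that the terminator search accepts two consecutive zero bytes at any offset, and compares that with a search restricted to even (character-aligned) offsets. The class and method names are hypothetical.

```java
import java.nio.ByteBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.CharsetDecoder;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

public class AlignedNulScan {

    // Index of the first 0x00 0x00 pair at ANY offset, or data.length if none.
    public static int firstZeroPairAnyOffset(byte[] data) {
        for (int i = 0; i + 1 < data.length; i++) {
            if (data[i] == 0 && data[i + 1] == 0) {
                return i;
            }
        }
        return data.length;
    }

    // Index of the first 0x00 0x00 pair at an EVEN offset, or the largest
    // even length if none. Only such a pair is a real UTF-16 NUL character.
    public static int firstZeroPairAligned(byte[] data) {
        int end = data.length & ~1;
        for (int i = 0; i < end; i += 2) {
            if (data[i] == 0 && data[i + 1] == 0) {
                return i;
            }
        }
        return end;
    }

    // Decodes the first 'length' bytes as UTF-16BE; a new decoder reports
    // malformed input by throwing, which is the default action.
    public static String decode(byte[] data, int length) throws CharacterCodingException {
        CharsetDecoder decoder = StandardCharsets.UTF_16BE.newDecoder();
        return decoder.decode(ByteBuffer.wrap(data, 0, length)).toString();
    }

    public static void main(String[] args) throws CharacterCodingException {
        // "MĀORI" is 00 4D 01 00 00 4F 00 52 00 49, padded with zeros to 16 bytes.
        byte[] data = Arrays.copyOf("M\u0100ORI".getBytes(StandardCharsets.UTF_16BE), 16);

        int unaligned = firstZeroPairAnyOffset(data); // 3: low byte of Ā + high byte of O
        int aligned = firstZeroPairAligned(data);     // 10: first real NUL char

        System.out.println("aligned end = " + aligned + ", text = " + decode(data, aligned));
        try {
            decode(data, unaligned);
        } catch (CharacterCodingException e) {
            // Odd length leaves one dangling byte, reported as malformed input.
            System.out.println("unaligned end = " + unaligned + ", error = " + e);
        }
    }
}
```

Cutting the input at the odd index 3 leaves a single dangling byte, which the decoder reports as malformed input of length 1, consistent with the exception in the report. Restricting the search to even offsets avoids splitting a character whose low byte is zero.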