[
https://issues.apache.org/jira/browse/DERBY-2694?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12500527
]
Rick Hillegas commented on DERBY-2694:
--------------------------------------
Hi, Anurag. I think that the patch does the right thing. However, it's a little
tricky to read. I think that the following approach is easier to understand.
What do you think? I'm not an expert on UTF-8 encoding, but the following web
page was useful to me: http://www.unix.org.ua/orelly/java/fclass/appb_01.htm

    private static final byte MULTI_BYTE_MASK = (byte) 0xC0;
    private static final byte CONTINUATION_BYTE = (byte) 0x80;

    if (writeLen != origLen) // if we're truncating the string
    {
        while ( isContinuationChar( byteval[ writeLen ] ) ) { writeLen--; }
        //
        // Now byteval[ writeLen ] is either a standalone 1-byte char
        // or the first byte of a multi-byte character. That means that
        // byteval[ writeLen -1 ] is the last (perhaps only) byte of the
        // previous character.
        //
    }

    private boolean isContinuationChar( byte b )
    {
        return ( (b & MULTI_BYTE_MASK) == CONTINUATION_BYTE );
    }

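
To make the boundary check concrete, here is a minimal standalone sketch of
the same idea. It is not taken from DDMWriter or from the attached patch: the
class name Utf8Truncate, the method truncateToCharBoundary, and the "len > 0"
guard against walking off the front of the array are all assumptions made for
illustration.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical helper class, for illustration only; not part of Derby.
final class Utf8Truncate {
    private static final int MULTI_BYTE_MASK = 0xC0;   // top two bits of a byte
    private static final int CONTINUATION_BYTE = 0x80; // UTF-8 continuation bytes look like 10xxxxxx

    static boolean isContinuationByte(byte b) {
        return (b & MULTI_BYTE_MASK) == CONTINUATION_BYTE;
    }

    /**
     * Returns the largest length <= maxLen such that writing that many bytes
     * of utf8 does not split a multi-byte character.
     */
    static int truncateToCharBoundary(byte[] utf8, int maxLen) {
        if (maxLen >= utf8.length) {
            return utf8.length; // nothing is being truncated
        }
        int len = maxLen;
        // utf8[len] is the first byte that would NOT be written. If it is a
        // continuation byte, the cut falls inside a character, so back up.
        // The len > 0 guard is an added assumption for malformed input.
        while (len > 0 && isContinuationByte(utf8[len])) {
            len--;
        }
        return len;
    }

    public static void main(String[] args) {
        // 'a', 'b', then the euro sign U+20AC, which is E2 82 AC in UTF-8.
        byte[] bytes = "ab\u20AC".getBytes(StandardCharsets.UTF_8);
        for (int max = 0; max <= bytes.length; max++) {
            System.out.println(max + " -> " + truncateToCharBoundary(bytes, max));
        }
    }
}
```

For the five bytes 61 62 E2 82 AC, a limit of 3 or 4 would cut inside the
3-byte character, so both back up to a length of 2. That includes the case
from the bug report where the cut falls just before the third byte.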
> org.apache.derby.impl.drda.DDMWriter uses wrong algorithm to avoid splitting
> varchar in the middle of a multibyte char.
> ----------------------------------------------------------------------------------------------------------------------
>
> Key: DERBY-2694
> URL: https://issues.apache.org/jira/browse/DERBY-2694
> Project: Derby
> Issue Type: Bug
> Components: Network Server
> Environment: all
> Reporter: Anurag Shekhar
> Assignee: Anurag Shekhar
> Fix For: 10.3.0.0
>
> Attachments: derby-2694-v2.diff, derby-2694.diff, TestProc.java,
> TestProc_TruncateRep.java
>
>
> org.apache.derby.impl.drda.DDMWriter uses wrong algorithm to avoid splitting
> varchar in the middle of a multibyte char.
> When DDMWriter finds that it has to split a varchar while sending it to the
> client, it checks whether the last byte is part of a multibyte char. If it is,
> it tries to find the last byte of the previous char and sends only up to that
> byte, leaving the rest for the next send.
> The code that does this has a bug: it fails when the last byte it is checking
> is the third byte of a 3-byte char.