http://bugzilla.novell.com/show_bug.cgi?id=551615
User [email protected] added comment
http://bugzilla.novell.com/show_bug.cgi?id=551615#c6

--- Comment #6 from Greg Smolyn <[email protected]> 2009-11-02 15:05:15 MST ---

Ok, I have found where the bug is.

Decoder.Convert() uses an interesting mechanism for determining how many characters it has decoded. The current method goes something like this:

- look at byteArray, startingIndex, and the count of bytes to scan
- call GetCharCount() for that entire block of bytes
- if there are more chars in that block than the number of chars we actually want to convert, right-shift the number of bytes to scan by 1 (that is, halve it)
- repeat until chars-for-our-current-blocksize <= chars-we-want
- with the new parameters, actually do a GetChars(), since we now have the right byte block size

This fails under the following scenario:

- chars-we-want is 1
- the byte array contains [ single-byte char, double-byte char, ... ] (there might be an extra stipulation about odd numbers?)

What happens? For example, say you have 1 ASCII char followed by 2 double-byte chars. You get a startingIndex of 0 and a count of 5 bytes to scan. That's 3 chars, but we only want 1, so we bit-shift and our new count of bytes to scan is 2. We repeat, and GetCharCount() says there is only 1 character in the first 2 bytes. That is <= the number of chars we want, so we convert and exit.

However, we report the number of bytes used as 2, since we think there was 1 char made up of 2 bytes. It wasn't a double-byte char, though: we counted the first byte of the following double-byte char as part of the ASCII char.

I'm not really sure what a good fix would be for this. Ultimately it looks to me like Convert() really should just try to convert one character at a time, instead of doing the strange GetCharCount() loop and using a log(n) algorithm to determine how many characters there are.

As it stands, there is a workaround: feed the decoder only 1 byte at a time, which will probably be more performant anyway when trying to get 1 character at a time out of the decoder.
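To make the failure concrete, here is a minimal Python sketch of the halving strategy described above, next to the byte-at-a-time workaround. It is a simulation, not Mono's actual code: the function names are hypothetical, UTF-8 is assumed as the encoding (the report does not name one), and Python's incremental decoder stands in for the stateful .NET Decoder.

```python
import codecs


def get_char_count(data: bytes) -> int:
    """Count complete characters in data (stand-in for GetCharCount()).

    An incomplete trailing multi-byte sequence is buffered by the
    incremental decoder and therefore not counted.
    """
    decoder = codecs.getincrementaldecoder("utf-8")()
    return len(decoder.decode(data, final=False))


def convert_by_halving(data: bytes, chars_wanted: int):
    """Simulate the halving strategy from the report (hypothetical sketch)."""
    count = len(data)
    # Halve the byte count until the block holds no more chars than wanted.
    while count > 0 and get_char_count(data[:count]) > chars_wanted:
        count >>= 1
    decoder = codecs.getincrementaldecoder("utf-8")()
    chars = decoder.decode(data[:count], final=False)
    # The whole block size is reported as consumed, even if its last
    # byte is only the start of the next character.
    return chars, count


def convert_bytewise(data: bytes, chars_wanted: int):
    """Workaround from the report: feed the decoder one byte at a time."""
    decoder = codecs.getincrementaldecoder("utf-8")()
    chars = ""
    bytes_used = 0
    for i in range(len(data)):
        if len(chars) >= chars_wanted:
            break
        out = decoder.decode(data[i:i + 1], final=False)
        if out:
            # A character completed at byte i, so i + 1 bytes are consumed.
            chars += out
            bytes_used = i + 1
    return chars, bytes_used


# 1 ASCII char followed by 2 double-byte chars: 5 bytes, 3 chars.
data = "a\u00e9\u00e9".encode("utf-8")

print(convert_by_halving(data, 1))  # ('a', 2): one byte too many reported
print(convert_bytewise(data, 1))    # ('a', 1)
```

With the halving result, a caller that resumes at offset 2 starts in the middle of the first double-byte char, which is the corruption described in the comment. Resuming at offset 1, as the byte-at-a-time version reports, decodes the remaining two chars cleanly.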
I'm happy to attempt a patch. However, if I could get some input as to what the preferred course of action would be, or if someone wants to discuss the design of this with me so we can come up with the right fix, I'd be very grateful.
