This patch to libgo fixes the handling of surrogate code points when
converting a []rune value to a string. A value in the surrogate range
(U+D800 through U+DFFF) should become the replacement character (U+FFFD)
in the string, as otherwise the string will not be valid UTF-8. I have
a patch ready for the master Go testsuite, to be submitted after the 1.2
release goes out.

Bootstrapped and ran Go testsuite on x86_64-unknown-linux-gnu.
Committed to mainline and 4.8 branch.

Ian
diff -r fff3cb764986 libgo/runtime/go-int-array-to-string.c
--- a/libgo/runtime/go-int-array-to-string.c Tue Nov 26 10:07:46 2013 -0800
+++ b/libgo/runtime/go-int-array-to-string.c Tue Nov 26 15:13:08 2013 -0800
@@ -30,6 +30,8 @@
       if (v < 0 || v > 0x10ffff)
         v = 0xfffd;
+      else if (0xd800 <= v && v <= 0xdfff)
+        v = 0xfffd;
       if (v <= 0x7f)
         slen += 1;
@@ -56,6 +58,8 @@
          character. */
       if (v < 0 || v > 0x10ffff)
         v = 0xfffd;
+      else if (0xd800 <= v && v <= 0xdfff)
+        v = 0xfffd;
       if (v <= 0x7f)
         *s++ = v;
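
For readers who want to try the check outside libgo, here is a minimal
standalone sketch of the same idea. The helper names sanitize_rune and
encode_rune are hypothetical and do not appear in libgo. The sketch
assumes the usual UTF-8 encoding rules rather than reproducing the rest
of go-int-array-to-string.c.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper, not part of libgo: map any value that is not a
   Unicode scalar value to the replacement character U+FFFD.  */
static int32_t
sanitize_rune (int32_t v)
{
  if (v < 0 || v > 0x10ffff)
    return 0xfffd;
  /* Surrogate code points are reserved for UTF-16 and have no valid
     UTF-8 encoding.  This is the case the patch adds.  */
  if (0xd800 <= v && v <= 0xdfff)
    return 0xfffd;
  return v;
}

/* Hypothetical helper: encode V as UTF-8 into BUF, which must have room
   for 4 bytes.  Returns the number of bytes written.  */
static size_t
encode_rune (int32_t v, unsigned char *buf)
{
  assert (buf != NULL);
  v = sanitize_rune (v);
  if (v <= 0x7f)
    {
      buf[0] = (unsigned char) v;
      return 1;
    }
  if (v <= 0x7ff)
    {
      buf[0] = (unsigned char) (0xc0 | (v >> 6));
      buf[1] = (unsigned char) (0x80 | (v & 0x3f));
      return 2;
    }
  if (v <= 0xffff)
    {
      buf[0] = (unsigned char) (0xe0 | (v >> 12));
      buf[1] = (unsigned char) (0x80 | ((v >> 6) & 0x3f));
      buf[2] = (unsigned char) (0x80 | (v & 0x3f));
      return 3;
    }
  buf[0] = (unsigned char) (0xf0 | (v >> 18));
  buf[1] = (unsigned char) (0x80 | ((v >> 12) & 0x3f));
  buf[2] = (unsigned char) (0x80 | ((v >> 6) & 0x3f));
  buf[3] = (unsigned char) (0x80 | (v & 0x3f));
  return 4;
}
```

With this sketch, a lone surrogate such as 0xd800 is encoded as the
three bytes EF BF BD (U+FFFD), while the neighboring values 0xd7ff and
0xe000 pass through unchanged.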