On Sun, 3 Sep 2000, Markus Kuhn wrote:
> I hope you'll find it useful for ironing out the last hidden bugs in
> your UTF-8 decoders. Feel free to adapt the file for and include it into
> your UTF-8 regression test suites.
Here's a patch to xterm that fixes the last issues it has with this file:
surrogates (U+D800..U+DFFF), U+FFFE and U+FFFF are now collapsed to UCS_REPL.
xterm uses FFFF as a placeholder for the right-hand side of a
double-width character, so this needed fixing anyway: sending an FFFF
to xterm was confusing it.
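For anyone who wants the same check in another decoder, here is a minimal
standalone sketch of the test the patch adds. The function name sanitize_ucs
is hypothetical, and the value 0xfffd for UCS_REPL is an assumption made for
this illustration (the patch only uses the macro by name). Unlike the patch,
which tests screen->utf_char in place, this version takes an already decoded
code point.

```c
/* Assumption: the replacement character is U+FFFD; the patch itself
 * only refers to the macro UCS_REPL by name. */
#define UCS_REPL 0xfffdu

/* Hypothetical helper: map code points that should never reach the
 * screen (UTF-16 surrogates, U+FFFE, U+FFFF) to the replacement
 * character, and pass everything else through unchanged. */
unsigned sanitize_ucs(unsigned ucs)
{
    if ((ucs >= 0xd800 && ucs <= 0xdfff) ||
        ucs == 0xfffe || ucs == 0xffff)
        return UCS_REPL;
    return ucs;
}
```

With this, sanitize_ucs(0xd800) and sanitize_ucs(0xffff) both return
UCS_REPL, while neighbouring values such as 0xd7ff and 0xe000 are returned
unchanged.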
--
Robert
--- xterm-144/ptydata.c Wed Jun 14 20:50:37 2000
+++ xterm-144-fixed/ptydata.c Mon Sep 4 11:21:26 2000
@@ -134,6 +134,10 @@
screen->utf_char <<= 6;
screen->utf_char |= (c & 0x3f);
}
+ if ((screen->utf_char >= 0xd800 && screen->utf_char <= 0xdfff) ||
+ (screen->utf_char == 0xfffe) || (screen->utf_char == 0xffff)) {
+ screen->utf_char = UCS_REPL;
+ }
screen->utf_count--;
if (screen->utf_count == 0)
data->buf2[j++] = screen->utf_char;