Re: Updated UTF-8 decoder stress test file

Robert Brady Mon, 04 Sep 2000 03:16:26 -0700

On Sun, 3 Sep 2000, Markus Kuhn wrote:

> I hope you'll find it useful for ironing out the last hidden bugs in
> your UTF-8 decoders. Feel free to adapt the file for and include it into
> your UTF-8 regression test suites.

Here's a patch to xterm that fixes the last issues it has with this
file (surrogates, FFFE and FFFF are now collapsed to UCS_REPL).

This needed fixing anyway: xterm uses FFFF internally as a placeholder
for the right-hand side of a double-width character, so sending an FFFF
to xterm was confusing it.

-- 
Robert

--- xterm-144/ptydata.c Wed Jun 14 20:50:37 2000
+++ xterm-144-fixed/ptydata.c   Mon Sep  4 11:21:26 2000
@@ -134,6 +134,10 @@
                              screen->utf_char <<= 6;
                              screen->utf_char |= (c & 0x3f);
                            }
+                           if ((screen->utf_char >= 0xd800 && screen->utf_char <= 0xdfff) ||
+                               (screen->utf_char == 0xfffe) || (screen->utf_char == 0xffff)) {
+                             screen->utf_char = UCS_REPL;
+                           }
                            screen->utf_count--;
                            if (screen->utf_count == 0)
                                data->buf2[j++] = screen->utf_char;
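
For readers who want to try the same check outside xterm, here is a
minimal sketch of it as a standalone function applied to an already
decoded code point. The name sanitize_ucs is hypothetical and not part
of xterm, and the value given for UCS_REPL (U+FFFD) is an assumption;
the patch itself does not show its definition.

```c
/* Assumption: UCS_REPL is U+FFFD REPLACEMENT CHARACTER.
 * The patch above uses the macro but does not show its value. */
#define UCS_REPL 0xfffdUL

/* Hypothetical helper, not part of xterm.
 * Maps code points that should never reach the screen buffer
 * (UTF-16 surrogates D800..DFFF and the noncharacters FFFE, FFFF)
 * to the replacement character; everything else passes through. */
static unsigned long sanitize_ucs(unsigned long ch)
{
    if ((ch >= 0xd800UL && ch <= 0xdfffUL) ||
        ch == 0xfffeUL || ch == 0xffffUL) {
        return UCS_REPL;
    }
    return ch;
}
```

Calling sanitize_ucs(0xffff) returns UCS_REPL, so under this sketch an
incoming FFFF can no longer be mistaken for the double-width
placeholder, while ordinary characters such as 0x41 come back
unchanged.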
