DO NOT REPLY TO THIS MESSAGE.  INSTEAD, POST ANY RESPONSES TO THE LINK BELOW.

[STR New]

Link: http://www.fltk.org/str.php?L2348
Version: 1.3-current


The problem, as discussed in the "Unicode character display page"
thread on fltk.development, is that Fl_Text_{Buffer,Display,Editor}
all use ad hoc testing of bytes against 0x80 and 0x40 masks to
determine whether to call fl_utf8len(char c) to give the number of
bytes in the current utf-8 character sequence. But fl_utf8len(char c)
does not have enough context to know whether bytes 0x80-0x9f are
utf-8 continuation bytes or single byte characters from CP1252 that
should be mapped. fl_utf8len() returns -1 for utf-8 continuation bytes,
which means that instead of stepping forward through the byte array,
Fl_Text_Display::expand_character() takes a step back and gets stuck
in an endless loop, processing the same characters over and over.
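
The failure mode described above can be reproduced outside FLTK with a
small sketch. Everything in it is an assumption made for illustration:
lead_byte_len() is a hypothetical stand-in that mimics the reported
behaviour of a per-byte length lookup (it returns -1 for bytes
0x80-0xbf), and naive_walk_finishes() is a hypothetical caller that adds
that result to its position. Neither is the actual FLTK source.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical stand-in for a per-byte length lookup. It sees a single
// byte only, so for 0x80-0xbf it cannot tell a continuation byte from a
// CP1252 character and reports -1.
int lead_byte_len(unsigned char c) {
  if (c < 0x80) return 1;             // ASCII
  if ((c & 0xC0) == 0x80) return -1;  // 0x80-0xbf: no context to decide
  if ((c & 0xE0) == 0xC0) return 2;   // 110xxxxx
  if ((c & 0xF0) == 0xE0) return 3;   // 1110xxxx
  if ((c & 0xF8) == 0xF0) return 4;   // 11110xxx
  return 1;                           // 0xf8-0xff: not a valid lead byte
}

// Hypothetical naive walk: adds the reported length to the position.
// Returns true if the end of the string was reached, false if the walk
// was still going after max_steps (the cap exists only so that this
// demonstration terminates).
bool naive_walk_finishes(const std::string& s, std::size_t max_steps = 1000) {
  const long n = static_cast<long>(s.size());
  long pos = 0;
  for (std::size_t step = 0; step < max_steps; ++step) {
    if (pos >= n) return true;
    unsigned char c = static_cast<unsigned char>(s[static_cast<std::size_t>(pos)]);
    pos += lead_byte_len(c);  // a -1 here moves the position backwards
    if (pos < 0) pos = 0;     // keep the index in range for the demo
  }
  return false;
}
```

With this sketch, naive_walk_finishes("abc") reaches the end, while a
string containing a lone 0x80 byte between two ASCII characters keeps
bouncing between the same two positions until the step cap is hit.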

The solution appears to be to switch to fl_utf8fwd() and
fl_utf8back() for stepping forward and backward through the byte array.
These consider adjacent bytes to provide the full context, and call
fl_utf8decode() to determine the number of bytes in a character.
fl_utf8decode() also knows how to map CP1252 0x80-0x9f characters to
equivalent UCS values and hence utf-8 sequences. See utf8_fwbk_test.cxx.

fl_utf8len() should not be used for stepping through byte arrays.
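
For comparison, here is a minimal sketch of context-aware forward
stepping. It is not the implementation of fl_utf8fwd() or
fl_utf8decode(); char_len_at() and count_chars() are hypothetical names,
and the rule of treating any byte that does not start a complete
sequence as a one-byte character is an assumption chosen to show the
idea. The sketch does not reject overlong encodings or surrogates and
does not perform the CP1252 mapping itself.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical helper: length in bytes of the character starting at
// s[pos], judged from the lead byte AND the bytes that follow it.
// It always returns at least 1, so a caller that adds the result to
// pos is guaranteed to move forward.
std::size_t char_len_at(const std::string& s, std::size_t pos) {
  assert(pos < s.size());
  const unsigned char c = static_cast<unsigned char>(s[pos]);
  std::size_t want;
  if ((c & 0xE0) == 0xC0) want = 2;
  else if ((c & 0xF0) == 0xE0) want = 3;
  else if ((c & 0xF8) == 0xF0) want = 4;
  else return 1;  // ASCII, a stray 0x80-0xbf byte, or an invalid lead byte

  if (pos + want > s.size()) return 1;  // sequence truncated by end of data
  for (std::size_t i = 1; i < want; ++i) {
    const unsigned char next = static_cast<unsigned char>(s[pos + i]);
    if ((next & 0xC0) != 0x80) return 1;  // not a continuation byte
  }
  return want;
}

// Hypothetical caller: counts characters by stepping forward only.
std::size_t count_chars(const std::string& s) {
  std::size_t pos = 0;
  std::size_t count = 0;
  while (pos < s.size()) {
    pos += char_len_at(s, pos);
    ++count;
  }
  return count;
}
```

Because the step is never zero or negative, the loop in count_chars()
terminates for any input, including strings that contain lone bytes in
the 0x80-0x9f range.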


