DO NOT REPLY TO THIS MESSAGE. INSTEAD, POST ANY RESPONSES TO THE LINK BELOW.

[STR New]

Link: http://www.fltk.org/str.php?L2158
Version: 1.3-current

I started to look at the wrap_mode(1,0) problem, and I think it will entail a lot more changes along the lines of what I did before. I see from the FLTK2 code that they work through the buffer one byte at a time, handle that byte, and then adjust the offset if that byte is the start of a UTF-8 sequence. On the one hand that's cleaner, but on the other it still doesn't address the column width problem, because that requires knowing which UCS character we are dealing with.

Therefore I have two options. Either I provide a whole series of additional routines that take a char* rather than a char, as I did above, and let each level extract the full UTF-8 byte sequence as needed, or I check for the UTF-8 sequence at the top level and then pass the UCS value rather than a char to a series of new routines. I still haven't made up my mind on this one.

On a related note, let's go back to wcwidth() and mk_wcwidth(). The Linux man page for wcwidth() says that it returns 0 for U+0000, the number of columns needed for printable wide characters, and -1 for non-printable characters. The behaviour also depends on LC_CTYPE.

Markus Kuhn's implementation returns 0 for U+0000, only the standard(?) control characters and DEL return -1, and all other characters return 0, 1, or 2. There is no reference to locale specifics like LC_CTYPE in the code.

If I build Markus Kuhn's implementation into the FLTK-1.3.x xutf8 code and run the following:

    #include <stdio.h>      /* printf() */
    #include <wchar.h>      /* wcwidth() */
    #include <FL/Fl.H>
    #include <FL/fl_utf8.h> /* assumed to declare mk_wcwidth() in my local build */

    int main(int argc, char *argv[])
    {
      /* Compare the two implementations for U+0000 up to U+FFFE */
      for (wchar_t ucs = 0; ucs < 0xFFFF; ucs++) {
        int w1 = wcwidth(ucs);
        int w2 = mk_wcwidth(ucs);
        if (w1 != w2)
          printf("U+%04x: wcwidth()=%2d, mk_wcwidth()=%2d\n",
                 (unsigned)ucs, w1, w2);
      }
      return 0;
    }

I can see that there are a lot of characters that return -1 for the standard wcwidth(), but return 0, 1, or 2 for mk_wcwidth().

I don't know whether the -1 results from the system wcwidth() are due to the limited number of locales that I have installed on this box or not.

Although I have the feeling that providing a platform and locale(?) neutral implementation has its advantages, I wonder whether it might cause problems where wcwidth() is already being used elsewhere in the system. For example, a system editor and the FLTK app could show two different views of the same file. Should we be worried?

And finally, as far as I can see, there's no reason why we can't just add Markus Kuhn's wcwidth.c to the xutf8 directory, provided we keep the copyright notice etc. I don't think there's even a need to edit the file. I therefore have it all ready to be committed in the next few days.

Any comments?

_______________________________________________
fltk-bugs mailing list
http://lists.easysw.com/mailman/listinfo/fltk-bugs
