DO NOT REPLY TO THIS MESSAGE. INSTEAD, POST ANY RESPONSES TO THE LINK BELOW.
[STR New]

Link: http://www.fltk.org/str.php?L2348
Version: 1.3-current

I can see four cases:

  1: correct UTF-8 source text
  2: ASCII with some encoding for characters 128 and above
  3: some multibyte encoding
  4: defective UTF-8 encoded text

Loading any of these must be robust; FLTK must not crash. The UTF-8
functions are not very robust at all, so we must make sure that all text
is always legal UTF-8. We can fix that relatively easily for 8-bit ASCII
(as seen in the recent patch).

What should happen if the user tries to load text for any of cases 2-4?

  a: don't load the text at all and output a message

  b: load the text, but warn that it contains unknown characters

  c: load the text (some or many characters may look wrong), and only
     warn if the user modifies or reads the text (save, DnD, etc.)

  d: convert illegal UTF-8 sequences into the UTF-8 "illegal character"
     code, followed by an educated guess. This text may look odd, but it
     could be decoded and saved again without changes to the original
     text. It could even be edited, but non-ASCII text would generate
     wrong codes. This is similar to writing \t for tab, \000 for NUL,
     etc., only this is UTF-8.

  e: add a Fl_Text_Converter class hierarchy that offers conversion as
     in d, but can also be overridden to offer any other character
     encoding:

     Fl_Text_Converter
       -> Fl_Text_Converter_16bit
            -> Fl_Text_Converter_UTF16
       -> Fl_Text_Converter_8bit
            -> Fl_Text_Converter_CP1512
            -> Fl_Text_Converter_MacRoman
       ...

I would suggest d for 1.3.0 and e for the next version.

BTW, whichever we do, we probably ought to apply it to Fl_Input as well.

_______________________________________________
fltk-bugs mailing list
http://lists.easysw.com/mailman/listinfo/fltk-bugs
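For illustration, option d could be sketched roughly as follows. This is only one possible reading of the proposal, not existing FLTK code: the function names `utf8_sequence_length` and `sanitize_utf8` are hypothetical, the "illegal character" code is assumed to be U+FFFD, and the "educated guess" is assumed to be the offending byte read as a Latin-1 character. Every well-formed UTF-8 sequence is copied unchanged; every other byte is replaced by U+FFFD followed by that guess.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper: returns the length (1-4) of a well-formed UTF-8
// sequence starting at s[i], or 0 if the bytes there are not well-formed
// (stray continuation byte, overlong form, surrogate, truncated sequence,
// or a value above U+10FFFF).
static std::size_t utf8_sequence_length(const std::string &s, std::size_t i) {
  const unsigned char c = static_cast<unsigned char>(s[i]);
  if (c < 0x80) return 1;                      // plain ASCII

  std::size_t len = 0;
  unsigned char min_second = 0x80, max_second = 0xBF;
  if (c >= 0xC2 && c <= 0xDF) {
    len = 2;
  } else if (c >= 0xE0 && c <= 0xEF) {
    len = 3;
    if (c == 0xE0) min_second = 0xA0;          // reject overlong forms
    if (c == 0xED) max_second = 0x9F;          // reject UTF-16 surrogates
  } else if (c >= 0xF0 && c <= 0xF4) {
    len = 4;
    if (c == 0xF0) min_second = 0x90;          // reject overlong forms
    if (c == 0xF4) max_second = 0x8F;          // reject > U+10FFFF
  } else {
    return 0;                                  // 0x80-0xC1, 0xF5-0xFF
  }

  if (i + len > s.size()) return 0;            // truncated at end of text
  for (std::size_t k = 1; k < len; ++k) {
    const unsigned char b = static_cast<unsigned char>(s[i + k]);
    const unsigned char lo = (k == 1) ? min_second : 0x80;
    const unsigned char hi = (k == 1) ? max_second : 0xBF;
    if (b < lo || b > hi) return 0;
  }
  return len;
}

// Hypothetical sanitizer for option d: the result is always legal UTF-8.
std::string sanitize_utf8(const std::string &in) {
  std::string out;
  std::size_t i = 0;
  while (i < in.size()) {
    const std::size_t len = utf8_sequence_length(in, i);
    if (len > 0) {
      out.append(in, i, len);                  // legal sequence: copy as-is
      i += len;
    } else {
      // Illegal byte; it is always >= 0x80 because ASCII is accepted above.
      const unsigned char b = static_cast<unsigned char>(in[i]);
      out += "\xEF\xBF\xBD";                   // U+FFFD marker
      // Assumed "educated guess": the byte as Latin-1 (U+0080..U+00FF),
      // which always encodes as two UTF-8 bytes.
      out += static_cast<char>(0xC0 | (b >> 6));
      out += static_cast<char>(0x80 | (b & 0x3F));
      ++i;
    }
  }
  return out;
}
```

Because each marker is followed by a code point that carries the original byte value, a matching decoder could in principle restore the original bytes on save. A genuine U+FFFD already present in the source would need its own escaping for that round trip to be unambiguous; the sketch does not handle that.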
