DO NOT REPLY TO THIS MESSAGE. INSTEAD, POST ANY RESPONSES TO THE LINK BELOW.
[STR New] Link: http://www.fltk.org/str.php?L2348 Version: 1.3-current

Without commenting on Manolo's proposal (it looks interesting)... I remembered that I had read something about encoding unknown characters in the private use area of Unicode (U+E000 - U+F8FF); see chapter 16.5 of the Unicode standard <http://www.unicode.org/versions/Unicode5.2.0/ch16.pdf>.

The idea: pick a contiguous range of 128 codepoints from the private use area (e.g. U+F880 - U+F8FF) and encode each illegal _byte_ (range 0x80 - 0xff) as the single Unicode codepoint U+F800 + <value of byte>. Example: the Euro sign (€, 0x80 in CP1252) would be encoded as U+F880.

We only deal with single illegal bytes. These will be converted to legal UTF-8 encodings in the private use area. When saving the file, this process can be inverted. Maybe we would have to take special care of 0xfe and 0xff in case they map to "non-character" Unicode codepoints, but we could also use 2 more allowed codepoints.

Of course, these characters can't be displayed (they will probably be rendered as illegal characters in all fonts), but they can be identified and re-converted to the original encoding of the file. All internal UTF-8 functions _should_ be able to deal with them.

Maybe a combination of some of the recent ideas/proposals with this encoding of illegal characters could make it work.
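To make the round trip concrete, here is a rough, self-contained C++ sketch of how the mapping could look. This is not FLTK code and not part of any proposal in the STR; the function names are made up for illustration, and it assumes the chosen base codepoint is U+F800 as described above:

    #include <cstdio>
    #include <string>

    // Illustration only: map an illegal input byte 0x80-0xFF to the
    // private-use codepoint U+F800 + byte (i.e. U+F880 - U+F8FF) and
    // store it as its 3-byte UTF-8 sequence; invert the mapping on save.

    // Encode one illegal byte as the UTF-8 form of U+F800 + b.
    static std::string encode_illegal_byte(unsigned char b) {
        unsigned cp = 0xF800u + b;                            // U+F880 .. U+F8FF
        std::string out;
        out += static_cast<char>(0xE0 | (cp >> 12));          // 1110xxxx
        out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));  // 10xxxxxx
        out += static_cast<char>(0x80 | (cp & 0x3F));         // 10xxxxxx
        return out;
    }

    // If the 3-byte UTF-8 sequence at s encodes U+F880..U+F8FF, recover
    // the original byte into *out and return true; otherwise return false.
    // Caller must guarantee at least 3 bytes are available at s.
    static bool decode_illegal_byte(const unsigned char *s, unsigned char *out) {
        if (s[0] != 0xEF) return false;  // all of U+F000-U+FFFF start with 0xEF
        unsigned cp = ((s[0] & 0x0Fu) << 12) | ((s[1] & 0x3Fu) << 6) | (s[2] & 0x3Fu);
        if (cp < 0xF880u || cp > 0xF8FFu) return false;
        *out = static_cast<unsigned char>(cp - 0xF800u);
        return true;
    }

    int main() {
        std::string utf8 = encode_illegal_byte(0x80);  // CP1252 Euro -> U+F880
        unsigned char raw = 0;
        if (decode_illegal_byte(reinterpret_cast<const unsigned char *>(utf8.data()), &raw))
            std::printf("round-trip byte: 0x%02X\n", raw);   // prints 0x80
        return 0;
    }

On load, every byte >= 0x80 that is not part of a valid UTF-8 sequence would go through something like encode_illegal_byte(); on save, every codepoint in U+F880 - U+F8FF would go through decode_illegal_byte() instead of the normal UTF-8 output path, so the original file bytes come back unchanged.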
_______________________________________________
fltk-bugs mailing list
[email protected]
http://lists.easysw.com/mailman/listinfo/fltk-bugs
