Me:
>> In the misc/cp1252.txt example, these 0x80-0x9f bytes appear as
>> standalone bytes in the text. The Fl_Text_{Buffer,Display,Editor}
>> code iterates through arrays of bytes. If the top bit is not set,
>> it is plain old ascii: tabs, C0 control codes 0x01-0x1f and DEL 0x7f

Ian:
> Hmmm - is Fl_Text_* using the functions from fl_utf.c etc? If so,
> they should be "aware" of the "errors to cp1252" flag and do the
> right thing...
>
> Or... does Fl_Text_* have its own implementations that don't know
> about "errors to cp1252" - is that maybe what the problem is?

I was about to answer "The problem is..." but then I thought of some
more and felt a Monty Python Spanish Inquisition moment coming on :-)

The first problem here is that the various macros to handle extended
mappings, i.e. ERRORS_TO_ISO8859_1, ERRORS_TO_CP1252 and STRICT_RFC3629,
only apply to the fl_utf8decode() function in fl_utf.c [from FLTK2 ?]
The other functions there, e.g. fl_utf8fwd() and fl_utf8back(), assume
they have true utf-8 sequences only, and I don't think they will
handle isolated CP1252 0x80-0x9f characters properly. But I need to
check this.

The second problem is that fl_utf.c isn't the only source of functions.
There's also O'ksi'D's fl_utf8.cxx implementation, which includes the
fl_utf8len(char c) function. As Bill pointed out, fl_utf8len() sees
only a single byte, so it does not have the context to determine
whether that byte is a CP1252 0x80-0x9f byte or a utf-8 trailing byte.
According to fl_utf8len() the length of a utf-8 trailing byte is -1,
so a standalone CP1252 0x80-0x9f byte gets reported as -1 as well.
That's the issue here.

The third problem: the Fl_Text_* code not only uses fl_utf8len() but
also does a lot of its own bit testing against 0x80 and 0x40 masks,
which muddies the waters rather a lot.

And finally, the Fl_Text_* code is also doing some "smart" expansion
of specific characters, tab to spaces, 0x01-0x1f and DEL 0x7f to
readable mnemonic forms, and then trying to handle top-bit "utf-8"
characters using fl_utf8len(). There's no testing for the CP1252
0x80-0x9f characters first, and the ERRORS_TO_CP1252 macro is not
defined in these files anyway.

And finally plus one :-) I haven't checked any of the text handling
in any of the other widgets at all.
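
To make the "context" point concrete, here is a minimal sketch in
plain C. It is not FLTK code: lead_len(), is_trail(), item_len() and
count_stray_c1() are hypothetical names made up for this illustration,
and the validity check is deliberately simplified. It only shows that
a byte in 0x80-0x9f can be told apart from a utf-8 trailing byte when
the scan starts from a known boundary and looks at the neighbouring
bytes, which a one-byte function cannot do.

```c
#include <stddef.h>

/* Hypothetical helper: sequence length implied by a lead byte alone.
   Returns 0 for bytes that cannot start a utf-8 sequence, which
   includes every byte in 0x80-0xBF. */
int lead_len(unsigned char c)
{
    if (c < 0x80) return 1;               /* plain ASCII */
    if (c >= 0xC2 && c <= 0xDF) return 2;
    if (c >= 0xE0 && c <= 0xEF) return 3;
    if (c >= 0xF0 && c <= 0xF4) return 4;
    return 0;
}

/* True if c has the 10xxxxxx bit pattern of a utf-8 trailing byte. */
int is_trail(unsigned char c)
{
    return (c & 0xC0) == 0x80;
}

/* Bytes consumed by the item starting at buf[i] (requires i < n).
   *stray is set to 1 when buf[i] does not start a complete sequence.
   Simplified: overlong 3/4-byte forms and surrogates are not rejected. */
size_t item_len(const unsigned char *buf, size_t n, size_t i, int *stray)
{
    int len = lead_len(buf[i]);
    int k;

    *stray = 0;
    if (len == 1) return 1;
    if (len == 0 || (size_t)len > n - i) { *stray = 1; return 1; }
    for (k = 1; k < len; k++) {
        if (!is_trail(buf[i + k])) { *stray = 1; return 1; }
    }
    return (size_t)len;
}

/* Count standalone bytes in 0x80-0x9f, i.e. the candidates for a
   CP1252 interpretation, scanning from the start of the buffer. */
size_t count_stray_c1(const unsigned char *buf, size_t n)
{
    size_t i = 0, count = 0;

    while (i < n) {
        int stray;
        size_t len = item_len(buf, n, i, &stray);
        if (stray && buf[i] >= 0x80 && buf[i] <= 0x9F) count++;
        i += len;
    }
    return count;
}
```

For the bytes 'a', 0x80, 'b', 0xE2, 0x82, 0xAC this counts one stray
byte (the 0x80), while the 0x82 is consumed as a trailing byte of the
three-byte sequence. Both bytes have the same top bits, so a function
that only sees one char at a time has no way to separate the two cases.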

Cheers
Duncan
_______________________________________________
fltk-dev mailing list
http://lists.easysw.com/mailman/listinfo/fltk-dev
