> Now, Fabien has a hackaround for this which is checked into the 1.3 svn
> for testing - it seems to work OK but I am a little concerned that it is
> still not sophisticated enough.
>
> It would help *greatly* if everyone could give it a spin and report
> their results (including the OP qiaogang chen if possible?)
> The test code attached to STR 2080 is a reasonable starting place for
> tests I think... But adding in any extra strings that are found to be
> problematic (e.g. Greg's new Japanese samples) would be useful.

I'll second that, so please, everyone:

- Please provide *ANY* UTF-8 content for which, in your experience, one or more characters or byte sequences are not drawn correctly, or which you think is interesting to consider as a test case (even if it works well now but has a complex multibyte UTF-8 representation).
I am thinking in particular of the mixed UTF-8 / CP125x contents that Ian told me about.

The current algorithm gives better results but is still not enough, because some strings that do not fully conform to UTF-8 may still have glyphs incorrectly interpreted. For example, a 0xA0 byte could be part of a UTF-8 multibyte sequence but still be treated as non-UTF-8, because the string may also contain CP125x extended characters.

That said, I am not keen on converting the string to Unicode first and then back to UTF-8; I would instead prefer a more efficient approach with an 'on the fly' UTF-8 byte-sequence analysis, but this time on a per-character (variable-length) basis. I will hopefully make a new increment in this direction soon, so that it becomes more obvious, I hope :)

After that change, even a mixed UTF-8/CP125x encoding should not get the 0xA0 misinterpreted when it is part of a multibyte sequence.

Fabien.

_______________________________________________
fltk mailing list
[email protected]
http://lists.easysw.com/mailman/listinfo/fltk

