> Now, Fabien has a hackaround for this which is checked into the 1.3 svn
> for testing - it seems to work OK but I am a little concerned that it is
> still not sophisticated enough.
>
> It would help *greatly* if everyone could give it a spin and report
> their results (including the OP qiaogang chen if possible?)
> The test code attached to STR 2080 is a reasonable starting place for
> tests I think... But adding in any extra strings that are found to be
> problematic (e.g. Greg's new Japanese samples) would be useful.
I'll second that, so please, everyone:
   - Please provide *ANY* UTF-8 content for which one or more char/byte 
sequences are not drawn correctly, or which you think is an interesting test 
case (even if it works well now but has a complex multibyte UTF-8 
representation)

I am thinking in particular of the mixed UTF-8 / CP125x content that Ian told 
me about.

The current algorithm gives better results but is still not enough, because:
  - Some strings that are not fully UTF-8 conforming may still have glyphs 
interpreted incorrectly. For example, a 0xA0 byte could belong to a valid 
UTF-8 sequence but still be interpreted as non-UTF-8 because the string also 
contains CP125x extended chars.

That said, I am not keen on converting the string to Unicode and then back to 
UTF-8 again; I would prefer a more efficient approach with an 'on the fly' 
UTF-8 byte-sequence analysis, but this time on a per-char (variable length) 
basis.
I'll hopefully make a new increment in this direction, which should make the 
idea more obvious :)
After that change, even a mixed UTF-8 / CP125x encoding shouldn't get a 0xA0 
byte misinterpreted when it is part of a multibyte sequence.

Fabien.

_______________________________________________
fltk mailing list
[email protected]
http://lists.easysw.com/mailman/listinfo/fltk
