> Well... I still wonder if we need "src/xutf8/fl_wcwidth.c" at all?
>
> My preference would be to put that functionality into the existing
> file "src/fl_utf.c", which is kind of where I think it belongs
> anyway...
I kept it separate for testing purposes. fl_wcwidth() can certainly
be moved into src/fl_utf.c, which would also avoid changes to the
Makefiles.
> In particular, that would mean that the fl_wcwidth() function would
> be able to take account of the macros (defined in "src/fl_utf.c")
> that define our utf8 error handling policy, in particular
> ERRORS_TO_CP1252 or STRICT_RFC3629.
>
> If we have ERRORS_TO_CP1252 set (it normally *is* set) in
> "src/fl_utf.c" then for the code points in the 0x80 to 0x9F range
> the fl_wcwidth() function would need to return (+1) whereas at
> present it will return (-1), which will not be what we want...
> Indications are that there's *a lot* of text out there that claims
> to be utf8 but actually uses that C1 control chars range (0x80 to
> 0x9F) as per the CP1252 characters, so we probably do want to do this.
>
> Thoughts?
Well, so far, for solving the STR-2158 problems at least, I have only
identified one place where fl_wcwidth() would be needed. Since you
have to call fl_utf8decode() to get the UCS value to pass to
fl_wcwidth(), the extra testing is already done in fl_utf8decode().
So as it stands now, the extra code is not needed in fl_wcwidth().
[Replace mk_wcwidth() with fl_wcwidth() in the current code below.]
int Fl_Text_Buffer::character_width(const char *src, int indent, int tabDist)
{
  char c = *src;
  if ((c & 0x80) && (c & 0x40)) {       // first byte of UTF-8 sequence
    int len = fl_utf8len(c);
    int ret = 0;
    unsigned int ucs = fl_utf8decode(src, src+len, &ret);
    int width = 1; // mk_wcwidth((wchar_t)ucs); // FIXME
    return width;
  }
  if ((c & 0x80) && !(c & 0x40)) {      // other byte of UTF-8 sequence
    return 0;
  }
  return character_width(c, indent, tabDist);
}
[Hmm. Now that I look at it, it probably makes sense to simplify this
code further by also providing an "fl_wcwidth(const char* src)" version
which does the decoding too.]
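
A rough sketch of what such a string version might look like. Everything
named sketch_* here is hypothetical: sketch_utf8decode() and
sketch_wcwidth() are simplified stand-ins for fl_utf8decode() and
fl_wcwidth() so that the example compiles on its own, and the width
ranges are only a small subset of what mk_wcwidth() knows about. Unlike
fl_utf8decode() with ERRORS_TO_CP1252 set, the stand-in decoder does not
remap stray 0x80 to 0x9F bytes.

```cpp
#include <cassert>

// Stand-in for fl_utf8decode(): decodes one sequence of up to four bytes.
// On a bad lead byte, a truncated sequence or a bad trail byte it returns
// the first byte unchanged with *len = 1. Not the FLTK implementation.
static unsigned sketch_utf8decode(const char *p, const char *end, int *len) {
  unsigned char c = (unsigned char)*p;
  int n = (c < 0x80) ? 1
        : (c >= 0xC2 && c < 0xE0) ? 2
        : (c >= 0xE0 && c < 0xF0) ? 3
        : (c >= 0xF0 && c < 0xF5) ? 4 : 0;
  if (n == 0 || p + n > end) { *len = 1; return c; }
  unsigned ucs = (n == 1) ? c : (c & (0xFFu >> (n + 1)));
  for (int i = 1; i < n; i++) {
    unsigned char t = (unsigned char)p[i];
    if ((t & 0xC0) != 0x80) { *len = 1; return c; }
    ucs = (ucs << 6) | (t & 0x3Fu);
  }
  *len = n;
  return ucs;
}

// Stand-in for fl_wcwidth(unsigned): a few illustrative ranges only.
static int sketch_wcwidth(unsigned ucs) {
  if (ucs == 0) return 0;
  if (ucs < 0x20 || (ucs >= 0x7F && ucs < 0xA0)) return -1; // C0/C1 controls
  if (ucs >= 0x0300 && ucs <= 0x036F) return 0;             // combining marks
  if ((ucs >= 0x1100 && ucs <= 0x115F) ||                   // Hangul Jamo
      (ucs >= 0x2E80 && ucs <= 0xA4CF && ucs != 0x303F) ||  // CJK ... Yi
      (ucs >= 0xAC00 && ucs <= 0xD7A3) ||                   // Hangul syllables
      (ucs >= 0xFF00 && ucs <= 0xFF60))                     // fullwidth forms
    return 2;
  return 1;
}

// Hypothetical string version: decode the character at src, then look up
// its width. Reads at most four bytes and never past the terminating NUL.
int sketch_wcwidth_utf8(const char *src) {
  assert(src != nullptr);
  int avail = 0;
  while (avail < 4 && src[avail] != '\0') avail++;
  if (avail == 0) return 0;
  int len = 0;
  unsigned ucs = sketch_utf8decode(src, src + avail, &len);
  return sketch_wcwidth(ucs);
}
```

With something like this, the first branch of character_width() could
reduce to a single call on src, at the cost of decoding inside the
width function rather than at the call site.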
So, yes, we could add the extra code, just in case someone is working
with an array of UCS values, rather than UTF-8 encoded strings, but
that brings us back to yesterday's observation. FLTK should concern
itself only with the display of UTF-8 characters, and let the user
implement UCS manipulation in icu4c or pango or whatever.
At least for FLTK-1.3...
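
For reference, the "extra code" for callers holding raw UCS values could
be as small as the sketch below. It assumes ERRORS_TO_CP1252 is a 0/1
macro tested with #if, as the discussion of src/fl_utf.c suggests;
sketch_base_wcwidth() and sketch_wcwidth_ucs() are hypothetical names,
and the base function is a trivial stand-in for the real width table.

```cpp
#include <cassert>

// Assumption: mirrors the policy macro described for src/fl_utf.c.
#define ERRORS_TO_CP1252 1

// Trivial stand-in for the real width lookup: controls are -1, the rest 1.
static int sketch_base_wcwidth(unsigned ucs) {
  if (ucs == 0) return 0;
  if (ucs < 0x20 || (ucs >= 0x7F && ucs < 0xA0)) return -1;
  return 1;
}

// Hypothetical policy-aware wrapper: when errors are mapped to CP1252,
// values 0x80..0x9F stand for printable CP1252 characters, so report
// one column for them instead of -1.
int sketch_wcwidth_ucs(unsigned ucs) {
  assert(ucs <= 0x10FFFF); // beyond the Unicode code point range
#if ERRORS_TO_CP1252
  if (ucs >= 0x80 && ucs < 0xA0) return 1;
#endif
  return sketch_base_wcwidth(ucs);
}
```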
Cheers
Duncan