Re: [fltk.development] problem with fl_draw(s,n,x,y)

Matthias Melcher Wed, 21 Sep 2011 06:59:46 -0700

On 21.09.2011, at 14:50, Nikita Egorov wrote:

>> Without looking into it in more detail, I'm going to say docs problem -
>> it should not say "n characters" but should probably say something like
>> "the number of bytes needed to represent n characters in UTF8" or some
>> such thing... Or...?
> Yes, description should be replaced to "number of bytes", but I want
> to work with characters.


The docs are wrong. fl_draw is a low-level function and should not do things 
like character counting.

It would be your responsibility to convert the number of characters into the 
number of bytes. There are functions for that in fltk_utf... . It also makes 
sense, because a string that seems identical can be represented in various ways 
(for example, a U umlaut can be composed from the letter U followed by a 
combining diaeresis, or it can use the precomposed umlaut glyph; both are 
perfectly legal). And only the caller can know how fl_draw is used. So, 
"bytes" would be correct.
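
As a rough illustration of the character-to-byte conversion, here is a minimal 
sketch that does not depend on FLTK at all. The name utf8_bytes_for_chars is 
hypothetical, not an FLTK function, and "character" here means a Unicode code 
point, so a base letter plus a combining mark still counts as two. It also 
assumes the input is well-formed UTF-8. I believe the helpers in FL/fl_utf8.h 
(fl_utf8len() for example) could replace the manual byte test.

```cpp
#include <cstddef>

// Hypothetical helper (not part of FLTK): returns how many bytes the first
// n_chars code points of the UTF-8 buffer s occupy, never reading past len.
// Assumes well-formed UTF-8; counts code points, not user-perceived characters.
std::size_t utf8_bytes_for_chars(const char *s, std::size_t len,
                                 std::size_t n_chars) {
  std::size_t i = 0;
  while (i < len && n_chars > 0) {
    ++i;  // consume the lead byte of the next code point
    // consume any continuation bytes, which all look like 10xxxxxx
    while (i < len && (static_cast<unsigned char>(s[i]) & 0xC0) == 0x80) ++i;
    --n_chars;
  }
  return i;
}
```

The result would then be passed as the n argument of fl_draw(s, n, x, y).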

>>> To solve the problem we can convert all string to UTF-16 and
>>> then restrict its by the specified length.
>> 
>> Which perhaps will not work either, since as soon as you hit any
>> character that is not on the BMP you will need a UTF16 surrogate pair,
>> and the same sort of problem occurs.
> 
> But in UTF-16 all symbols have size two bytes. There is no problem to
> set specified size of string as opposed to UTF8 where every symbol can
> have own size (from 1 up to 5?) .

No, in UTF-16, characters can be composed as well, plus all characters above 
0xffff are represented by a four-byte sequence (a surrogate pair), just like 
characters above 0x7f in UTF-8 are represented by longer sequences. The rarer 
Chinese characters outside the BMP (CJK Extension B and later) will need four 
bytes in UTF-16, for example.
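
To make the variable width of UTF-16 concrete, here is a small sketch of the 
standard surrogate-pair arithmetic. The function names utf16_units and 
utf16_encode are made up for this example and are not FLTK calls. Code points 
below 0x10000 (the BMP) take one 16-bit unit, everything else takes two.

```cpp
#include <cstddef>

// Hypothetical helper: number of 16-bit code units one code point needs.
std::size_t utf16_units(char32_t cp) { return cp < 0x10000 ? 1 : 2; }

// Hypothetical helper: encodes one code point into out and returns the number
// of units written. No validation is done (surrogate values and code points
// above 0x10FFFF are the caller's problem in this sketch).
std::size_t utf16_encode(char32_t cp, char16_t out[2]) {
  if (cp < 0x10000) {
    out[0] = static_cast<char16_t>(cp);
    return 1;
  }
  cp -= 0x10000;                                          // 20 bits remain
  out[0] = static_cast<char16_t>(0xD800 + (cp >> 10));    // high surrogate
  out[1] = static_cast<char16_t>(0xDC00 + (cp & 0x3FF));  // low surrogate
  return 2;
}
```

So truncating a UTF-16 string at an arbitrary unit index can split a pair, 
which is the same kind of problem as cutting a UTF-8 sequence in the middle.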

>> Indeed - the reason for having the "length" option in these methods is
>> explicitly so that the string does not need to be NULL terminated...
> 
> It is one more issue - the current implementation ignores NULL byte.
> It prints exactly N bytes from the source. Such behavior makes some
> problems for me too. May be we should check for zero char and stop
> printing the rest?

No. This function was written specifically so that a NUL can be printed. This 
may (or may not) be useful for certain terminal-style text displays, who knows. 
But I do know that there was a reason to do it - back then.

>> Not keen, if we are to have a clean UTF8 API (and I think we should)
>> then...
> 
> What is the harm if we add one more function which accepts UTF16
> string ? In MS Win it would be part of the gd->draw(...) which would
> invoked after conversion.

It was a conscious and consensual decision to go with UTF-8. If we add a single 
UTF-16 call, we will need to provide more and more calls, and we will need to 
provide support calls for platforms that do not use UTF-16 (All Unixes, for 
example). This is certainly possible, but who will maintain the code? AFAIK 
there are two functions to convert to and from UTF-16 which you can use if you 
prefer UTF-16 when coding. Internally however, FLTK uses UTF-8 (even if it has 
to convert back to UTF-16 to print text).

>>> Because in order to evaluate right length of
>>> text, user has to convert source line to the UTF-16, restrict
>>> size and convert into UTF-8 again to invoke fl_draw(s,x,y)
>>> where string will be converted one more time.  Thus at the
>>> moment there is two unnecessary conversions.
>> 
>> Hmm, we have functions that tell you the number of "characters" in a
>> UTF8 string, do these not help?
> 
> I see no way to use it. Shortly - I have to print column of text lines
> restricted by specified length. I use the Courier font.
> The lines can contain any symbols - latin, cyrillic etc. I can't
> evaluate needed size of line in bytes if I use UTF8. Only via
> conversion into any accessible format with monosized symbols.

No, even if you use monospace fonts, you cannot assume that the number of 
characters times the width of the font will give you the width of the string 
that will be rendered on screen. There are characters and character 
combinations in Unicode that need more or fewer pixels, even in monospaced 
fonts! There are even character sequences that have different widths in 
different combinations, so simply adding up the width of each individual 
character will not work.

The only reliable way to get the width of whatever is printed is using 
fl_width() after setting the font and size.
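
For the "column of lines restricted to a given width" case, one possible 
approach is to measure whole prefixes instead of summing per-character 
widths. The sketch below is only that, a sketch: fit_prefix_bytes is a 
hypothetical name, and the measuring function is passed in so the code stands 
alone. In a real program the callback would presumably wrap 
fl_width(const char*, int) after fl_font() has been called. It stops at the 
first prefix that is too wide and only cuts on code point boundaries, so a 
combining sequence could still be split.

```cpp
#include <cstddef>
#include <functional>

// Hypothetical helper: returns the byte length of the longest prefix of s
// that ends on a UTF-8 code point boundary and whose measured width does not
// exceed max_width. measure(s, n) must return the width of the first n bytes.
std::size_t fit_prefix_bytes(
    const char *s, std::size_t len, double max_width,
    const std::function<double(const char *, int)> &measure) {
  std::size_t best = 0;
  std::size_t i = 0;
  while (i < len) {
    ++i;  // lead byte
    while (i < len && (static_cast<unsigned char>(s[i]) & 0xC0) == 0x80) ++i;
    // measure the whole prefix, never the sum of single characters
    if (measure(s, static_cast<int>(i)) > max_width) break;
    best = i;
  }
  return best;
}
```

The returned byte count can be handed to fl_draw(s, n, x, y) directly, with 
no round trip through UTF-16.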

Trust me, I know all that because I converted Fl_Text_Editor from ASCII to 
UTF-8, and doing this, I learned much more about Unicode than I ever wanted to 
know.

>> Anyway, best post an STR, and maybe a simple example we can use for
>> testing.
> 
> It's no problem, but... my latest STR (fl_draw_image_mono(...) has no
> body at all!) hang there without any interest

FLTK is an Open Source effort. We are really trying to keep the ball rolling, 
but we all have regular jobs and some of us even regular kids ;-) . Please be 
patient.  Our first goal is to keep FLTK1 as stable and as bug-free as 
possible. 

 - Matthias


