<URL: http://bugs.freeciv.org/Ticket/Display.html?id=40043 >

On Thu, 24 Jan 2008 William Allen Simpson wrote:

> Egor Vyscrebentsov wrote:
> > On Thu, 24 Jan 2008 William Allen Simpson wrote:
> >> The issue is not strlen().  It is properly used to determine the
> >> length of a string.  The issue is that strlen() has nothing to do with
> >> how much room there is for display in a particular window!
> > 
> > From what I can see everyday, you're wrong. strlen() will return 70,
> > while there are only, say, 40 multibyte characters.
> (heavy sigh) You are confusing strings with characters.  They are not the
> same....  Perhaps because of the unfortunately named "char" type, a 7-bit
> signed integer.  They have *never* been the same.  Over my life, I've
> programmed for 5-bit (Baudot code), 6-bit (CDC), 7-bit, 8-bit, 12-bit, and
> now multi-byte characters.
> For characters, we don't talk about "lengths", we talk about widths,
> either in pixels or points.

OK, s/length/width/g: the width of a string in alphabetic characters.
Every time I said "character", I meant "an alphabetic character that you
can see on the display". For the signed 7-bit integer I said "char".

Maybe I am not expressing what I mean. Trying again:
strlen() cannot be used to get the number of alphabetic characters
in a string. It can be used to get the number of signed 7-bit integers.
My point is that using strlen() to get the number of "alphabetic characters"
IS and WILL BE an issue with UTF-8.
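
A minimal sketch of that difference, assuming the input is valid UTF-8.
utf8_char_count() and the privet sample string are hypothetical names used
only for illustration; they are not Freeciv functions:

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper (not part of Freeciv): count the UTF-8 encoded
 * characters (code points) in s, assuming s is valid UTF-8.
 * Continuation bytes have the bit pattern 10xxxxxx, so every byte
 * that does not match that pattern starts a new character. */
static size_t utf8_char_count(const char *s)
{
  size_t count = 0;

  assert(s != NULL);
  for (; *s != '\0'; s++) {
    if (((unsigned char) *s & 0xC0) != 0x80) {
      count++;
    }
  }
  return count;
}

/* The Russian word "Привет" written as explicit UTF-8 bytes:
 * 6 Cyrillic letters, 2 bytes each. */
static const char privet[] =
  "\xD0\x9F\xD1\x80\xD0\xB8\xD0\xB2\xD0\xB5\xD1\x82";
```

For this string strlen(privet) gives the number of chars (12), while
utf8_char_count(privet) gives the number of letters a reader sees (6).
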
> > strlen() may be used to get size of string, but not length.
> Perhaps there must be some Russian differences in the meaning of the words
> size and length.  They are synonymous in English:

Length is closer to width for us, yes. Size is closer to volume. Also,
for me, size is what malloc() uses.

> man 3 strlen

man 3 wcslen

/* I'm able to make heavy sighs too */

> > Please note that I have mentioned server console. What should we use there?
> Nothing.  No (translated or otherwise) message to the console needs to be
> wrapped by the program.  That is handled by the console driver, as always,
> since the days of paper tape!

And it does not wrap at word boundaries. "As always, since..."
The purpose of the wordwrap_string function is to break a line at a
human-readable place, isn't it? (I am not saying here whether it is right
or wrong to make such breaks.)

wordwrap_string takes an argument for the displayable width (called 'len').
Here that means a number of alphabetic characters, but the implementation
treats it as the size of the string in signed 7-bit integers, which can
differ from that number. This is the bug I was talking about. You were
talking about knowing how much room there is for display.
But that is another question! With UTF-8 you have characters [displayable
alphabetic characters] of non-fixed size (in memory terms).
So a line of 80 displayable alphabetic characters can take from 80 chars
[signed 7-bit integers] up to ... (160 for Russian, if there are no
spaces, digits, or Latin letters.)
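
The mismatch between a width in characters and a size in chars can be
sketched as follows. This is only an illustration, not the actual
wordwrap_string code: utf8_prefix_bytes() is a hypothetical helper, and it
assumes valid UTF-8 input and one display column per code point (which does
not hold for combining or double-width characters):

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper (not part of Freeciv): return how many bytes of s
 * are occupied by its first max_chars characters, or by the whole
 * string if it is shorter than that. A wrapping routine would look for
 * its break point within this many bytes instead of within max_chars
 * bytes. */
static size_t utf8_prefix_bytes(const char *s, size_t max_chars)
{
  size_t bytes = 0;
  size_t chars = 0;

  assert(s != NULL);
  while (s[bytes] != '\0') {
    if (((unsigned char) s[bytes] & 0xC0) != 0x80) {
      /* Not a continuation byte: a new character starts here. */
      if (chars == max_chars) {
        break;
      }
      chars++;
    }
    bytes++;
  }
  return bytes;
}
```

With a limit of 80 characters, a line made only of 2-byte Cyrillic letters
yields 160 bytes here, while a pure ASCII line yields 80.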

We have code that assumes every displayable alphabetic character
is equal to one signed 7-bit integer. wordwrap_string is just one
example. Are you sure there are no others? I'm not.

Thanks, evyscr

Freeciv-dev mailing list
