On Monday 2003.10.06 17:15:25 +0200, Marco Cimarosti wrote:
> Stephane Bortzmeyer wrote:
> > > OK. But the length in "characters" of a string is not
> > > "character semantics": it's plain nonsense, IMHO.
> >
> > I disagree.
>
> Feel free.
>
> But I still don't see any use in knowing how many characters are in an UTF-8
> string, apart the use that I already mentioned: allocating a buffer for a
> UTF-8 to UTF-32 conversion.
>
> _ Marco

Well, I know a good use for it: a console or terminal-based application that displays information in tabular form using a fixed-width font, such as a subset of records from a database table. To calculate how wide to make each column, the maximum number of characters in the strings for that column is a useful starting point. Of course, that might not be enough by itself if, for example, (1) one has to display Hanzi or Kanji, which occupy twice the width of Latin characters on a terminal, or (2) one has to display scripts where ligatures (as in Arabic) or other attributes of the script, such as the over-the-letter/under-the-letter vowels in Indic and Indic-derived scripts, make the display width of a string differ from a plain character count. But it is still a good place to start.
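
To make the idea concrete, here is a rough sketch in Python of the column-width calculation described above. The function names and the sample rows are made up for illustration. It assumes a terminal that draws East Asian Wide and Fullwidth characters in two cells and combining marks in zero cells, and it does nothing about ligatures or Indic vowel signs that have a combining class of zero, so treat it as an approximation rather than a complete answer.

```python
import unicodedata


def display_width(s):
    """Approximate the number of terminal cells needed to show s.

    Assumption: Wide/Fullwidth characters take two cells, characters with
    a nonzero canonical combining class take none, everything else one.
    """
    width = 0
    for ch in s:
        if unicodedata.combining(ch):
            continue  # combining mark: drawn over or under the previous character
        if unicodedata.east_asian_width(ch) in ("W", "F"):
            width += 2  # Hanzi, Kanji, kana, fullwidth forms
        else:
            width += 1
    return width


def column_widths(rows, measure):
    """Return the largest measure(cell) found in each column of rows."""
    return [max(measure(cell) for cell in col) for col in zip(*rows)]


# Hypothetical sample data: a header row followed by three records.
rows = [
    ("id", "name"),
    ("1", "Marco"),
    ("2", "漢字仮名"),
    ("3", "e\u0301t\u00e9"),  # "e" + combining acute, "t", precomposed "é"
]

by_count = column_widths(rows, len)            # plain character (code point) count
by_cells = column_widths(rows, display_width)  # adjusted for wide and combining characters
```

With these rows the character count gives a width of 5 for the second column (from "Marco"), while the adjusted measure gives 8 (from the four Kanji), which is exactly the kind of correction point (1) calls for.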

