Kaixo!

On Sat, Jul 19, 2003 at 04:11:38AM -0700, Bill Kendrick wrote:

> In other words, I'm using "bind_textdomain_codeset("tuxpaint", "UTF8");"

(It would be better to use "UTF-8" instead; that is the standard name
of the charset.)

> along with "TTF_RenderUTF8_Blended", and both Lithuanian and Spanish come
> out correct (whereas before, one would work, but not the other; and if
> I switched to "TTF_RenderText_Blended", the results would be the opposite)...

Yes.
As TTF_RenderUTF8_Blended exists and works well, there is no point in
using TTF_RenderText_Blended, which is inherently limited (it only
handles Latin-1 text).
Also, by forcing an internal charset encoding, we don't have to care about
the system locale and things like that (which is particularly important
now, in a transitional period where half of the programs and distributions
have switched to UTF-8 and half have not yet).
 
> I think it's all okay!  I think what happens is that gettext() actually
> CONVERTS the text to UTF8 for me, even though they're not UTF8 in the PO file.

gettext() always converts.
By default it converts to the charset encoding of the locale;
that is why a change of locale made the output work or fail.
With bind_textdomain_codeset() you tell gettext which charset to
convert to, instead of the locale encoding.
As a result, the program is now locale independent (for display at least).
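
To make the setup concrete, here is a minimal sketch of the start-up
calls. It is only an illustration, not the actual tuxpaint code: the
function name init_translations() and its "localedir" parameter are
made up for the example, and the return-value checks assume the GNU
gettext behaviour.

```c
#include <libintl.h>
#include <locale.h>
#include <stddef.h>

/* Hypothetical helper: returns 0 on success, -1 if a gettext setup
 * call fails. "localedir" is an assumed parameter naming the directory
 * that holds the compiled .mo catalogs (e.g. /usr/share/locale). */
int init_translations(const char *localedir)
{
    /* The locale still selects the language of the messages, but its
     * charset no longer decides the encoding of gettext() output. */
    setlocale(LC_ALL, "");

    if (bindtextdomain("tuxpaint", localedir) == NULL)
        return -1;

    /* Ask for UTF-8 whatever encoding the PO file was written in. */
    if (bind_textdomain_codeset("tuxpaint", "UTF-8") == NULL)
        return -1;

    if (textdomain("tuxpaint") == NULL)
        return -1;

    return 0;
}
```

After this, every gettext() result for the "tuxpaint" domain can be
handed directly to a UTF-8 rendering function.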
 
> Now, telling gettext() "ALWAYS give me UTF8" (if I'm understanding the
> gettext() docs right :^) ), it should work for ALL languages, no matter the
> translation's encoding...

Yes(*).

It is what more and more programs do nowadays: they use Unicode
internally and tell gettext to return strings as Unicode (UTF-8).

I'm happy it now works like that.


Now, there is also the input side.
The input should be set to UTF-8 too, in order to allow proper input
of non-ASCII characters. I don't know how SDL does that, however.
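
Whatever mechanism SDL offers (SDL 1.2 has SDL_EnableUNICODE() and a
"unicode" field in the key event, which may be the place to look; that
is an assumption, not something I have tested), the program ends up
with a Unicode code point that has to be stored as UTF-8. A rough
sketch of that conversion, with utf8_encode() being a made-up helper
name:

```c
#include <stddef.h>

/* Hypothetical helper: encode one Unicode code point as UTF-8.
 * Writes up to 4 bytes to out and returns the number of bytes written,
 * or 0 if cp is not a valid code point. */
size_t utf8_encode(unsigned long cp, unsigned char out[4])
{
    if (cp < 0x80) {                      /* plain ASCII: 1 byte */
        out[0] = (unsigned char)cp;
        return 1;
    }
    if (cp < 0x800) {                     /* 2 bytes */
        out[0] = (unsigned char)(0xC0 | (cp >> 6));
        out[1] = (unsigned char)(0x80 | (cp & 0x3F));
        return 2;
    }
    if (cp < 0x10000) {                   /* 3 bytes */
        if (cp >= 0xD800 && cp <= 0xDFFF)
            return 0;                     /* surrogates are not characters */
        out[0] = (unsigned char)(0xE0 | (cp >> 12));
        out[1] = (unsigned char)(0x80 | ((cp >> 6) & 0x3F));
        out[2] = (unsigned char)(0x80 | (cp & 0x3F));
        return 3;
    }
    if (cp <= 0x10FFFF) {                 /* 4 bytes */
        out[0] = (unsigned char)(0xF0 | (cp >> 18));
        out[1] = (unsigned char)(0x80 | ((cp >> 12) & 0x3F));
        out[2] = (unsigned char)(0x80 | ((cp >> 6) & 0x3F));
        out[3] = (unsigned char)(0x80 | (cp & 0x3F));
        return 4;
    }
    return 0;                             /* beyond the Unicode range */
}
```

For example, U+00F1 (the Spanish n with tilde) becomes the two bytes
0xC3 0xB1, which can then be appended to the text buffer that is later
rendered as UTF-8.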

(*) Well, no. Not /all/ languages, only those for which there is a one-to-one
relation between character and glyph. For Indic languages, Arabic, Lao, etc.,
changes to the rendering are needed.
On the other hand, the languages with a one-to-one relation between character
and glyph are the majority; only the Arabic script and the Indic scripts and
their derivatives (Devanagari, Tamil, Kannada, Oriya, Gurmukhi, Tibetan,
Malayalam, Bengali, Thai, Lao, Khmer and maybe a couple more I forgot) need
special handling.
Thai and Arabic could be more or less hacked to work; the solution for
the others will depend on SDL's ability to handle them.
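
Until then, a program could at least detect such text and warn about
it. The following is only a sketch: needs_complex_rendering() and
utf8_decode() are made-up names, the decoder is minimal (it does not
reject overlong forms), and the ranges are simply the Unicode blocks of
the scripts listed above, so they are a rough approximation.

```c
#include <stddef.h>

/* Decode the UTF-8 sequence starting at s into *cp.
 * Returns its length in bytes, or 0 if the sequence is malformed. */
static size_t utf8_decode(const unsigned char *s, unsigned long *cp)
{
    size_t len, i;

    if (s[0] < 0x80)                { *cp = s[0];        len = 1; }
    else if ((s[0] & 0xE0) == 0xC0) { *cp = s[0] & 0x1F; len = 2; }
    else if ((s[0] & 0xF0) == 0xE0) { *cp = s[0] & 0x0F; len = 3; }
    else if ((s[0] & 0xF8) == 0xF0) { *cp = s[0] & 0x07; len = 4; }
    else return 0;

    for (i = 1; i < len; i++) {
        /* A continuation byte must look like 10xxxxxx; this check
         * also stops at the terminating NUL of a truncated string. */
        if ((s[i] & 0xC0) != 0x80)
            return 0;
        *cp = (*cp << 6) | (s[i] & 0x3F);
    }
    return len;
}

/* Returns 1 if the UTF-8 string contains a character from a script
 * that needs shaping or reordering, 0 otherwise. */
int needs_complex_rendering(const char *utf8)
{
    const unsigned char *s = (const unsigned char *)utf8;

    while (*s) {
        unsigned long cp;
        size_t n = utf8_decode(s, &cp);

        if (n == 0) {       /* skip a malformed byte and go on */
            s++;
            continue;
        }
        if ((cp >= 0x0600 && cp <= 0x06FF) ||   /* Arabic */
            (cp >= 0x0900 && cp <= 0x0DFF) ||   /* Devanagari to Sinhala */
            (cp >= 0x0E00 && cp <= 0x0EFF) ||   /* Thai, Lao */
            (cp >= 0x0F00 && cp <= 0x0FFF) ||   /* Tibetan */
            (cp >= 0x1780 && cp <= 0x17FF))     /* Khmer */
            return 1;
        s += n;
    }
    return 0;
}
```

Spanish or Lithuanian strings return 0 here, while a string with any
Devanagari or Arabic letter returns 1, so the program could fall back
to English for those translations instead of showing broken text.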

-- 
Ki ça vos våye bén,
Pablo Saratxaga

http://chanae.walon.org/pablo/          PGP Key available, key ID: 0xD9B85466
[you can write me in Walloon, Spanish, French, English, Italian or Portuguese]
