Hello,
I am new to this list, so please forgive me if I'm covering old ground.
I am interested in displaying some text in languages other than English within my application. However, I'm having some difficulty when trying to display non-ASCII characters. Note that I use UTF-8 to display all characters, even those that can be represented in 8 bits (0x00 - 0xFF).
For example, if I want to display the character 'á' (that's an 'a' with an acute accent in case it doesn't show up on your browser), that's U+00E1 in Unicode-speak. Encoding that character as UTF-8, it comes out to be 0xC3 0xA1. If, in my .po file (for the GNU gettext() utilities), I include the following:
"Content-Type: text/plain; charset=UTF-8\n"
"Content-Transfer-Encoding: 8bit\n"
...
#: TestProgram.cpp:145
msgid "it is"
msgstr "est\xC3\xA1"
what comes back is
est?
I know that the problem is not with text rendering as I can write the UTF-8 directly into the string in the program and it works fine, i.e. it displays the a with the accent.
Any ideas about what I might be doing wrong? Note that I also tried typing the C3 and A1 characters directly into the msgstr (á), but that doesn't work either.
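One possible cause worth ruling out (a sketch under stated assumptions, not a confirmed diagnosis): by default GNU gettext converts each msgstr from the catalog's charset to the character set of the current locale, so under a non-UTF-8 locale such as "C" a character like U+00E1 can come back as '?'. The sketch below asks gettext to return UTF-8 regardless of the locale's charset by calling bind_textdomain_codeset(). The helper name init_utf8_catalog, the domain "testprogram", and the directory "/usr/share/locale" are hypothetical placeholders, not names from the actual program.

```cpp
#include <clocale>
#include <libintl.h>
#include <string>

// Hypothetical helper: configures gettext so that translations for
// `domain` are returned as UTF-8 byte strings.
// `domain` and `locale_dir` are placeholders chosen by the caller.
bool init_utf8_catalog(const char* domain, const char* locale_dir) {
    // Pick up the user's locale from the environment (LANG, LC_ALL, ...).
    std::setlocale(LC_ALL, "");

    // Tell gettext where the compiled .mo catalogs for this domain live.
    if (bindtextdomain(domain, locale_dir) == nullptr) {
        return false;
    }

    // Request UTF-8 output instead of the locale's own charset.
    // Without this call, gettext converts to the locale charset, and
    // characters that charset cannot represent may be replaced.
    const char* codeset = bind_textdomain_codeset(domain, "UTF-8");
    if (codeset == nullptr) {
        return false;
    }

    // Make `domain` the default for subsequent gettext() calls.
    return textdomain(domain) != nullptr && std::string(codeset) == "UTF-8";
}
```

A call such as init_utf8_catalog("testprogram", "/usr/share/locale") would go once at startup, before the first gettext() lookup. If the output is still wrong after that, the catalog itself (the charset header and the bytes msgfmt actually wrote) would be the next thing to inspect.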
ANOTHER PROBLEM: If I want to display the word "mañana", for example, I would encode it as "ma\xC3\xB1ana". However, the "\xB1a" is treated as a single hex number! How can I indicate that I want the byte \xB1 followed by the letter 'a'? Remember, I can't use format strings because I'm working with gettext(). Surely somebody has run into this before!!
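For the escape problem, here are two workarounds that are valid in C and C++ string literals, shown as a minimal sketch. A \x escape consumes every hex digit that follows it, so the usual fixes are either to end the literal right after the escape (adjacent literals are concatenated by the compiler) or to use octal escapes, which stop after at most three digits. The function names manana_split and manana_octal are hypothetical. Whether the same tricks carry over to a msgstr in a .po file is an assumption here: PO entries do allow a string to be split across adjacent quoted pieces, but the behavior of the escapes themselves should be checked against the msgfmt version in use.

```cpp
#include <string>

// Writing "ma\xC3\xB1ana" does not work as intended: the compiler reads
// \xB1a as one (out-of-range) hex escape, because 'a' is a hex digit.

// Option 1: close the literal directly after the escape. Adjacent string
// literals are joined after escapes have been processed, so the 'a' that
// follows is an ordinary letter.
inline std::string manana_split() {
    return "ma\xC3\xB1" "ana";
}

// Option 2: octal escapes take at most three digits, so they cannot
// swallow the following character. \303 is 0xC3 and \261 is 0xB1.
inline std::string manana_octal() {
    return "ma\303\261ana";
}
```

Both functions yield the same seven bytes: 'm', 'a', 0xC3, 0xB1, 'a', 'n', 'a'. Typing the ñ directly into a UTF-8 encoded .po file avoids the escape question altogether, provided the charset header says UTF-8.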
Thanks in advance.
Cheers,
Gil Glass
Telecom Field Services
JDSU
Germantown, MD, USA
+1-240-404-2551
