The answer is simple: use UTF-8, and use it correctly.

If it still goes wrong, then the problem is in your code, in Net::Jabber, or in Perl itself. Neither Net::Jabber nor Perl has a brilliant history of Unicode support, but I think both are mature enough by now to assume (for now) that the error is in your code. Beyond that, I'll leave this one to the Perl experts.
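
As a rough illustration (in Python rather than Perl, and not using Net::Jabber at all), the counts reported in the quoted message below are what you would see if the bytes were read as ISO-8859-1 somewhere along the way: ISO-8859-7 uses one byte per Greek letter and UTF-8 uses two. This is only a sketch under that assumption. The helper name is made up for the example, and it cannot tell you where the misreading happens in your setup.

```python
# Sketch only: shows why Greek text misread as ISO-8859-1 (Latin-1) yields
# the same number of accented characters for ISO-8859-7 input and twice as
# many for UTF-8 input. misread_as_latin1 is a hypothetical helper.

def misread_as_latin1(raw: bytes) -> str:
    """Decode bytes the way a layer that wrongly assumes Latin-1 would."""
    return raw.decode("latin-1")

greek = "Γεια σου"  # 7 Greek letters plus one space

as_iso_greek = greek.encode("iso-8859-7")  # one byte per Greek letter
as_utf8 = greek.encode("utf-8")            # two bytes per Greek letter

garbled_iso = misread_as_latin1(as_iso_greek)
garbled_utf8 = misread_as_latin1(as_utf8)

# Decoding the UTF-8 bytes as UTF-8 gives the original text back.
round_trip = as_utf8.decode("utf-8")

print(len(greek), len(garbled_iso), len(garbled_utf8))
print(round_trip == greek)
```

If that is what is going on, the usual suspect is a string being treated as Latin-1 at some point before it reaches the wire (XMPP streams are UTF-8), but that is a guess on my part.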

On Thu, 29 Sep 2005 15:44:26 +0200, John Talbot <[EMAIL PROTECTED]> wrote:

How is it that Jabber clients can send Greek characters to other Jabber clients and have them display nicely? They do, somehow.

But when I try to get my Perl script to send a Greek character to a Jabber server (and I believe I've tried everything under the sun - details below), all I get on my Jabber client is weird accented characters from the upper half of the ISO-8859-1 charset.

Here's what I tried, and you'll see the paradox:

Using a Windows Jabber client's raw XML entry (from an admin account), I wrote:

SENT: <message to="192.168.1.100/announce/motd"><body>Ya sou (but in greek letters)</body></message>

This makes the client show an announcement in perfectly readable Greek characters.

However, when I had my Perl script send the exact same command (with "Ya sou" written in ISO-GREEK in one case and in UTF-8 in the other), only accented Western European vowels appeared in the notification. In the ISO-GREEK case, I got the same number of Western accented letters as there are Greek letters in "Ya sou"; in the UTF-8 case, I got twice as many.

This seems rather strange, because if the letters of "Ya sou" were sent neither in ISO-GREEK nor in UTF-8 during Pandion's raw XML entry, then how WERE they sent?

[Pause]

I just read a bit of the core XMPP protocol. It says an xml:lang='language-code' attribute should be used right after connection...

I'm using the Net::Jabber library for my Perl scripts. Could it be that the library doesn't use xml:lang? (I have no way to check.) Even so, I tried including an xml:lang='el' attribute (that's the language code for Greek) in the <message> tag that Perl sends, and that didn't change a thing, neither with the UTF-8 version of the string nor with the ISO-GREEK one, even though according to the core XMPP protocol it should have worked if that were the problem.

What do you think might be going on?

I could send you a trimmed version of the script if you need.

Many thanks,
John


