The answer is simple: use UTF-8, and use it correctly.
If it still goes wrong, then something in your code, in Net::Jabber, or
in Perl itself is going wrong.
While neither Net::Jabber nor Perl has a brilliant history of
supporting Unicode, I think both are by now mature enough that we can
assume (for now) the error is in your code. Beyond that, I'll leave this
one to the Perl experts.
On Thu, 29 Sep 2005 15:44:26 +0200, John Talbot <[EMAIL PROTECTED]>
wrote:
How is it that Jabber clients can send Greek characters to other Jabber
clients and they display nicely? They do, somehow.
But when I try to get my Perl script to send a Greek character to a
Jabber server (and I believe I've tried everything under the
sun - details below), all I get on my Jabber client is weird accented
characters from the upper half of the ISO-8859-1 charset.
Here's what I tried, and you'll see the paradox:
Using a Windows Jabber client's raw XML entry (from an admin account), I
wrote:
SENT: <message to="192.168.1.100/announce/motd"><body>Ya sou (but in
greek letters)</body></message>
This makes the client show an announcement in perfectly readable Greek
characters.
However, when I had my Perl script send the exact same stanza (with
"Ya sou" written in ISO-GREEK in one case and in UTF-8 in the other),
only accented Western European vowels appeared in the notification. In
the ISO-GREEK case, I got as many accented Western letters as there are
Greek letters in "Ya sou"; in the UTF-8 case, I got twice as many.
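
One reading that fits both counts, offered as a guess rather than a
diagnosis: somewhere between the script and the client, the bytes are
being treated as ISO-8859-1, one character per byte. The sketch below is
Python rather than Perl and involves no Net::Jabber code; the sample
string and the helper name seen_as_latin1 are made up for illustration.
It only shows that this assumption reproduces the 1:1 and 2:1 ratios
described above.

```python
# Hypothetical sample text: a Greek spelling of "Ya sou".
text = "Γειά σου"


def seen_as_latin1(raw: bytes) -> str:
    """Interpret every byte as one ISO-8859-1 character, as a layer that
    wrongly assumes Latin-1 input would do."""
    return raw.decode("latin-1")


def letters(s: str) -> int:
    """Count characters other than the ASCII space."""
    return len(s.replace(" ", ""))


greek_bytes = text.encode("iso-8859-7")  # ISO-GREEK: one byte per Greek letter
utf8_bytes = text.encode("utf-8")        # UTF-8: two bytes per Greek letter

misread_greek = seen_as_latin1(greek_bytes)
misread_utf8 = seen_as_latin1(utf8_bytes)

# Same number of characters in the ISO-GREEK case, twice as many for UTF-8.
print(letters(text), letters(misread_greek), letters(misread_utf8))

# Decoding the UTF-8 bytes as UTF-8 recovers the original text.
print(utf8_bytes.decode("utf-8") == text)
```

If this is what is happening, the fix would lie in how the string is
handed to the library (as decoded characters versus raw bytes), not in
which Greek encoding the script starts from.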
This seems rather strange... because if during Pandion's raw XML entry,
the letters of "Ya sou" were sent neither in ISO-GREEK nor in UTF-8,
then how WERE they sent?
[Pause]
I just read a bit of the core XMPP protocol spec; it says an
xml:lang='language-code' attribute should be used right after
connecting.
I'm using the Net::Jabber library for my Perl scripts. Could it be that
the library doesn't use xml:lang? (I have no way to check.) Even so, I
tried including an xml:lang='el' attribute (that's the code for Greek)
in the <message> tag that Perl sends, and that didn't change a thing,
neither with the UTF-8 version of the string nor with the ISO-GREEK one
(even though, according to the core XMPP protocol, it should have
worked if that were the problem).
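
For what it's worth, xml:lang labels the natural language of the text;
it does not select a character encoding, and XMPP streams are UTF-8
regardless. A minimal Python sketch (again not Perl or Net::Jabber; the
helper build_message is hypothetical, and the address is the one from
the example above) shows that the body bytes are identical with and
without the attribute:

```python
import xml.etree.ElementTree as ET

# Qualified name that ElementTree serializes as the xml:lang attribute.
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"


def build_message(to, body, lang=None):
    """Build a <message> stanza and return it as UTF-8 bytes.

    Illustrative helper only; it is not part of any XMPP library.
    """
    msg = ET.Element("message", {"to": to})
    if lang is not None:
        msg.set(XML_LANG, lang)
    ET.SubElement(msg, "body").text = body
    # Serialize to a str first so no XML declaration is prepended,
    # then encode the whole stanza as UTF-8.
    return ET.tostring(msg, encoding="unicode").encode("utf-8")


greeting = "Γειά σου"  # hypothetical sample text
plain = build_message("192.168.1.100/announce/motd", greeting)
tagged = build_message("192.168.1.100/announce/motd", greeting, lang="el")

# The Greek text is carried by the same UTF-8 bytes in both stanzas;
# xml:lang only adds an attribute.
print(greeting.encode("utf-8") in plain, greeting.encode("utf-8") in tagged)
```

So adding xml:lang='el' would not be expected to repair text whose
bytes were already misinterpreted before they reached the wire.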
What do you think might be going on?
I could send you a trimmed version of the script if you need.
Many thanks,
John