OK. Until we tackle real Unicode: since we usually inject ISO-8859-1 when
sending a message, smsbox will now try to recode UCS-2 to ISO-8859-1 when it
receives a message, if (and only if) you set mo-recode = true in the smsbox
group.

As usual, if you don't want it, just don't set mo-recode and everything will
be just like before.
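
To make the idea concrete, here is a rough Python sketch of "try to recode
UCS-2 to ISO-8859-1". It is not the Kannel code; recode_mo is a hypothetical
name, the payload is assumed to be big-endian UCS-2, and passing the message
through unchanged when it cannot be represented in ISO-8859-1 is my
assumption about a sensible fallback, not something stated above.

```python
def recode_mo(payload):
    """Try to turn a big-endian UCS-2 payload into ISO-8859-1.

    Returns (data, charset). If the payload is not valid UCS-2, or contains
    characters outside ISO-8859-1, the original bytes are returned untouched
    (assumed fallback, for illustration only).
    """
    try:
        text = payload.decode("utf-16-be")
        return text.encode("iso-8859-1"), "iso-8859-1"
    except (UnicodeDecodeError, UnicodeEncodeError):
        return payload, "ucs-2"
```

A message such as "olá" fits in ISO-8859-1 and gets recoded; a Hebrew or
Cyrillic message does not, so it stays as it arrived.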

The patch was committed to CVS yesterday...

----- Original Message -----
From: "Oded Arbel" <[EMAIL PROTECTED]>
To: "Andreas Fink" <[EMAIL PROTECTED]>
Cc: <[EMAIL PROTECTED]>
Sent: Wednesday, March 06, 2002 9:40 AM
Subject: RE: [RFI] octstr_recode


> >>  Well what I'm trying to say is that Kannel should support receiving
> >>  Unicode, but keyword matching in smsbox against Unicode is quite a
> >>  headache.
> >
> >  not so - just convert it to utf-8 and strstr() it.
>
>
> Yes but how do you type a keyword in utf-8 in the config file if the
> config file is plain ascii?

Just type it using a UTF-8 aware editor.

> The target is to match non-ASCII keywords. If the keyword were ASCII
> then we wouldn't have any Unicode.

of course you would - you forget that the ASCII chars in ISO-8859 and
single-byte chars in UTF-8 are bitwise identical. if you write a keyword
using "low ISO-8859-1" (characters where only 7 bits are in use) and try
to match it to the same word written in UTF-8, you'll have a match. OTOH,
if you wrote the keyword in multi-byte UTF-8, one would expect that to
match this keyword one would have to send an SM in Unicode.
note: ASCII (American Standard Code for Information Interchange) is
defined as 7-bit and is often used as 8-bit by adding a 0 bit at the front
of each char.
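
A quick way to check the "bitwise identical" claim is to encode the same
word in each charset and compare the bytes. This is only an illustrative
Python sketch; same_bytes is a made-up helper, not anything in Kannel.

```python
def same_bytes(word):
    """True if the word has identical bytes in ISO-8859-1 and UTF-8."""
    return word.encode("iso-8859-1") == word.encode("utf-8")


# A 7-bit keyword is the same byte string in ASCII, ISO-8859-1 and UTF-8.
low = "INFO"
low_is_portable = same_bytes(low) and low.encode("ascii") == low.encode("utf-8")

# A keyword with a "high" character differs: one byte (0xE9) in
# ISO-8859-1, two bytes (0xC3 0xA9) in UTF-8.
high = "café"
high_is_portable = same_bytes(high)
```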

i.e. - as long as you use only the "low" range (0-127) characters of
ISO-8859-1 in the config file, it doesn't matter if it's ASCII or UTF-8.
it only matters if you attempt to use weirder stuff (not sure where in
ISO-8859-8 are the accented characters, but I guess that some of these do
fall in the "high" range). when that is the case, we should pick a
coding and stick to it: I'd suggest going with UTF-8, otherwise we
either don't have Unicode support, or we have a headache ;-)
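
To illustrate "pick a coding and stick to it", here is a small Python
sketch: the keyword is assumed to be stored as UTF-8 bytes (as it would be
if the config file were saved from a UTF-8 aware editor), every incoming
message is normalised to UTF-8, and the match is a plain substring search,
which is what strstr() would do in C. The names to_utf8 and keyword_matches
are hypothetical, and the charset labels are Python codec names, not Kannel
settings.

```python
def to_utf8(payload, charset):
    """Normalise an incoming message body to UTF-8 bytes."""
    return payload.decode(charset).encode("utf-8")


def keyword_matches(keyword_utf8, payload, charset):
    """Byte-wise substring match after normalising the message to UTF-8."""
    return keyword_utf8 in to_utf8(payload, charset)


keyword = "café".encode("utf-8")  # as read from a UTF-8 config file

# The same text arriving as UCS-2 or as ISO-8859-1 matches once normalised.
ucs2_hit = keyword_matches(keyword, "un café".encode("utf-16-be"), "utf-16-be")
latin1_hit = keyword_matches(keyword, "un café".encode("iso-8859-1"), "iso-8859-1")

# Without normalisation the raw ISO-8859-1 bytes do not contain the keyword.
raw_hit = keyword in "un café".encode("iso-8859-1")
```

A 7-bit keyword would match in the raw comparison too, which is the point
made above: the choice of coding only starts to matter in the "high" range.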

--
Oded Arbel
m-Wise Inc.
[EMAIL PROTECTED]

"A Meltdown? One of those annoying buzzwords. We prefer to think of it
as an unrequested fission surplus!"
-- Montegue Burns, (The Simpsons)



