Exactly what I've said :)

If your source text is in utf8 you need to specify charset=utf8 and
coding=2.
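
For illustration, here is a minimal Python sketch of such a request. The
charset and coding parameters are the ones discussed in this thread; the
host, port, path, credentials and phone number are placeholders I made up
for the example, so adjust them to your own smsbox configuration.

```python
from urllib.parse import urlencode

# Placeholder endpoint and credentials: adjust to your own smsbox setup.
SENDSMS_URL = "http://localhost:13013/cgi-bin/sendsms"


def build_sendsms_url(text, to, username="tester", password="secret"):
    """Build a sendsms URL asking Kannel to convert UTF-8 text to UCS-2."""
    params = {
        "username": username,
        "password": password,
        "to": to,
        "text": text,        # percent-encoded as UTF-8 by urlencode below
        "charset": "utf8",   # tells smsbox how the text parameter is encoded
        "coding": "2",       # request a UCS-2 message
    }
    return SENDSMS_URL + "?" + urlencode(params, encoding="utf-8")


text = "Привет"
url = build_sendsms_url(text, "+10000000000")

# The same characters have different byte sequences in the two encodings,
# which is why UTF-8 bytes cannot simply be passed off as UCS-2.
utf8_bytes = text.encode("utf-8")
ucs2_bytes = text.encode("utf-16-be")  # same as UCS-2 for BMP characters
```

The sketch only builds the URL; fetching it with any HTTP client should
then submit the message, assuming the sendsms user exists on your side.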

2012/4/2 Alexander Malysh <amal...@kannel.org>

> Hi,
>
> Cyrillic can only be sent with UCS-2, therefore coding=2.
>
> Kannel's behavior for coding=2 and 3 is simple: it doesn't touch the
> data, which is binary and up to the user to encode. BUT if you need
> Kannel to convert some charset to UCS-2 for you, then just use two
> params:
> charset=YOUR_CHARSET
> coding=2
>
> Then kannel will do it for you.
>
> Thanks,
> Alex
>
> On 31.03.2012 at 00:45, chad selph wrote:
>
> I understand that coding=2 stands for UCS-2 but the problem I'm pointing
> out is that it doesn't actually re-encode the UTF8 bytes into actual UCS-2
> bytes.  This is inconsistent because it will convert utf8 to GSM, or to
> Latin-1 (if the alt-charset is set to Latin1).
>
> As for the "charset" parameter: from my understanding of the docs, it's
> actually irrelevant to the SMPP stuff. It is just for you to tell smsbox
> which charset your percent-encoded text is in (URLs only support ASCII).
> It defaults to UTF-8 in the newer versions, and this is what I prefer to
> use. But the important thing is that it has no relevance to the
> data_coding that gets sent over SMPP.
>
>
> On Fri, Mar 30, 2012 at 3:20 PM, spameden <spame...@gmail.com> wrote:
>
>> utf8 + coding=0 never worked for me for cyrillic text messages.
>>
>> the only combination is coding=2 & charset=utf8, otherwise I'm getting
>> bollocks on mobile screen.
>>
>> according to the kannel's documentation, coding is:
>>
>> coding number
>> Optional. Sets the coding
>> scheme bits in DCS field.
>> Accepts values 0 to 2, for 7bit,
>> 8bit or UCS-2. If unset, defaults
>> to 7 bits unless a udh is defined,
>> which sets coding to 8bits.
>>
>> so coding=2 stands for UCS-2 message.
>>
>>
>> 2012/3/31 chad selph <chad.se...@gmail.com>
>>
>>> I'm trying to figure out how to send different data encodings from
>>> Kannel 1.5.0 over SMPP.  The SMPP Spec lists the following options for
>>> data_coding field:
>>>
>>> 0 0 0 0 0 0 0 0 SMSC Default Alphabet
>>> 0 0 0 0 0 0 0 1 IA5 (CCITT T.50)/ASCII (ANSI X3.4)
>>> 0 0 0 0 0 0 1 0 Octet unspecified (8-bit binary)
>>> 0 0 0 0 0 0 1 1 Latin 1 (ISO-8859-1)
>>> 0 0 0 0 0 1 0 0 Octet unspecified (8-bit binary)
>>> 0 0 0 0 0 1 0 1 JIS (X 0208-1990)
>>> 0 0 0 0 0 1 1 0 Cyrillic (ISO-8859-5)
>>> 0 0 0 0 0 1 1 1 Latin/Hebrew (ISO-8859-8)
>>> 0 0 0 0 1 0 0 0 UCS2 (ISO/IEC-10646)
>>> ... and some others.
>>>
>>> To initiate MT messages, we're using the sendsms http interface on
>>> smsbox (the one here:
>>> http://www.kannel.org/download/1.5.0/userguide-1.5.0/userguide.html#AEN4623).
>>>   It looks like the only relevant parameter to sendsms is the
>>> "coding" parameter, which can only be 0, 1, or 2.  0 causes data_coding
>>> 0, 1 causes 4, and 2 causes 8.  I don't see a way to set data_coding to 3,
>>> for example, in order to do Latin-1.
>>>
>>> Another thing is that only 0 causes the message text to get encoded from
>>> UTF-8 (input encoding from http) into the correct encoding.  For example,
>>> sending UTF-8 data with coding=2 does not re-encode the message into
>>> UCS-2, but just sends your UTF-8 bytes as if they were UCS-2, whereas
>>> sending UTF-8 data with coding=0 does re-encode them into GSM.
>>>
>>> These things seem to me to be incorrect behavior, however given the wide
>>> use of kannel I figured I should make sure I'm not missing something
>>> obvious before I draft a patch to attempt to fix them.  Am I missing
>>> something?
>>>
>>
>>
>
>
