Hi,
Cyrillic can only be sent as UCS-2, hence coding=2.
Kannel's behavior for coding=2 and 3 is simple: it doesn't touch the data. It's
treated as binary, and it's up to the user to encode it. BUT
if you need Kannel to convert some charset to UCS-2 for you, then just use two
params:
charset=YOUR_CHARSET
coding=2
Then kannel will do it for you.
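
To make the two-parameter approach concrete, here is a minimal Python sketch of
building such a sendsms request. The host, port, path, credentials and
recipient are placeholders (assumptions, adjust them to your smsbox config);
only the parameter names come from the sendsms interface. The second helper
just shows what "encode it yourself" would mean for UCS-2, which for these
characters is the same as big-endian UTF-16.

```python
from urllib.parse import urlencode

# Assumption: smsbox listens here; host, port and path depend on your config.
SENDSMS_URL = "http://localhost:13013/cgi-bin/sendsms"


def build_sendsms_url(text, to, username="tester", password="secret"):
    """Build a sendsms URL that asks Kannel to convert UTF-8 text to UCS-2."""
    params = {
        "username": username,  # placeholder credentials
        "password": password,
        "to": to,
        "text": text,          # urlencode percent-encodes the UTF-8 bytes
        "charset": "UTF-8",    # tells smsbox which charset the text is in
        "coding": "2",         # request UCS-2
    }
    return SENDSMS_URL + "?" + urlencode(params, encoding="utf-8")


def ucs2_bytes(text):
    """Encode text as UCS-2 by hand (UTF-16BE for characters in the BMP)."""
    return text.encode("utf-16-be")


url = build_sendsms_url("Привет", "+10000000000")
print(url)
print(ucs2_bytes("Привет").hex())
```

The percent-encoded text in the URL is still UTF-8; with charset set, the
conversion to UCS-2 is left to Kannel.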
Thanks,
Alex
On 31.03.2012 at 00:45, chad selph wrote:
> I understand that coding=2 stands for UCS-2 but the problem I'm pointing out
> is that it doesn't actually re-encode the UTF8 bytes into actual UCS-2 bytes.
> This is inconsistent because it will convert utf8 to GSM, or to Latin-1 (if
> the alt-charset is set to Latin1).
>
> As for the "charset" parameter: from my understanding of the docs, it's
> actually irrelevant to the SMPP stuff; it's just for you to tell smsbox
> which charset your percent-encoded text is in (URLs only support ASCII). It
> defaults to UTF-8 in the newer versions, and this is what I prefer to use. But
> the important thing is that it has no relevance to the data_coding that gets
> sent over SMPP.
>
>
> On Fri, Mar 30, 2012 at 3:20 PM, spameden <[email protected]> wrote:
> utf8 + coding=0 never worked for me for cyrillic text messages.
>
> the only combination is coding=2 & charset=utf8, otherwise I'm getting
> bollocks on mobile screen.
>
> according to the kannel's documentation, coding is:
>
> coding (number): Optional. Sets the coding scheme bits in DCS field.
> Accepts values 0 to 2, for 7bit, 8bit or UCS-2. If unset, defaults to
> 7 bits unless a udh is defined, which sets coding to 8bits.
>
> so coding=2 stands for UCS-2 message.
>
>
> 2012/3/31 chad selph <[email protected]>
> I'm trying to figure out how to send different data encodings from Kannel
> 1.5.0 over SMPP. The SMPP spec lists the following options for the
> data_coding field:
>
> 0 0 0 0 0 0 0 0 SMSC Default Alphabet
> 0 0 0 0 0 0 0 1 IA5 (CCITT T.50)/ASCII (ANSI X3.4)
> 0 0 0 0 0 0 1 0 Octet unspecified (8-bit binary)
> 0 0 0 0 0 0 1 1 Latin 1 (ISO-8859-1)
> 0 0 0 0 0 1 0 0 Octet unspecified (8-bit binary)
> 0 0 0 0 0 1 0 1 JIS (X 0208-1990)
> 0 0 0 0 0 1 1 0 Cyrillic (ISO-8859-5)
> 0 0 0 0 0 1 1 1 Latin/Hebrew (ISO-8859-8)
> 0 0 0 0 1 0 0 0 UCS2 (ISO/IEC-10646)
> ... and some others.
>
> To initiate MT messages, we're using the sendsms http interface on smsbox
> (the one here:
> http://www.kannel.org/download/1.5.0/userguide-1.5.0/userguide.html#AEN4623
> ). It looks like the only relevant sendsms parameter is "coding", which
> can only be 0, 1, or 2: 0 results in data_coding 0, 1 results in
> data_coding 4, and 2 results in data_coding 8. I don't see a way to set
> data_coding to 3, for example, in order to do Latin-1.
>
> Another thing is that only coding=0 causes the message text to get re-encoded
> from UTF-8 (the input encoding from HTTP) into the correct encoding. For
> example, sending UTF-8 data with coding=2 does not re-encode the message
> into UCS-2; it just sends your UTF-8 bytes as if they were UCS-2. Sending
> UTF-8 data with coding=0, on the other hand, does re-encode it into GSM.
>
> These things seem like incorrect behavior to me. However, given the wide use
> of Kannel, I figured I should make sure I'm not missing something obvious
> before I draft a patch to attempt to fix them. Am I missing something?
>
>