Hi,

Cyrillic can only be sent as UCS-2, hence coding=2.
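
To illustrate why, here is a small Python sketch (plain Python, not Kannel code; the sample word is just an assumption for the demo). It shows that Cyrillic text has no Latin-1 representation and that its UTF-8 bytes are not the same as its UCS-2 bytes, so UTF-8 data sent unconverted with coding=2 would be displayed as garbage:

```python
# Sample Cyrillic text, chosen only for illustration.
text = "Привет"

# UTF-8 is what typically arrives over HTTP.
utf8_bytes = text.encode("utf-8")

# UCS-2 on the wire is big-endian 16-bit code units; for characters in the
# Basic Multilingual Plane this is identical to UTF-16BE.
ucs2_bytes = text.encode("utf-16-be")

def fits_latin1(s):
    """Return True if every character of s exists in ISO-8859-1."""
    try:
        s.encode("latin-1")
        return True
    except UnicodeEncodeError:
        return False

print("UTF-8 :", utf8_bytes.hex())
print("UCS-2 :", ucs2_bytes.hex())
print("Latin-1 possible:", fits_latin1(text))
```

Both encodings happen to need 12 bytes for this word, but the byte values differ, which is why a real conversion step is required.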

Kannel's behavior for coding=2 and 3 is simple: it doesn't touch the data, which 
is treated as binary, and it is up to the user to encode it. BUT if you need 
Kannel to convert some charset to UCS-2 for you, then just use two 
params:
        charset=YOUR_CHARSET
        coding=2

Then kannel will do it for you.
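
As a rough sketch of such a request, the Python snippet below builds a sendsms URL with both parameters. The host, port, path, credentials and recipient are placeholders assumed for illustration; adjust them to your own smsbox configuration:

```python
from urllib.parse import urlencode

# Assumed placeholder endpoint; use your own smsbox host and sendsms port.
SENDSMS_URL = "http://localhost:13013/cgi-bin/sendsms"

def build_sendsms_url(text, charset="UTF-8"):
    """Build a sendsms URL asking Kannel to convert text from charset to UCS-2."""
    params = {
        "username": "tester",      # placeholder credentials
        "password": "secret",
        "to": "+10000000000",      # placeholder recipient
        "text": text,
        "charset": charset,        # charset of the percent-encoded text
        "coding": 2,               # request UCS-2
    }
    # Percent-encode the text with the same charset that is announced.
    return SENDSMS_URL + "?" + urlencode(params, encoding=charset)

url = build_sendsms_url("Привет")
print(url)
```

The text is percent-encoded in the charset named by the charset parameter, so the two always agree.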

Thanks,
Alex

On 31.03.2012 at 00:45, chad selph wrote:

> I understand that coding=2 stands for UCS-2 but the problem I'm pointing out 
> is that it doesn't actually re-encode the UTF8 bytes into actual UCS-2 bytes. 
>  This is inconsistent because it will convert utf8 to GSM, or to Latin-1 (if 
> the alt-charset is set to Latin1).
> 
> As far as the "charset" parameter goes: from my understanding of the docs, 
> it's actually irrelevant to the SMPP stuff; it is just for you to tell smsbox 
> which charset your percent-encoded text is in (URLs only support ASCII).  It 
> defaults to UTF-8 in the newer versions and this is what I prefer to use.  But 
> the important thing is that it has no relevance to the data_coding that gets 
> sent over SMPP.
> 
> 
> On Fri, Mar 30, 2012 at 3:20 PM, spameden <[email protected]> wrote:
> utf8 + coding=0 never worked for me for cyrillic text messages.
> 
> the only combination is coding=2 & charset=utf8, otherwise I'm getting 
> bollocks on mobile screen. 
> 
> according to the kannel's documentation, coding is:
> 
> coding (number): Optional. Sets the coding scheme bits in DCS field. 
> Accepts values 0 to 2, for 7bit, 8bit or UCS-2. If unset, defaults to 
> 7 bits unless a udh is defined, which sets coding to 8bits.
> 
> so coding=2 stands for UCS-2 message.
> 
> 
> 2012/3/31 chad selph <[email protected]>
> I'm trying to figure out how to send different data encodings from Kannel 
> 1.5.0 over SMPP.  The SMPP Spec lists the following options for data_coding 
> field:
> 
> 0 0 0 0 0 0 0 0 SMSC Default Alphabet
> 0 0 0 0 0 0 0 1 IA5 (CCITT T.50)/ASCII (ANSI X3.4)
> 0 0 0 0 0 0 1 0 Octet unspecified (8-bit binary)
> 0 0 0 0 0 0 1 1 Latin1(ISO-8859-1)
> 0 0 0 0 0 1 0 0 Octet unspecified (8-bit binary)
> 0 0 0 0 0 1 0 1 JIS(X0208-1990)
> 0 0 0 0 0 1 1 0 Cyrillic (ISO-8859-5)
> 0 0 0 0 0 1 1 1 Latin/Hebrew (ISO-8859-8)
> 0 0 0 0 1 0 0 0 UCS2(ISO/IEC-10646)
> ... and some others.
> 
> To initiate MT messages, we're using the sendsms http interface on smsbox 
> (the one here: 
> http://www.kannel.org/download/1.5.0/userguide-1.5.0/userguide.html#AEN4623 
> ).  It looks like the only relevant parameter into the sendsms is the 
> "coding" parameter, which can only be 0, 1, or 2.  "0" causes data_coding 0, 
> 1 causes 4, and 2 causes 8.  I don't see a way to set data_coding to 3, for 
> example, in order to do Latin-1.
> 
> Another thing is that only 0 causes the message text to get re-encoded from 
> UTF-8 (the input encoding from HTTP) into the correct encoding.  For example, 
> sending UTF-8 data with coding=2 does not re-encode the message into 
> UCS-2; it just sends your UTF-8 bytes as if they were UCS-2.  Sending UTF-8 
> data with coding=0, however, does re-encode them into GSM.
> 
> These things seem to me to be incorrect behavior, however given the wide use 
> of kannel I figured I should make sure I'm not missing something obvious 
> before I draft a patch to attempt to fix them.  Am I missing something?
> 
> 
