Exactly what I've said :) If your source text is in utf8 you need to specify charset=utf8 and coding=2.
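
For illustration, here is a rough Python sketch of such a request, not a definitive client. The host, port, path, credentials and the helper name build_sendsms_url are placeholders I made up for the example; adjust them to your own smsbox setup. The charset is spelled "UTF-8" here, and the "utf8" spelling used in this thread should be equivalent. The last two lines only show why the conversion matters: the UTF-8 bytes and the UCS-2 bytes of the same Cyrillic text differ.

```python
from urllib.parse import urlencode

# Hypothetical helper: base URL and credentials are placeholders, not
# values taken from any real installation.
def build_sendsms_url(text, to,
                      base="http://localhost:13013/cgi-bin/sendsms",
                      username="tester", password="secret"):
    params = {
        "username": username,
        "password": password,
        "to": to,
        "text": text,        # percent-encoded as UTF-8 by urlencode
        "charset": "UTF-8",  # tells smsbox which charset the text is in
        "coding": "2",       # ask for UCS-2 on the wire
    }
    return base + "?" + urlencode(params, encoding="utf-8")

text = "Привет"
url = build_sendsms_url(text, "+10000000000")

# Same characters, different byte sequences:
utf8_bytes = text.encode("utf-8")      # what goes into the URL
ucs2_bytes = text.encode("utf-16-be")  # UCS-2 form (BMP characters only)
```

With charset and coding=2 both set, the conversion from the first byte sequence to the second is left to kannel, as Alex describes below.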

2012/4/2 Alexander Malysh <amal...@kannel.org>

> Hi,
>
> cyrillic can only be sent with ucs2, therefore coding=2.
>
> Kannel behavior for coding=2 and 3 is simple: don't touch it, it's binary
> and up to the user to encode it, BUT
> if you need kannel to convert some charset to ucs2 for you, then just
> use two params:
> charset=YOUR_CHARSET
> coding=2
>
> Then kannel will do it for you.
>
> Thanks,
> Alex
>
> On 31.03.2012 at 00:45, chad selph wrote:
>
> I understand that coding=2 stands for UCS-2 but the problem I'm pointing
> out is that it doesn't actually re-encode the UTF8 bytes into actual UCS-2
> bytes. This is inconsistent because it will convert utf8 to GSM, or to
> Latin-1 (if the alt-charset is set to Latin1).
>
> As far as the "charset" parameter: from my understanding of the docs, it's
> actually irrelevant to the SMPP stuff, this is just for you to tell smsbox
> which percent encoding your text is in (URLs only support ascii). It
> defaults to UTF-8 in the newer versions and this is what I prefer to use.
> But the important thing is that it has no relevance to the data_coding
> that gets sent over SMPP.
>
>
> On Fri, Mar 30, 2012 at 3:20 PM, spameden <spame...@gmail.com> wrote:
>
>> utf8 + coding=0 never worked for me for cyrillic text messages.
>>
>> the only combination is coding=2 & charset=utf8, otherwise I'm getting
>> bollocks on mobile screen.
>>
>> according to the kannel's documentation, coding is:
>>
>> coding number
>> Optional. Sets the coding
>> scheme bits in DCS field.
>> Accepts values 0 to 2, for 7bit,
>> 8bit or UCS-2. If unset, defaults
>> to 7 bits unless a udh is defined,
>> which sets coding to 8bits.
>>
>> so coding=2 stands for UCS-2 message.
>>
>>
>> 2012/3/31 chad selph <chad.se...@gmail.com>
>>
>>> I'm trying to figure out how to send different data encodings from
>>> Kannel 1.5.0 over SMPP. The SMPP Spec lists the following options for
>>> data_coding field:
>>>
>>> 0 0 0 0 0 0 0 0   SMSC Default Alphabet
>>> 0 0 0 0 0 0 0 1   IA5 (CCITT T.50)/ASCII (ANSI X3.4)
>>> 0 0 0 0 0 0 1 0   Octet unspecified (8-bit binary)
>>> 0 0 0 0 0 0 1 1   Latin 1 (ISO-8859-1)
>>> 0 0 0 0 0 1 0 0   Octet unspecified (8-bit binary)
>>> 0 0 0 0 0 1 0 1   JIS (X 0208-1990)
>>> 0 0 0 0 0 1 1 0   Cyrillic (ISO-8859-5)
>>> 0 0 0 0 0 1 1 1   Latin/Hebrew (ISO-8859-8)
>>> 0 0 0 0 1 0 0 0   UCS2 (ISO/IEC-10646)
>>> ... and some others.
>>>
>>> To initiate MT messages, we're using the sendsms http interface on
>>> smsbox (the one here:
>>> http://www.kannel.org/download/1.5.0/userguide-1.5.0/userguide.html#AEN4623).
>>> It looks like the only relevant parameter into the sendsms is the
>>> "coding" parameter, which can only be 0, 1, or 2. "0" causes data_coding
>>> 0, 1 causes 4, and 2 causes 8. I don't see a way to set data_coding to 3,
>>> for example, in order to do Latin-1.
>>>
>>> Another thing is that only 0 causes the message text to get encoded from
>>> UTF-8 (input encoding from http) into the correct encoding. For example,
>>> sending the UTF-8 data with coding=2 does not re-encode the message into
>>> UCS-2, but just sends your UTF-8 bytes as if they were UCS-2, whereas
>>> sending utf8 data with coding=0 does re-encode them into GSM.
>>>
>>> These things seem to me to be incorrect behavior, however given the wide
>>> use of kannel I figured I should make sure I'm not missing something
>>> obvious before I draft a patch to attempt to fix them. Am I missing
>>> something?