I think that when Chad wrote "But the important thing is that it has no relevance to the data_coding that gets sent over SMPP", he was referring to the coding specified in the SMPP message itself. I've just checked the Kannel SMPP logs: it sets data_coding: 8 = 0x00000008 when coding=2 is specified.
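
For reference, here is a minimal sketch of the request I mean, i.e. charset=utf8 together with coding=2 on the sendsms interface. The host, port, path, credentials and recipient are placeholders of mine, not values from this thread, so adjust them to your smsbox configuration.

```python
from urllib.parse import urlencode

# Placeholder endpoint (assumption): point this at your own smsbox.
SMSBOX_URL = "http://localhost:13013/cgi-bin/sendsms"


def build_sendsms_url(to, text, username="tester", password="secret"):
    """Build a sendsms URL asking Kannel to convert UTF-8 text to UCS-2.

    username/password defaults are hypothetical placeholders.
    """
    params = {
        "username": username,
        "password": password,
        "to": to,
        "text": text,       # urlencode percent-encodes this as UTF-8 by default
        "charset": "utf8",  # tells smsbox which charset the text is in
        "coding": 2,        # requests UCS-2
    }
    return SMSBOX_URL + "?" + urlencode(params)


if __name__ == "__main__":
    print(build_sendsms_url("+10000000000", "Привет"))
```

This only builds the URL; sending it is a plain HTTP GET with whatever client you already use.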

2012/4/2 Alexander Malysh <[email protected]>

> Then I don't understand what should be the issue here :-) ?
>
> Thanks,
> Alex
>
> On 01.04.2012 at 23:15, spameden wrote:
>
> Exactly what I've said :)
>
> If your source text is in utf8 you need to specify charset=utf8 and
> coding=2.
>
> 2012/4/2 Alexander Malysh <[email protected]>
>
>> Hi,
>>
>> Cyrillic can only be sent with UCS-2, therefore coding=2.
>>
>> Kannel's behavior for coding=2 and 3 is simple: don't touch it, it's
>> binary and up to the user to encode it, BUT if you need Kannel to
>> convert some charset to UCS-2 for you, then just use two params:
>>
>> charset=YOUR_CHARSET
>> coding=2
>>
>> Then Kannel will do it for you.
>>
>> Thanks,
>> Alex
>>
>> On 31.03.2012 at 00:45, chad selph wrote:
>>
>> I understand that coding=2 stands for UCS-2, but the problem I'm
>> pointing out is that it doesn't actually re-encode the UTF-8 bytes into
>> actual UCS-2 bytes. This is inconsistent, because it will convert UTF-8
>> to GSM, or to Latin-1 (if the alt-charset is set to Latin1).
>>
>> As for the "charset" parameter: from my understanding of the docs, it's
>> actually irrelevant to the SMPP stuff. It is just for you to tell smsbox
>> which charset your percent-encoded text is in (URLs only support ASCII).
>> It defaults to UTF-8 in the newer versions, and this is what I prefer to
>> use. But the important thing is that it has no relevance to the
>> data_coding that gets sent over SMPP.
>>
>>
>> On Fri, Mar 30, 2012 at 3:20 PM, spameden <[email protected]> wrote:
>>
>>> utf8 + coding=0 never worked for me for Cyrillic text messages.
>>>
>>> The only combination that works is coding=2 & charset=utf8; otherwise
>>> I'm getting bollocks on the mobile screen.
>>>
>>> According to Kannel's documentation, coding is:
>>>
>>> coding   number
>>>          Optional. Sets the coding scheme bits in DCS field.
>>>          Accepts values 0 to 2, for 7bit, 8bit or UCS-2.
>>>          If unset, defaults to 7 bits unless a udh is defined,
>>>          which sets coding to 8bits.
>>>
>>> So coding=2 stands for a UCS-2 message.
>>>
>>>
>>> 2012/3/31 chad selph <[email protected]>
>>>
>>>> I'm trying to figure out how to send different data encodings from
>>>> Kannel 1.5.0 over SMPP. The SMPP spec lists the following options for
>>>> the data_coding field:
>>>>
>>>> 0 0 0 0 0 0 0 0  SMSC Default Alphabet
>>>> 0 0 0 0 0 0 0 1  IA5 (CCITT T.50)/ASCII (ANSI X3.4)
>>>> 0 0 0 0 0 0 1 0  Octet unspecified (8-bit binary)
>>>> 0 0 0 0 0 0 1 1  Latin 1 (ISO-8859-1)
>>>> 0 0 0 0 0 1 0 0  Octet unspecified (8-bit binary)
>>>> 0 0 0 0 0 1 0 1  JIS (X 0208-1990)
>>>> 0 0 0 0 0 1 1 0  Cyrillic (ISO-8859-5)
>>>> 0 0 0 0 0 1 1 1  Latin/Hebrew (ISO-8859-8)
>>>> 0 0 0 0 1 0 0 0  UCS2 (ISO/IEC-10646)
>>>> ... and some others.
>>>>
>>>> To initiate MT messages, we're using the sendsms HTTP interface on
>>>> smsbox (the one here:
>>>> http://www.kannel.org/download/1.5.0/userguide-1.5.0/userguide.html#AEN4623).
>>>> It looks like the only relevant parameter to sendsms is the "coding"
>>>> parameter, which can only be 0, 1, or 2. 0 causes data_coding 0,
>>>> 1 causes 4, and 2 causes 8. I don't see a way to set data_coding to 3,
>>>> for example, in order to do Latin-1.
>>>>
>>>> Another thing is that only 0 causes the message text to get re-encoded
>>>> from UTF-8 (the input encoding from HTTP) into the correct encoding.
>>>> For example, sending UTF-8 data with coding=2 does not re-encode the
>>>> message into UCS-2, but just sends your UTF-8 bytes as if they were
>>>> UCS-2, whereas sending UTF-8 data with coding=0 does re-encode them
>>>> into GSM.
>>>>
>>>> These things seem to me to be incorrect behavior. However, given the
>>>> wide use of Kannel, I figured I should make sure I'm not missing
>>>> something obvious before I draft a patch to attempt to fix them. Am I
>>>> missing something?
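
To make the byte-level point in the quoted discussion concrete, here is a small sketch. It is not Kannel code: the mapping is just the coding to data_coding behavior reported in this thread, and to_ucs2 is a hypothetical helper that treats UCS-2 as big-endian UTF-16 restricted to the Basic Multilingual Plane.

```python
# coding -> data_coding, as observed in this thread (not read from Kannel source).
OBSERVED_DATA_CODING = {0: 0x00, 1: 0x04, 2: 0x08}


def to_ucs2(text):
    """Hypothetical helper: encode text as UCS-2 (big-endian, BMP only)."""
    if any(ord(ch) > 0xFFFF for ch in text):
        raise ValueError("character outside the BMP cannot be represented in UCS-2")
    return text.encode("utf-16-be")


text = "Привет"
utf8_bytes = text.encode("utf-8")
ucs2_bytes = to_ucs2(text)

if __name__ == "__main__":
    print("UTF-8:", utf8_bytes.hex())
    print("UCS-2:", ucs2_bytes.hex())
    # Reading the UTF-8 bytes as if they were UCS-2 yields unrelated characters,
    # which is what a handset shows if the bytes are passed through unconverted.
    print("UTF-8 misread as UCS-2:", utf8_bytes.decode("utf-16-be"))
```

The two encodings happen to have the same length for this string, but the bytes differ, so the text has to be converted somewhere before it goes out with data_coding 8.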
