Re: Kannel meta-data (TLV) branch mangles MTs with carriage return characters received via HTTP response to MO?

Alexander Malysh Wed, 23 Jan 2008 05:57:24 -0800

Hi,

did you define 'split-chars' in your config and if so please post it here,
so I can test it.


Giulio Harding wrote:

> I'm kind of stumped with this problem, so I figured I'd write up a
> summary of my investigation so far, in case someone else might have
> an idea...
> 
> On 23/01/2008, at 12:34 AM, Giulio Harding wrote:
> 
>> I've actually had to back out of using the meta-data Kannel, as we
>> spotted what looks to be a problem in the way Kannel handles
>> carriage return characters (ctrl-M) in synchronous responses to MO
>> HTTP requests - the carriage returns arrive at smsbox fine, but by
>> the time they leave the bearerbox in an SMPP PDU, they somehow end
>> up actually mangling the MT message body (text after the carriage
>> returns is missing, subsequent text segments overwrite the start of
>> the message body, etc.). Asynchronous MT HTTP requests aren't
>> affected...
>>
>> Not 100% sure whether it's a Kannel problem or something else (site-
>> specific) - I'll do a bit more investigation tomorrow morning and
>> get back to you ASAP.
> 
> 
> When we tested the patched TLV-enabled branch of Kannel (hereby
> referred to as *NEW* Kannel) a few days ago, everything worked fine,
> except that some messages coming from one particular application
> appeared to be being mangled by Kannel - from our kannel traffic log:
> 
> ...
> 2008-01-22 03:10:27 SMS HTTP-request sender:XXX request: 'Azure
> hotspot PIN^Mfrom Vodafone^MXXXXXXXX' url: 'http://
> 10.110.123.101:8404/?
> msgData=Go&sourceAddr=XXX&channel=vodafonechmt&destinationAddr=19929873'
>   reply: 200 '<< successful >>'
> 2008-01-22 03:10:28 Sent SMS [SMSC:vodafonechmt] [SVC:kannel_gw]
> [ACT:] [BINF:] [from:19929873] [to:XXX] [flags:-1:0:-1:-1:-1] [msg:
> 23:Azure hotspot PIN.from ] [udh:0:] [id:
> 7936719e-1c7c-4e90-8161-6f14af4f23f7]
> ...
> 
> So, going in, the message is intact, but going out, it's had a '.'
> character inserted, and seems to be truncated.
> 
> This wasn't a problem with our original Kannel instance (hereby
> referred to as *OLD* Kannel) - Kannel bearerbox version
> `cvs-20060727'. Build `Apr 12 2007 17:01:36', compiler `3.4.6
> 20060404 (Red Hat 3.4.6-3)'. System Linux, release
> 2.6.9-55.0.2.ELsmp, version #1 SMP Tue Jun 26 14:14:47 EDT 2007,
> machine x86_64. Hostname smsgw2.appgw.mnetcorporation.com, IP
> 10.110.123.31. Libxml version 2.6.16. Using native malloc.
> 
> ...
> 2008-01-22 00:43:57 SMS HTTP-request sender:XXX request: 'Azure
> hotspot PIN^Mfrom Optus^MXXXXXXXX' url: 'http://10.110.123.101:8404/?
> msgData=Go&sourceAddr=XXX&channel=optuschmt&destinationAddr=19929873'
> reply: 200 '<< successful >>'
> 2008-01-22 00:43:57 Sent SMS [SMSC:optuschmt] [SVC:kannel_gw] [ACT:]
> [BINF:] [from:19929873] [to:XXX] [flags:-1:0:-1:-1:-1] [msg:37:Azure
> hotspot PIN^Mfrom Optus^MXXXXXXXX] [udh:0:] [id:86dfd47e-0106-4700-
> ab48-47381c43047c]
> ...
> 
> Also, part of the problem seems to be triggered by the presence of
> the carriage return characters (^M) and part seems to be their
> presence in the HTTP response to an MO request.
> 
> In *OLD* Kannel, a separate MT request with the same text works fine:
> 
> ...
> 2008-01-23 23:21:12 send-SMS request added - sender:kannelpush:test
> 10.110.123.101 target:XXX request: 'Azure hotspot PIN^Mfrom
> Optus^MXXXXXXXXX'
> 2008-01-23 23:21:13 Sent SMS [SMSC:mbloxdirect] [SVC:kannelpush]
> [ACT:] [BINF:] [from:test] [to:XXX] [flags:-1:0:-1:-1:-1] [msg:
> 37:Azure hotspot PIN^Mfrom Optus^MXXXXXXXX] [udh:0:] [id:09ba2907-
> f013-46ff-9ec0-de006422d0b4]
> ...
> 
> But in *NEW* Kannel, a separate MT request with the same text still
> has problems: it's not truncated any more, but it has '.' characters
> instead of carriage returns:
> 
> ...
> 2008-01-22 13:49:55 send-SMS request added - sender:kannelpush:
> 19996737 127.0.0.1 target:XXXX request: 'Azure hotspot PIN^Mfrom
> Optus^MXXXXXXXX'
> 2008-01-22 13:49:56 Sent SMS [SMSC:mbloxpsms] [SVC:kannelpush] [ACT:]
> [BINF:] [from:19996737] [to:XXX] [flags:-1:0:-1:-1:-1] [msg:37:Azure
> hotspot PIN.from Optus.XXXXXXXX] [udh:0:]
> [id:d9308e85-4790-4714-9ac9-48a72159ac31]
> ...
> 
> So, with a few extra debug/trace statements, I was able to narrow
> down the mangling as possibly occurring in the
> extract_msgdata_part_by_coding function, in gw/sms.c:
> 
> 
> 
> static Octstr *extract_msgdata_part_by_coding(Msg *msg, Octstr
> *split_chars,
>          int max_part_len)
> {
>      Octstr *temp = NULL, *temp_utf;
> 
>      if (msg->sms.coding == DC_8BIT || msg->sms.coding == DC_UCS2) {
>          /* nothing to do here, just call the original
> extract_msgdata_part */
>          return extract_msgdata_part(msg->sms.msgdata, split_chars,
> max_part_len);
>      }
> 
>      /* convert to and the from gsm, so we drop all non GSM chars */
>      charset_utf8_to_gsm(msg->sms.msgdata);
>      charset_gsm_to_utf8(msg->sms.msgdata);
> 
>      /*
>       * else we need to do something special. I'll just get
> charset_gsm_truncate to
>       * cut the string to the required length and then count real
> characters.
>       */
>       temp = octstr_duplicate(msg->sms.msgdata);
>       charset_utf8_to_gsm(temp);
>       charset_gsm_truncate(temp, max_part_len);
> 
>       /* calculate utf-8 length */
>       temp_utf = octstr_duplicate(temp);
>       charset_gsm_to_utf8(temp_utf);
>       max_part_len = octstr_len(temp_utf);
> 
>       octstr_destroy(temp);
>       octstr_destroy(temp_utf);
> 
>       /* now just call the original extract_msgdata_part with the new
> length */
>       return extract_msgdata_part(msg->sms.msgdata, split_chars,
> max_part_len);
> }
> 
> 
> This has changed noticeably from our *OLD* Kannel:
> 
> 
> static Octstr *extract_msgdata_part_by_coding(Msg *msg, Octstr
> *split_chars,
>          int max_part_len)
> {
>      Octstr *temp = NULL;
>      int pos, esc_count;
> 
>      if (msg->sms.coding == DC_8BIT || msg->sms.coding == DC_UCS2) {
>          /* nothing to do here, just call the original
> extract_msgdata_part */
>          return extract_msgdata_part(msg->sms.msgdata, split_chars,
> max_part_len);
>      }
> 
>      /*
>       * else we need to do something special. I'll just get
> charset_gsm_truncate to
>       * cut the string to the required length and then count real
> characters.
>       */
>       temp = octstr_duplicate(msg->sms.msgdata);
>       charset_latin1_to_gsm(temp);
>       charset_gsm_truncate(temp, max_part_len);
> 
>       pos = esc_count = 0;
> 
>       while ((pos = octstr_search_char(temp, 27, pos)) != -1) {
>          ++pos;
>           ++esc_count;
>       }
> 
>       octstr_destroy(temp);
> 
>       /* now just call the original extract_msgdata_part with the new
> length */
>       return extract_msgdata_part(msg->sms.msgdata, split_chars,
> max_part_len - esc_count);
> }
> 
> 
> The biggest change is the use of UTF-8 as the internal encoding, I
> think - however, one part of the mangling, the truncation, appears to
> be happening in the extract_msgdata_part function. I jammed in some
> extra traces, like so:
> 
> 
> 
> static Octstr *extract_msgdata_part(Octstr *msgdata, Octstr
> *split_chars,
>                                      int max_part_len)
> {
> debug("sms", 0, "X1: %s", octstr_get_cstr(msgdata));
>      long i, len;
>      Octstr *part;
> 
> debug("sms", 0, "split: %s", octstr_get_cstr(split_chars));
> debug("sms", 0, "max_part_len: %i", max_part_len);
>      len = max_part_len;
>      if (split_chars != NULL) {
> debug("sms", 0, "split_chars not null");
>          for (i = max_part_len; i > 0; i--) {
> debug("sms", 0, "working back from max_part_len: %i", i);
>              if (octstr_search_char(split_chars,
>                                     octstr_get_char(msgdata, i - 1),
> 0) != -1) {
> debug("sms", 0, "something something, at char %i ->%c<-", i,
> octstr_get_char(msgdata, i - 1));
>                  len = i;
>                  break;
>              }
>          }
>      }
>      part = octstr_copy(msgdata, 0, len);
> debug("sms", 0, "len: %i", len);
> debug("sms", 0, "X2: %s", octstr_get_cstr(part));
>      octstr_delete(msgdata, 0, len);
>      return part;
> }
> 
> 
> And got the following output:
> 
> 
> 2008-01-23 22:31:34 [17246] [5] DEBUG: X1: Azure hotspot PIN^Mfrom
> Vodafone^MXXXXXXXX
> 2008-01-23 22:31:34 [17246] [5] DEBUG: split:
> 2008-01-23 22:31:34 [17246] [5] DEBUG: max_part_len: 41
> 2008-01-23 22:31:34 [17246] [5] DEBUG: split_chars not null
> 2008-01-23 22:31:34 [17246] [5] DEBUG: working back from
> max_part_len: 41
> 2008-01-23 22:31:34 [17246] [5] DEBUG: working back from
> max_part_len: 40
> 2008-01-23 22:31:34 [17246] [5] DEBUG: working back from
> max_part_len: 39
> 2008-01-23 22:31:34 [17246] [5] DEBUG: working back from
> max_part_len: 38
> 2008-01-23 22:31:34 [17246] [5] DEBUG: working back from
> max_part_len: 37
> 2008-01-23 22:31:34 [17246] [5] DEBUG: working back from
> max_part_len: 36
> 2008-01-23 22:31:34 [17246] [5] DEBUG: working back from
> max_part_len: 35
> 2008-01-23 22:31:34 [17246] [5] DEBUG: working back from
> max_part_len: 34
> 2008-01-23 22:31:34 [17246] [5] DEBUG: working back from
> max_part_len: 33
> 2008-01-23 22:31:34 [17246] [5] DEBUG: working back from
> max_part_len: 32
> 2008-01-23 22:31:34 [17246] [5] DEBUG: working back from
> max_part_len: 31
> 2008-01-23 22:31:34 [17246] [5] DEBUG: working back from
> max_part_len: 30
> 2008-01-23 22:31:34 [17246] [5] DEBUG: working back from
> max_part_len: 29
> 2008-01-23 22:31:34 [17246] [5] DEBUG: working back from
> max_part_len: 28
> 2008-01-23 22:31:34 [17246] [5] DEBUG: working back from
> max_part_len: 27
> 2008-01-23 22:31:34 [17246] [5] DEBUG: working back from
> max_part_len: 26
> 2008-01-23 22:31:34 [17246] [5] DEBUG: working back from
> max_part_len: 25
> 2008-01-23 22:31:34 [17246] [5] DEBUG: working back from
> max_part_len: 24
> 2008-01-23 22:31:34 [17246] [5] DEBUG: working back from
> max_part_len: 23
> 2008-01-23 22:31:34 [17246] [5] DEBUG: something something, at char
> 23 -> <-
> 2008-01-23 22:31:34 [17246] [5] DEBUG: len: 23
> 2008-01-23 22:31:34 [17246] [5] DEBUG: X2: Azure hotspot PIN^Mfrom
> 2008-01-23 22:31:34 [17246] [5] DEBUG: Extract: Azure hotspot PIN^Mfrom
> 2008-01-23 22:31:34 [17246] [5] DEBUG: message length 41, sending 1
> messages
> 
> 
> So, that loop appears to be truncating the message text - but I'm not
> sure why. Can someone explain that code? Is that behaviour incorrect?
> (It looks like it to me)
> 
> As for the carriage returns becoming '.' characters, I haven't
> located where that's occurring, but I'm wondering whether that might
> be due to the switch to UTF8 internally?
> 
> Anyway, like I said, I'm a bit out of my depth here - does anyone
> have any ideas? Is anyone else able to replicate this strange
> behaviour with carriage returns in the meta_data branch? (or even the
> latest CVS snapshot, since this doesn't seem to be TLV-specific...)
> 
> Thanks,
> 
> --
> Giulio Harding
> Systems Administrator
> 
> m.Net Corporation
> Level 2, 8 Leigh Street
> Adelaide SA 5000, Australia
> 
> Tel: +61 8 8210 2041
> Fax: +61 8 8211 9620
> Mobile: 0432 876 733
> Yahoo: giulio.harding
> MSN: [EMAIL PROTECTED]
> 
> http://www.mnetcorporation.com

-- 
Thanks,
Alex

Re: Kannel meta-data (TLV) branch mangles MTs with carriage return characters received via HTTP response to MO?

Reply via email to