Hi, did you define 'split-chars' in your config and if so please post it here, so I can test it.
Giulio Harding wrote: > I'm kind of stumped with this problem, so I figured I'd write up a > summary of my investigation so far, in case someone else might have > an idea... > > On 23/01/2008, at 12:34 AM, Giulio Harding wrote: > >> I've actually had to back out of using the meta-data Kannel, as we >> spotted what looks to be a problem in the way Kannel handles >> carriage return characters (ctrl-M) in synchronous responses to MO >> HTTP requests - the carriage returns arrive at smsbox fine, but by >> the time they leave the bearerbox in an SMPP PDU, they somehow end >> up actually mangling the MT message body (text after the carriage >> returns is missing, subsequent text segments overwrite the start of >> the message body, etc.). Asynchronous MT HTTP requests aren't >> affected... >> >> Not 100% sure whether it's a Kannel problem or something else (site- >> specific) - I'll do a bit more investigation tomorrow morning and >> get back to you ASAP. > > > When we tested the patched TLV-enabled branch of Kannel (hereby > referred to as *NEW* Kannel) a few days ago, everything worked fine, > except that some messages coming from one particular application > appeared to be being mangled by Kannel - from our kannel traffic log: > > ... > 2008-01-22 03:10:27 SMS HTTP-request sender:XXX request: 'Azure > hotspot PIN^Mfrom Vodafone^MXXXXXXXX' url: 'http:// > 10.110.123.101:8404/? > msgData=Go&sourceAddr=XXX&channel=vodafonechmt&destinationAddr=19929873' > reply: 200 '<< successful >>' > 2008-01-22 03:10:28 Sent SMS [SMSC:vodafonechmt] [SVC:kannel_gw] > [ACT:] [BINF:] [from:19929873] [to:XXX] [flags:-1:0:-1:-1:-1] [msg: > 23:Azure hotspot PIN.from ] [udh:0:] [id: > 7936719e-1c7c-4e90-8161-6f14af4f23f7] > ... > > So, going in, the message is intact, but going out, it's had a '.' > character inserted, and seems to be truncated. > > This wasn't a problem with our original Kannel instance (hereby > referred to as *OLD* Kannel) - Kannel bearerbox version > `cvs-20060727'. Build `Apr 12 2007 17:01:36', compiler `3.4.6 > 20060404 (Red Hat 3.4.6-3)'. System Linux, release > 2.6.9-55.0.2.ELsmp, version #1 SMP Tue Jun 26 14:14:47 EDT 2007, > machine x86_64. Hostname smsgw2.appgw.mnetcorporation.com, IP > 10.110.123.31. Libxml version 2.6.16. Using native malloc. > > ... > 2008-01-22 00:43:57 SMS HTTP-request sender:XXX request: 'Azure > hotspot PIN^Mfrom Optus^MXXXXXXXX' url: 'http://10.110.123.101:8404/? > msgData=Go&sourceAddr=XXX&channel=optuschmt&destinationAddr=19929873' > reply: 200 '<< successful >>' > 2008-01-22 00:43:57 Sent SMS [SMSC:optuschmt] [SVC:kannel_gw] [ACT:] > [BINF:] [from:19929873] [to:XXX] [flags:-1:0:-1:-1:-1] [msg:37:Azure > hotspot PIN^Mfrom Optus^MXXXXXXXX] [udh:0:] [id:86dfd47e-0106-4700- > ab48-47381c43047c] > ... > > Also, part of the problem seems to be triggered by the presence of > the carriage return characters (^M) and part seems to be their > presence in the HTTP response to an MO request. > > In *OLD* Kannel, a separate MT request with the same text works fine: > > ... > 2008-01-23 23:21:12 send-SMS request added - sender:kannelpush:test > 10.110.123.101 target:XXX request: 'Azure hotspot PIN^Mfrom > Optus^MXXXXXXXXX' > 2008-01-23 23:21:13 Sent SMS [SMSC:mbloxdirect] [SVC:kannelpush] > [ACT:] [BINF:] [from:test] [to:XXX] [flags:-1:0:-1:-1:-1] [msg: > 37:Azure hotspot PIN^Mfrom Optus^MXXXXXXXX] [udh:0:] [id:09ba2907- > f013-46ff-9ec0-de006422d0b4] > ... > > But in *NEW* Kannel, a separate MT request with the same text still > has problems: it's not truncated any more, but it has '.' characters > instead of carriage returns: > > ... > 2008-01-22 13:49:55 send-SMS request added - sender:kannelpush: > 19996737 127.0.0.1 target:XXXX request: 'Azure hotspot PIN^Mfrom > Optus^MXXXXXXXX' > 2008-01-22 13:49:56 Sent SMS [SMSC:mbloxpsms] [SVC:kannelpush] [ACT:] > [BINF:] [from:19996737] [to:XXX] [flags:-1:0:-1:-1:-1] [msg:37:Azure > hotspot PIN.from Optus.XXXXXXXX] [udh:0:] > [id:d9308e85-4790-4714-9ac9-48a72159ac31] > ... > > So, with a few extra debug/trace statements, I was able to narrow > down the mangling as possibly occurring in the > extract_msgdata_part_by_coding function, in gw/sms.c: > > > > static Octstr *extract_msgdata_part_by_coding(Msg *msg, Octstr > *split_chars, > int max_part_len) > { > Octstr *temp = NULL, *temp_utf; > > if (msg->sms.coding == DC_8BIT || msg->sms.coding == DC_UCS2) { > /* nothing to do here, just call the original > extract_msgdata_part */ > return extract_msgdata_part(msg->sms.msgdata, split_chars, > max_part_len); > } > > /* convert to and the from gsm, so we drop all non GSM chars */ > charset_utf8_to_gsm(msg->sms.msgdata); > charset_gsm_to_utf8(msg->sms.msgdata); > > /* > * else we need to do something special. I'll just get > charset_gsm_truncate to > * cut the string to the required length and then count real > characters. > */ > temp = octstr_duplicate(msg->sms.msgdata); > charset_utf8_to_gsm(temp); > charset_gsm_truncate(temp, max_part_len); > > /* calculate utf-8 length */ > temp_utf = octstr_duplicate(temp); > charset_gsm_to_utf8(temp_utf); > max_part_len = octstr_len(temp_utf); > > octstr_destroy(temp); > octstr_destroy(temp_utf); > > /* now just call the original extract_msgdata_part with the new > length */ > return extract_msgdata_part(msg->sms.msgdata, split_chars, > max_part_len); > } > > > This has changed noticeably from our *OLD* Kannel: > > > static Octstr *extract_msgdata_part_by_coding(Msg *msg, Octstr > *split_chars, > int max_part_len) > { > Octstr *temp = NULL; > int pos, esc_count; > > if (msg->sms.coding == DC_8BIT || msg->sms.coding == DC_UCS2) { > /* nothing to do here, just call the original > extract_msgdata_part */ > return extract_msgdata_part(msg->sms.msgdata, split_chars, > max_part_len); > } > > /* > * else we need to do something special. I'll just get > charset_gsm_truncate to > * cut the string to the required length and then count real > characters. > */ > temp = octstr_duplicate(msg->sms.msgdata); > charset_latin1_to_gsm(temp); > charset_gsm_truncate(temp, max_part_len); > > pos = esc_count = 0; > > while ((pos = octstr_search_char(temp, 27, pos)) != -1) { > ++pos; > ++esc_count; > } > > octstr_destroy(temp); > > /* now just call the original extract_msgdata_part with the new > length */ > return extract_msgdata_part(msg->sms.msgdata, split_chars, > max_part_len - esc_count); > } > > > The biggest change is the use of UTF-8 as the internal encoding, I > think - however, one part of the mangling, the truncation, appears to > be happening in the extract_msgdata_part function. I jammed in some > extra traces, like so: > > > > static Octstr *extract_msgdata_part(Octstr *msgdata, Octstr > *split_chars, > int max_part_len) > { > debug("sms", 0, "X1: %s", octstr_get_cstr(msgdata)); > long i, len; > Octstr *part; > > debug("sms", 0, "split: %s", octstr_get_cstr(split_chars)); > debug("sms", 0, "max_part_len: %i", max_part_len); > len = max_part_len; > if (split_chars != NULL) { > debug("sms", 0, "split_chars not null"); > for (i = max_part_len; i > 0; i--) { > debug("sms", 0, "working back from max_part_len: %i", i); > if (octstr_search_char(split_chars, > octstr_get_char(msgdata, i - 1), > 0) != -1) { > debug("sms", 0, "something something, at char %i ->%c<-", i, > octstr_get_char(msgdata, i - 1)); > len = i; > break; > } > } > } > part = octstr_copy(msgdata, 0, len); > debug("sms", 0, "len: %i", len); > debug("sms", 0, "X2: %s", octstr_get_cstr(part)); > octstr_delete(msgdata, 0, len); > return part; > } > > > And got the following output: > > > 2008-01-23 22:31:34 [17246] [5] DEBUG: X1: Azure hotspot PIN^Mfrom > Vodafone^MXXXXXXXX > 2008-01-23 22:31:34 [17246] [5] DEBUG: split: > 2008-01-23 22:31:34 [17246] [5] DEBUG: max_part_len: 41 > 2008-01-23 22:31:34 [17246] [5] DEBUG: split_chars not null > 2008-01-23 22:31:34 [17246] [5] DEBUG: working back from > max_part_len: 41 > 2008-01-23 22:31:34 [17246] [5] DEBUG: working back from > max_part_len: 40 > 2008-01-23 22:31:34 [17246] [5] DEBUG: working back from > max_part_len: 39 > 2008-01-23 22:31:34 [17246] [5] DEBUG: working back from > max_part_len: 38 > 2008-01-23 22:31:34 [17246] [5] DEBUG: working back from > max_part_len: 37 > 2008-01-23 22:31:34 [17246] [5] DEBUG: working back from > max_part_len: 36 > 2008-01-23 22:31:34 [17246] [5] DEBUG: working back from > max_part_len: 35 > 2008-01-23 22:31:34 [17246] [5] DEBUG: working back from > max_part_len: 34 > 2008-01-23 22:31:34 [17246] [5] DEBUG: working back from > max_part_len: 33 > 2008-01-23 22:31:34 [17246] [5] DEBUG: working back from > max_part_len: 32 > 2008-01-23 22:31:34 [17246] [5] DEBUG: working back from > max_part_len: 31 > 2008-01-23 22:31:34 [17246] [5] DEBUG: working back from > max_part_len: 30 > 2008-01-23 22:31:34 [17246] [5] DEBUG: working back from > max_part_len: 29 > 2008-01-23 22:31:34 [17246] [5] DEBUG: working back from > max_part_len: 28 > 2008-01-23 22:31:34 [17246] [5] DEBUG: working back from > max_part_len: 27 > 2008-01-23 22:31:34 [17246] [5] DEBUG: working back from > max_part_len: 26 > 2008-01-23 22:31:34 [17246] [5] DEBUG: working back from > max_part_len: 25 > 2008-01-23 22:31:34 [17246] [5] DEBUG: working back from > max_part_len: 24 > 2008-01-23 22:31:34 [17246] [5] DEBUG: working back from > max_part_len: 23 > 2008-01-23 22:31:34 [17246] [5] DEBUG: something something, at char > 23 -> <- > 2008-01-23 22:31:34 [17246] [5] DEBUG: len: 23 > 2008-01-23 22:31:34 [17246] [5] DEBUG: X2: Azure hotspot PIN^Mfrom > 2008-01-23 22:31:34 [17246] [5] DEBUG: Extract: Azure hotspot PIN^Mfrom > 2008-01-23 22:31:34 [17246] [5] DEBUG: message length 41, sending 1 > messages > > > So, that loop appears to be truncating the message text - but I'm not > sure why. Can someone explain that code? Is that behaviour incorrect? > (It looks like it to me) > > As for the carriage returns becoming '.' characters, I haven't > located where that's occurring, but I'm wondering whether that might > be due to the switch to UTF8 internally? > > Anyway, like I said, I'm a bit out of my depth here - does anyone > have any ideas? Is anyone else able to replicate this strange > behaviour with carriage returns in the meta_data branch? (or even the > latest CVS snapshot, since this doesn't seem to be TLV-specific...) > > Thanks, > > -- > Giulio Harding > Systems Administrator > > m.Net Corporation > Level 2, 8 Leigh Street > Adelaide SA 5000, Australia > > Tel: +61 8 8210 2041 > Fax: +61 8 8211 9620 > Mobile: 0432 876 733 > Yahoo: giulio.harding > MSN: [EMAIL PROTECTED] > > http://www.mnetcorporation.com -- Thanks, Alex
