Hi,
During my quite extensive use of the SMPP client in kannel, I have
stumbled upon a few bugs related to the character handling in smsbox and
the SMPP sources.
1. When sending an SMS consisting of a '€' (euro) sign through SMPP, the
character is replaced with &#8364;, which is the XML numeric character
reference for its Unicode code point (U+20AC). Investigating the bug, I
realized that characters are converted into ISO-8859-1 before they are
converted into the GSM 03.38 charset. Since ISO-8859-1 has no euro sign,
the € can never be transmitted. An interesting note, however, is that the
latin1 to GSM conversion function uses CP1252 and thus does support
the € sign.
2. As seen above, characters that cannot be converted are translated into
an XML entity instead of a similar character, a "?", or simply being
excluded from the string. This problem seems to be located in the smsbox code.
3. When sending an SMS with an alphanumeric originator, kannel does no
translation into any alternative character set, although the common
practice is to translate it into GSM 03.38 (1 char/octet).
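To illustrate points 1 and 2, here is a minimal sketch, not Kannel's actual code. The helper name cp1252_to_gsm_sketch is hypothetical and it only covers a handful of characters. It shows that the CP1252 byte 0x80 (the euro sign, which ISO-8859-1 does not have) maps to the GSM 03.38 escape sequence 0x1B 0x65, and that an unmapped byte can fall back to "?" rather than an XML entity.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical helper (NOT Kannel's charset_latin1_to_gsm): converts one
 * CP1252 byte into GSM 03.38 septets. Writes 1 or 2 septets into out and
 * returns how many were written. Only a small subset is handled here.
 */
static size_t cp1252_to_gsm_sketch(unsigned char c, unsigned char out[2])
{
    assert(out != NULL);

    if (c == 0x80) {        /* CP1252 euro sign; absent from ISO-8859-1 */
        out[0] = 0x1B;      /* GSM 03.38 escape to the extension table */
        out[1] = 0x65;      /* euro sign in the extension table */
        return 2;
    }
    if (c == '@') {         /* '@' is 0x00 in the GSM default alphabet */
        out[0] = 0x00;
        return 1;
    }
    if (c == '$') {         /* '$' is 0x02 in the GSM default alphabet */
        out[0] = 0x02;
        return 1;
    }
    if ((c >= 'A' && c <= 'Z') || (c >= 'a' && c <= 'z') ||
        (c >= '0' && c <= '9') || c == ' ') {
        out[0] = c;         /* these share their ASCII code points */
        return 1;
    }

    out[0] = 0x3F;          /* fallback: '?' instead of an XML entity */
    return 1;
}
```

Calling cp1252_to_gsm_sketch(0x80, out) writes the two septets 0x1B and 0x65. If the input had first been forced through ISO-8859-1, there would be no euro byte left to map.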
Since the company I work at relies on this functionality in kannel, it
is urgent that these problems are solved. Since I haven't got the luxury
of digging deep into the kannel source, my attached patch (which solves
problems 1 and 3) is probably somewhat crude.
I've made a macro defining the internal SMPP character set (just in
case), which is then used throughout the smsc_smpp.c file wherever
characters are converted. In addition, source_addr is passed through the
charset_latin1_to_gsm function if it is detected as being alphanumeric.
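As a rough, standalone illustration of the "detected as being alphanumeric" check, the sketch below mirrors the idea of the octstr_check_range(..., gw_isdigit) tests in the patch using plain C strings. The function names and the MAX_ALPHANUM_SENDER macro are hypothetical and do not exist in Kannel. The limit of 11 is the one quoted in the existing error message.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical constant: maximum alphanumeric sender length. */
#define MAX_ALPHANUM_SENDER 11

/*
 * Hypothetical helper: returns 1 if addr contains any non-digit
 * character (ignoring a single leading '+'), 0 if it is purely numeric.
 */
static int sender_is_alphanumeric(const char *addr)
{
    size_t i = 0;

    assert(addr != NULL);
    if (addr[0] == '+')     /* international prefix is not "alphanumeric" */
        i = 1;
    for (; addr[i] != '\0'; i++) {
        if (!isdigit((unsigned char)addr[i]))
            return 1;
    }
    return 0;
}

/* Hypothetical helper: returns 1 if the sender fits the 11-char limit. */
static int alphanumeric_sender_fits(const char *addr)
{
    assert(addr != NULL);
    return strlen(addr) <= MAX_ALPHANUM_SENDER;
}
```

In the patch itself, the charset_latin1_to_gsm call is placed in the branches where such a check has marked the address as GSM_ADDR_TON_ALPHANUMERIC.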
I will probably look at the XML entity issue in the near future, but for
now, here is the patch fixing the SMPP charset problems (against the latest CVS).
Best regards,
Peter Christensen
--- gateway/gw/smsc/smsc_smpp.c 2005-08-12 18:12:58.000000000 +0200
+++ gateway.new/gw/smsc/smsc_smpp.c 2005-08-18 18:01:57.055871145 +0200
@@ -78,6 +78,8 @@
#include "sms.h"
#include "dlr.h"
+#define SMPP_CHARSET "CP1252"
+
/*
* Select these based on whether you want to dump SMPP PDUs as they are
* sent and received or not. Not dumping should be the default in at least
@@ -389,6 +391,7 @@
break;
case GSM_ADDR_TON_ALPHANUMERIC:
+ charset_latin1_to_gsm(pdu->u.deliver_sm.source_addr);
if (octstr_len(pdu->u.deliver_sm.source_addr) > 11) {
/* alphanum sender, max. allowed length is 11 (according to GSM specs) */
error(0, "SMPP[%s]: Mallformed source_addr `%s', alphanum length greater 11 chars. "
@@ -493,9 +496,9 @@
* unless it was specified binary, ie. UDH indicator was detected
*/
if (smpp->alt_charset && msg->sms.coding != DC_8BIT) {
- if (charset_convert(msg->sms.msgdata, octstr_get_cstr(smpp->alt_charset), "ISO-8859-1") != 0)
+ if (charset_convert(msg->sms.msgdata, octstr_get_cstr(smpp->alt_charset), SMPP_CHARSET) != 0)
error(0, "Failed to convert msgdata from charset <%s> to <%s>, will leave as is.",
- octstr_get_cstr(smpp->alt_charset), "ISO-8859-1");
+ octstr_get_cstr(smpp->alt_charset), SMPP_CHARSET);
msg->sms.coding = DC_7BIT;
} else { /* assume GSM 03.38 7-bit alphabet */
charset_gsm_to_latin1(msg->sms.msgdata);
@@ -602,6 +605,7 @@
if (!octstr_check_range(pdu->u.submit_sm.source_addr, 1, 256, gw_isdigit)) {
pdu->u.submit_sm.source_addr_ton = GSM_ADDR_TON_ALPHANUMERIC; /* alphanum */
pdu->u.submit_sm.source_addr_npi = GSM_ADDR_NPI_UNKNOWN; /* short code */
+ charset_latin1_to_gsm(pdu->u.submit_sm.source_addr);
} else {
/* numeric sender address with + in front -> international (remove the +) */
octstr_delete(pdu->u.submit_sm.source_addr, 0, 1);
@@ -611,6 +615,7 @@
if (!octstr_check_range(pdu->u.submit_sm.source_addr,0, 256, gw_isdigit)) {
pdu->u.submit_sm.source_addr_ton = GSM_ADDR_TON_ALPHANUMERIC;
pdu->u.submit_sm.source_addr_npi = GSM_ADDR_NPI_UNKNOWN;
+ charset_latin1_to_gsm(pdu->u.submit_sm.source_addr);
}
}
}
@@ -700,10 +705,10 @@
/*
* convert to the given alternative charset
*/
- if (charset_convert(pdu->u.submit_sm.short_message, "ISO-8859-1",
+ if (charset_convert(pdu->u.submit_sm.short_message, SMPP_CHARSET,
octstr_get_cstr(smpp->alt_charset)) != 0)
error(0, "Failed to convert msgdata from charset <%s> to <%s>, will send as is.",
- "ISO-8859-1", octstr_get_cstr(smpp->alt_charset));
+ SMPP_CHARSET, octstr_get_cstr(smpp->alt_charset));
}
}