Hi,
At Hillel's request, I have updated my patch for the internal character
set of smsbox/smpp and am posting it here in the hope that it will be
committed to CVS.
The patch:
* Changes the default 7-bit character set of smsbox from iso-8859-1 to
windows-1252, adding support for the euro sign. (Remember that the
latin1/gsm conversion functions already assume windows-1252.)
* Makes smsbox use charset_convert instead of octstr_recode, because the
latter converts the euro sign into an HTML entity.
* Changes the internal 7-bit character set of SMPP to windows-1252.
* Updates the documentation accordingly.
The primary effect of this patch should be support for the € sign in
both SMS transmission and reception (at least for gateways that use the
latin1/gsm conversion functions). Otherwise it should have no effect,
since windows-1252 is identical to iso-8859-1 except for the range
0x80-0x9F, which holds no printable characters in iso-8859-1.
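
To illustrate the 0x80-0x9F difference, here is a small standalone sketch
(not part of the patch, and not Kannel code). The function names and the
use of U+FFFD for the five bytes windows-1252 leaves undefined are my own
assumptions for the example:

```c
#include <assert.h>
#include <stdint.h>

/* Unicode code points for windows-1252 bytes 0x80..0x9F.
 * 0xFFFD marks the five bytes that windows-1252 leaves undefined
 * (an assumption of this sketch, not something the patch does). */
static const uint16_t cp1252_high[32] = {
    0x20AC, 0xFFFD, 0x201A, 0x0192, 0x201E, 0x2026, 0x2020, 0x2021,
    0x02C6, 0x2030, 0x0160, 0x2039, 0x0152, 0xFFFD, 0x017D, 0xFFFD,
    0xFFFD, 0x2018, 0x2019, 0x201C, 0x201D, 0x2022, 0x2013, 0x2014,
    0x02DC, 0x2122, 0x0161, 0x203A, 0x0153, 0xFFFD, 0x017E, 0x0178
};

/* iso-8859-1: every byte maps to the code point of the same value. */
uint32_t latin1_to_unicode(unsigned char b)
{
    return b;
}

/* windows-1252: same as iso-8859-1 outside 0x80..0x9F. */
uint32_t cp1252_to_unicode(unsigned char b)
{
    if (b >= 0x80 && b <= 0x9F)
        return cp1252_high[b - 0x80];
    return b;
}

/* Usage: the euro sign only exists in windows-1252, while the pound
 * sign sits at 0xA3 in both character sets. */
void charset_self_check(void)
{
    assert(cp1252_to_unicode(0x80) == 0x20AC);  /* euro sign */
    assert(latin1_to_unicode(0x80) == 0x0080);  /* a C1 control code */
    assert(cp1252_to_unicode(0xA3) == latin1_to_unicode(0xA3));
}
```

Byte 0x80 is the only one that matters for this patch; the rest of the
table is included just to show that nothing outside 0x80-0x9F changes.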
Just to clarify: unless the problem is in octstr_recode, this patch ONLY
adds support for the € (euro) sign. Other characters such as £ (pound)
already worked before. If a gateway didn't support £ before, it won't
now either. Also, this patch does NOT add support for Greek GSM
characters!
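
For reference, the sketch below shows why € is a special case on the GSM
side: in the GSM 03.38 default alphabet, £ is the single septet 0x01,
whereas € lives in the extension table and is sent as the escape 0x1B
followed by 0x65. The helper is hypothetical and covers only these two
characters; it is not the code in Kannel's latin1/gsm conversion
functions:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper for illustration only: encode one windows-1252
 * byte as GSM 03.38 septets. Returns the number of septets written to
 * out, or 0 for any byte this sketch does not cover. */
size_t sketch_cp1252_to_gsm(unsigned char b, unsigned char out[2])
{
    assert(out != NULL);
    switch (b) {
    case 0xA3:              /* pound sign: basic table, one septet */
        out[0] = 0x01;
        return 1;
    case 0x80:              /* euro sign: escape + extension table */
        out[0] = 0x1B;
        out[1] = 0x65;
        return 2;
    default:                /* everything else is out of scope here */
        return 0;
    }
}
```

Note that the euro sign therefore costs two septets of the message
length.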
--
Med venlig hilsen / Best regards
Peter Christensen
Developer
------------------
Cool Systems ApS
Tel: +45 2888 1600
@ : [EMAIL PROTECTED]
www: www.coolsystems.dk
diff -Nru gateway.old/doc/userguide/userguide.xml gateway.new/doc/userguide/userguide.xml
--- gateway.old/doc/userguide/userguide.xml 2006-05-09 16:32:31.000000000 +0200
+++ gateway.new/doc/userguide/userguide.xml 2006-05-10 15:07:17.000000000 +0200
@@ -5232,10 +5232,10 @@
<entry>boolean</entry>
<entry valign="bottom">
If enabled, Kannel will try to convert received messages with UCS-2 charset
- to ISO-8859-1 or to UTF-8, simplifying external servers jobs. If Kannel is
+ to WINDOWS-1252 or to UTF-8, simplifying external servers jobs. If Kannel is
able to recode message, it will also change <literal>coding</literal> to
<literal>7 bits</literal> and <literal>charset</literal> to
- <literal>iso-8859-1</literal> or to <literal>utf-8</literal>.
+ <literal>windows-1252</literal> or to <literal>utf-8</literal>.
</entry></row>
<row><entry><literal>http-request-retry</literal></entry>
@@ -5928,7 +5928,7 @@
message charset: for a "normal" message, it will
be "GSM" (coding=0), "binary" (coding=1) or
"UTF-16BE" (coding=2). If the message was successfully
-recoded from Unicode, it will be "ISO-8859-1"
+recoded from Unicode, it will be "WINDOWS-1252"
</entry></row>
<row><entry><literal>%u</literal></entry><entry>
@@ -6969,7 +6969,7 @@
<entry><literal>string</literal></entry>
<entry valign="bottom">
Charset of text message. Used to convert to a format suitable for
- 7 bits or to UCS-2. Defaults to ISO-8859-1 if coding is 7bits and
+ 7 bits or to UCS-2. Defaults to WINDOWS-1252 if coding is 7bits and
UTF-16BE if coding is UCS-2.
</entry></row>
diff -Nru gateway.old/gw/smsbox.c gateway.new/gw/smsbox.c
--- gateway.old/gw/smsbox.c 2006-05-09 19:42:51.000000000 +0200
+++ gateway.new/gw/smsbox.c 2006-05-10 15:09:31.000000000 +0200
@@ -3671,9 +3671,9 @@
if (coding == DC_7BIT) {
/*
- * For 7 bit, convert to ISO-8859-1
+ * For 7 bit, convert to WINDOWS-1252
*/
- if (octstr_recode (octstr_imm ("ISO-8859-1"), charset, body) < 0) {
+ if (charset_convert (body, octstr_get_cstr(charset), "WINDOWS-1252") < 0) {
resultcode = -1;
}
} else if (coding == DC_UCS2) {
diff -Nru gateway.old/gw/smsc/smsc_smpp.c gateway.new/gw/smsc/smsc_smpp.c
--- gateway.old/gw/smsc/smsc_smpp.c 2006-05-09 16:32:31.000000000 +0200
+++ gateway.new/gw/smsc/smsc_smpp.c 2006-05-10 15:07:58.000000000 +0200
@@ -506,9 +506,9 @@
* unless it was specified binary, ie. UDH indicator was detected
*/
if (smpp->alt_charset && msg->sms.coding != DC_8BIT) {
- if (charset_convert(msg->sms.msgdata, octstr_get_cstr(smpp->alt_charset), "ISO-8859-1") != 0)
+ if (charset_convert(msg->sms.msgdata, octstr_get_cstr(smpp->alt_charset), SMPP_DEFAULT_CHARSET) != 0)
error(0, "Failed to convert msgdata from charset <%s> to <%s>, will leave as is.",
- octstr_get_cstr(smpp->alt_charset), "ISO-8859-1");
+ octstr_get_cstr(smpp->alt_charset), SMPP_DEFAULT_CHARSET);
msg->sms.coding = DC_7BIT;
} else { /* assume GSM 03.38 7-bit alphabet */
charset_gsm_to_latin1(msg->sms.msgdata);
@@ -656,9 +656,9 @@
* unless it was specified binary, ie. UDH indicator was detected
*/
if (smpp->alt_charset && msg->sms.coding != DC_8BIT) {
- if (charset_convert(msg->sms.msgdata, octstr_get_cstr(smpp->alt_charset), "ISO-8859-1") != 0)
+ if (charset_convert(msg->sms.msgdata, octstr_get_cstr(smpp->alt_charset), SMPP_DEFAULT_CHARSET) != 0)
error(0, "Failed to convert msgdata from charset <%s> to <%s>, will leave as is.",
- octstr_get_cstr(smpp->alt_charset), "ISO-8859-1");
+ octstr_get_cstr(smpp->alt_charset), SMPP_DEFAULT_CHARSET);
msg->sms.coding = DC_7BIT;
} else { /* assume GSM 03.38 7-bit alphabet */
charset_gsm_to_latin1(msg->sms.msgdata);
@@ -879,10 +879,10 @@
/*
* convert to the given alternative charset
*/
- if (charset_convert(pdu->u.submit_sm.short_message, "ISO-8859-1",
+ if (charset_convert(pdu->u.submit_sm.short_message, SMPP_DEFAULT_CHARSET,
octstr_get_cstr(smpp->alt_charset)) != 0)
error(0, "Failed to convert msgdata from charset <%s> to <%s>, will send as is.",
- "ISO-8859-1", octstr_get_cstr(smpp->alt_charset));
+ SMPP_DEFAULT_CHARSET, octstr_get_cstr(smpp->alt_charset));
}
}