Hi,
I am using Kannel 1.4.1 on Linux RHEL4. I'm desperately
trying to find out exactly how kannel URL-encodes a
message after it receives it from an SMSC.
I have a receiver service configured like the following. I've
included all those parameters in the 'exec' line for debugging purposes.
group = sms-service
keyword = default
exec = /tmp/somescript.sh "p=%p" "P=%P" "i=%i" "a=%a" "b=%b" "t=%t"
"T=%T" "q=%q" "Q=%Q" "I=%I" "d=%d" "A=%A" "F=%F" "n=%n" "c=%c" "m=%m" "M=%M"
"C=%C" "u=%u" "B=%B" "o=%o" "O=%O" "f=%f" "k=%k"
"s=%s" "S=%S" "r=%r"
catch-all = true
When kannel receives an SMS message and calls /tmp/somescript.sh,
does anyone know exactly what determines how kannel URL-encodes
the message text? The kannel user guide says very little on this topic.
I would have expected that kannel URL-encodes the message according
to the ASCII character set (e.g. @ gets encoded as hex 40, $ gets encoded as
hex 24, etc.)
When kannel receives a message from kannel's fakesmsc, it does
URL-encode the message correctly this way.
But when I use a real phone handset to send to a real SMSC,
kannel URL-encodes the received message in a very weird way.
This is a big problem because we're trying to send the
message via standard HTTP POST to a web server to deal with it,
and it's not URL-encoded in the standard ASCII way.
I'm trying to send the following message to the above receiver service:
mykeyword starttest @ $ _ % < > ( ) endtest
When I use kannel's fakesmsc command to dispatch this message
/usr/bin/fakesmsc -H localhost -v 0 -i 1 -m 1 '0410000000 4444 text
mykeyword starttest @ $ _ % < > ( ) endtest'
the following appears in smsbox.log:
2009-04-08 15:03:53 [10216] [4] INFO: Starting to service <mykeyword
starttest @ $
_ % < > ( ) endtest> from <0410000000> to <4444>
2009-04-08 15:03:53 [10216] [4] DEBUG: executing sms-service
'/tmp/somescript.sh "p=0410000000" "P=4444"
"i=FAKESMSC" "a=mykeyword+starttest+%40+%24+_+%25+%3C+%3E+(+)+endtest"
"b=mykeyword+starttest+%40+%24+_+%25+%3C+%3E+(+)+endtest"
"t=2009-04-08+05:03:53"
"T=1239167033" "q=0410000000" "Q=4444"
"I=ef9a575b-fe29-4597-8079-1264085a3317" "d=-1" "A=" "F=%F" "n=mykeyword"
"c=-1"
"m=-1" "M=-1" "C=ISO-8859-1" "u=" "B=" "o=" "O=%O" "f=" "k=mykeyword"
"s=starttest" "S=@" "r=%24+_+%25+%3C+%3E+(+)+endtest"'
From the above output, note how kannel URL-encodes the message:
"a=mykeyword+starttest+%40+%24+_+%25+%3C+%3E+(+)+endtest"
Kannel appears to correctly URL-encode it according to the ASCII
character set, i.e. @ gets encoded as hex 40, $ gets encoded as hex 24, etc.
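For reference, here is a minimal Python sketch of the "standard ASCII" URL-encoding I was expecting. It uses Python's urllib.parse.quote_plus, not anything from kannel itself, and safe="()" is my own assumption, chosen only so the output mirrors the log above, where parentheses are left unescaped.

```python
from urllib.parse import quote_plus

# The test message sent through fakesmsc.
message = "mykeyword starttest @ $ _ % < > ( ) endtest"

# safe="()" is an assumption: it keeps parentheses unescaped so the
# result can be compared directly with the a= value in smsbox.log.
expected = quote_plus(message, safe="()")
print(expected)
```

The printed string can be compared by eye with the a= value that smsbox logged for the fakesmsc message.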
But when I try sending the same message via my phone handset so that it
goes through our real SMSC, kannel URL-encodes it in a completely
different way:
"a=mykeyword+starttest+%A1+%A4+%A7+%25+%3C+%3E+(+)+endtest"
So @ gets encoded as %A1, $ gets encoded as %A4, etc.
What character set is this?? I can't figure it out.
The relevant entries in smsbox.log for this part are:
2009-04-08 15:23:59 [11757] [4] INFO: Starting to service <mykeyword
starttest
¡ ¤ § % < > ( ) endtest> from <+61410000000> to <4444>
2009-04-08 15:23:59 [11757] [4] DEBUG: executing sms-service
'/tmp/somescript.sh "p=%2B61410000000"
"P=4444" "i=SMPP_MYSMPP"
"a=mykeyword+starttest+%A1+%A4+%A7+%25+%3C+%3E+(+)+endtest"
"b=mykeyword+starttest+%A1+%A4+%A7+%25+%3C+%3E+(+)+endtest"
"t=2009-04-08+05:23:59" "T=1239168239" "q=%2B61410000000" "Q=4444"
"I=01bbe9fb-a08b-43c8-a5c2-31bb113769cf" "d=-1" "A=" "F=%F" "n=mykeyword"
"c=0" "m=-1" "M=-1" "C=ISO-8859-1" "u=" "B=" "o=" "O=%O" "f=" "k=mykeyword"
"s=starttest" "S=¡" "r=%A4+%A7+%25+%3C+%3E+(+)+endtest"'
I've included kannel's smsc log below, which shows exactly what came from
the SMSC. It shows that the SMSC sent the message to kannel using
the ASCII character set, i.e. @ came encoded as hex 40,
$ came encoded as hex 24, etc.
So what seems to be happening is:
- The SMSC sends kannel a message with the ASCII character set.
- Kannel then URL-encodes that message in a strange non-ASCII way when
doing 'exec' on my /tmp/somescript.sh script.
Is this some inherent problem with kannel? I find it hard to believe,
because every man and his dog would surely have found out by now that
kannel doesn't URL-encode simple characters properly, and so the bug
would have been fixed long ago.
I don't know much about SMS character encoding/character sets, but
I'm guessing the problem is related to that.
If I compare the two sets of smsbox.log entries above, the messages
from both the fakesmsc and real SMSC came with
%C (message charset) = ISO-8859-1
but the %c (message coding) was different between the two:
for the fakesmsc %c was -1, and for the real SMSC %c was 0.
According to the kannel user guide %c=0 means "7 bits".
Previously the SMSC was sending messages to kannel in
the GSM 03.38 character set (I think??). Recently to fix some problems,
the SMSC guys reconfigured the SMSC so that it now sends us messages
in the ASCII character set. This fixed some other problems we had,
but the above problems still remain.
Is there something that is still confusing kannel into thinking
that the SMSC is still sending us messages in GSM 03.38 format
for example? E.g. in the log below, the message is coming
with data_coding = 0 . Is that perhaps telling kannel
that it's in non-ASCII format?? If so, that might explain
why kannel then URL-encodes the message with strange values.
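One way to sanity-check the GSM 03.38 guess is the minimal Python sketch below. It assumes the three code points shown, which I took from the published GSM 03.38 default-alphabet table and not from kannel's source. The function name reinterpret_as_gsm is made up for illustration. The sketch treats the ASCII bytes for @ $ _ as if they were GSM 03.38 values, then URL-encodes the result as ISO-8859-1 (the %C value in the log).

```python
from urllib.parse import quote_plus

# Partial GSM 03.38 default-alphabet mapping (assumption: taken from the
# published table). Only the three values relevant to this message are
# listed: 0x40 -> inverted exclamation mark, 0x24 -> currency sign,
# 0x5F -> section sign.
GSM_TO_UNICODE = {0x40: "\u00a1", 0x24: "\u00a4", 0x5F: "\u00a7"}

def reinterpret_as_gsm(raw: bytes) -> str:
    """Read bytes as GSM 03.38 values, using the partial table above.

    Bytes not in the table are passed through unchanged, which is only
    good enough for this particular test string, not for GSM in general.
    """
    return "".join(GSM_TO_UNICODE.get(b, chr(b)) for b in raw)

raw = b"@ $ _"                      # ASCII bytes 40 20 24 20 5f
text = reinterpret_as_gsm(raw)
encoded = quote_plus(text, encoding="iso-8859-1")
print(encoded)
```

If the printed value lines up with the %A1 %A4 %A7 sequence in smsbox.log, that would be consistent with kannel treating data_coding = 0 as GSM 03.38 and converting it to ISO-8859-1 before URL-encoding, though it does not prove that is what kannel does internally.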
Or is there any way to force kannel to URL-encode received
messages in normal ASCII format? Or should some other changes be
done on the SMSC side?
Should I make use of kannel's alt-charset or alt-addr-charset
configuration settings?
Not sure if I'm on the right track or grasping at straws here.
kannel's smsc_mysmpp.log:
2009-04-08 15:23:59 [11720] [10] DEBUG: SMPP[SMPP_MYSMPP]: Got PDU:
2009-04-08 15:23:59 [11720] [10] DEBUG: SMPP PDU 0x90f05e8 dump:
2009-04-08 15:23:59 [11720] [10] DEBUG: type_name: deliver_sm
2009-04-08 15:23:59 [11720] [10] DEBUG: command_id: 5 = 0x00000005
2009-04-08 15:23:59 [11720] [10] DEBUG: command_status: 0 = 0x00000000
2009-04-08 15:23:59 [11720] [10] DEBUG: sequence_number: 1 = 0x00000001
2009-04-08 15:23:59 [11720] [10] DEBUG: service_type: NULL
2009-04-08 15:23:59 [11720] [10] DEBUG: source_addr_ton: 1 = 0x00000001
2009-04-08 15:23:59 [11720] [10] DEBUG: source_addr_npi: 1 = 0x00000001
2009-04-08 15:23:59 [11720] [10] DEBUG: source_addr: "61410000000"
2009-04-08 15:23:59 [11720] [10] DEBUG: dest_addr_ton: 0 = 0x00000000
2009-04-08 15:23:59 [11720] [10] DEBUG: dest_addr_npi: 4 = 0x00000004
2009-04-08 15:23:59 [11720] [10] DEBUG: destination_addr: "4444"
2009-04-08 15:23:59 [11720] [10] DEBUG: esm_class: 0 = 0x00000000
2009-04-08 15:23:59 [11720] [10] DEBUG: protocol_id: 0 = 0x00000000
2009-04-08 15:23:59 [11720] [10] DEBUG: priority_flag: 0 = 0x00000000
2009-04-08 15:23:59 [11720] [10] DEBUG: schedule_delivery_time: NULL
2009-04-08 15:23:59 [11720] [10] DEBUG: validity_period: NULL
2009-04-08 15:23:59 [11720] [10] DEBUG: registered_delivery: 0 = 0x00000000
2009-04-08 15:23:59 [11720] [10] DEBUG: replace_if_present_flag: 0 = 0x00000000
2009-04-08 15:23:59 [11720] [10] DEBUG: data_coding: 0 = 0x00000000
2009-04-08 15:23:59 [11720] [10] DEBUG: sm_default_msg_id: 0 = 0x00000000
2009-04-08 15:23:59 [11720] [10] DEBUG: sm_length: 43 = 0x0000002b
2009-04-08 15:23:59 [11720] [10] DEBUG: short_message:
2009-04-08 15:23:59 [11720] [10] DEBUG: Octet string at 0x90f06d8:
2009-04-08 15:23:59 [11720] [10] DEBUG: len: 43
2009-04-08 15:23:59 [11720] [10] DEBUG: size: 44
2009-04-08 15:23:59 [11720] [10] DEBUG: immutable: 0
2009-04-08 15:23:59 [11720] [10] DEBUG: data: 6d 79 6b 65 79 77 6f 72 64 20 73 74 61 72 74 74 mykeyword startt
2009-04-08 15:23:59 [11720] [10] DEBUG: data: 65 73 74 20 40 20 24 20 5f 20 25 20 3c 20 3e 20 est @ $ _ % < >
2009-04-08 15:23:59 [11720] [10] DEBUG: data: 28 20 29 20 65 6e 64 74 65 73 74 ( ) endtest
2009-04-08 15:23:59 [11720] [10] DEBUG: Octet string dump ends.
2009-04-08 15:23:59 [11720] [10] DEBUG: SMPP PDU dump ends.
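To double-check the octets in the dump above, here is a small Python sketch that decodes them as plain ASCII. The hex string is copied by hand from the three data: lines, and the variable names are my own.

```python
# Octets copied from the three "data:" lines of the short_message dump.
hex_octets = (
    "6d 79 6b 65 79 77 6f 72 64 20 73 74 61 72 74 74 "
    "65 73 74 20 40 20 24 20 5f 20 25 20 3c 20 3e 20 "
    "28 20 29 20 65 6e 64 74 65 73 74"
)

payload = bytes.fromhex(hex_octets)   # whitespace between octets is ignored
decoded = payload.decode("ascii")     # raises UnicodeDecodeError if any byte > 0x7f

print(len(payload))
print(decoded)
```

The length can be compared with sm_length in the PDU, and the decoded text with the message typed on the handset.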