Thanks to everyone for their replies!

Ok, so the consensus seems to be that this is a bug in kannel,
and that it still exists in the latest version.

We connect to a number of different SMSC's, so for each
one I have now split them up as you have suggested -
I have made a separate section for the transmitter port,
and a separate section for the receiver port (see below).

So this workaround now involves a lot of duplicated data,
but if it works and doesn't have any other side effects,
then all is good. So far at least, I haven't seen any
problems with this configuration - I expected that
kannel might barf if it has multiple definitions
for "smsc-id = SMSC7" and multiple things writing
to the same log file, but it seems ok thus far.

One nagging thought in my mind - this seems to be
a rather fundamental bug that I would expect a lot
of people would have run across. Not sure why they
haven't - is it because SMSC's don't go down very often?

Should this bug be documented prominently
in the kannel user's guide so that people are aware of it?
I couldn't see it documented anywhere, though maybe I
missed it somehow.

Here for example is how I have split up the configuration
for SMSC7 into separate port and receive-port sections.
I hope I've done it correctly.

group = smsc
smsc = smpp
smsc-id = SMSC7
host = x.x.x.x
port = 1000
smsc-username = ....
smsc-password = ....
system-type = smpp
address-range = ""
keepalive = 0
log-file = "/log/kannel/smsc7.log"
log-level = 0
msg-id-type = 0x01
throughput = 25
allowed-smsc-id = SMSC7

group = smsc
smsc = smpp
smsc-id = SMSC7
host = x.x.x.x
receive-port = 1000
smsc-username = ....
smsc-password = ....
system-type = smpp
address-range = ""
keepalive = 0
log-file = "/log/kannel/smsc7.log"
log-level = 0
msg-id-type = 0x01
throughput = 25
allowed-smsc-id = SMSC7
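
Since we have a number of SMSC's to convert, here is a rough sketch of how one could check a config for groups that still combine both ports. It is a hypothetical helper, not part of kannel, and it assumes groups are separated by blank lines and use plain "key = value" lines (no includes or other special syntax). SMSC8 and its port 2000 are made up for the example.

```python
# Hypothetical helper, not part of Kannel: flag smsc groups that still
# define both "port" and "receive-port" in the same group.

def parse_groups(conf_text):
    """Split config text into a list of dicts, one per blank-line-separated group."""
    groups, current = [], {}
    for raw in conf_text.splitlines():
        line = raw.strip()
        if not line:
            # A blank line closes the current group, if any.
            if current:
                groups.append(current)
                current = {}
            continue
        if line.startswith("#") or "=" not in line:
            continue
        key, value = line.split("=", 1)
        current[key.strip()] = value.strip().strip('"')
    if current:
        groups.append(current)
    return groups


def combined_tx_rx(groups):
    """Return the smsc-id of every smsc group that has both port and receive-port."""
    return [
        g.get("smsc-id", "?")
        for g in groups
        if g.get("group") == "smsc" and "port" in g and "receive-port" in g
    ]


sample = """
group = smsc
smsc = smpp
smsc-id = SMSC7
host = x.x.x.x
port = 1000

group = smsc
smsc = smpp
smsc-id = SMSC7
host = x.x.x.x
receive-port = 1000

group = smsc
smsc = smpp
smsc-id = SMSC8
host = x.x.x.x
port = 2000
receive-port = 2000
"""

print(combined_tx_rx(parse_groups(sample)))  # ['SMSC8']
```

The two split SMSC7 groups pass the check; only the made-up SMSC8 group, which still has both ports, is reported.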

Thanks again to everyone! It's great to
get a bit of certainty back.

_______________
2009/5/20 Nikos Balkanas <[email protected]>

>  Hi,
>
> It is a bug even now, AFAIK. Unfortunately not easy to correct, and only
> for outdated SMScs. Transceiver mode means that you can use a single
> connection for sending and receiving. Depends on your SMSC's SMPP version.
> It is not supported earlier than SMPP 3.4.
>
> The bug is due to the fact that one status is used internally to describe
> the SMSc. It can be either "inactive" or "active". Assume that the send
> link goes down and cannot send any SMS. The SMSc is momentarily marked
> "inactive". Normally this should be it, and kannel would try to reconnect.
> However, the receive link receives an enquire-link packet, which resets the
> SMSc status to "active". Therefore kannel doesn't try to reconnect.
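
To check my understanding of the explanation above, here is a toy model of it. This is not kannel's source code; the class and method names are invented for illustration, and it only mimics the described behaviour of one status field shared by the send and receive links.

```python
# Toy model only: NOT Kannel's code. One status field is shared by the
# send link and the receive link of a single smsc group.

class SharedStatusSmsc:
    def __init__(self):
        self.status = "active"      # single status shared by both links
        self.send_link_up = True

    def send_link_dropped(self):
        self.send_link_up = False
        self.status = "inactive"    # this is what should trigger a reconnect

    def receive_link_got_enquire_link(self):
        # The receive link overwrites the shared status.
        self.status = "active"

    def needs_reconnect(self):
        # A reconnect is only attempted while the shared status is "inactive".
        return self.status == "inactive"


smsc = SharedStatusSmsc()
smsc.send_link_dropped()
print(smsc.needs_reconnect())        # True: a reconnect would be attempted

smsc.receive_link_got_enquire_link()
print(smsc.needs_reconnect())        # False: no reconnect is attempted...
print(smsc.send_link_up)             # False: ...although the send link is still down
```

If that model is right, splitting the group into two gives each link its own status, so the receive link can no longer mask a dead send link.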
>
> BR,
> Nikos
>
> ----- Original Message -----
> *From:* Donald Jackson <[email protected]>
> *To:* shaded 4 <[email protected]>
> *Cc:* [email protected]
> *Sent:* Friday, May 15, 2009 9:15 AM
> *Subject:* Re: Messages intermittently getting stuck in kannel's queue
>
> This is a bug in 1.4.1. I can't remember the exact details, but it's to do
> with one of the 2 binds dropping.
>
> The solution to this problem is, if you have an smsc group with a tx/rx
> pair like:
>
> # TX/RX pair
> group=smsc
> smsc=smpp
> host=smpp.host.com
> port=2345
> receive-port=2345
> smsc-id=mysmsc
> ...
>
> You can split it into 2 SMSC's, like:
>
> # TX
> group=smsc
> smsc=smpp
> host=smpp.host.com
> port=2345
> smsc-id=mysmsc
> ...
>
> # RX
> group=smsc
> smsc=smpp
> host=smpp.host.com
> receive-port=2345
> smsc-id=mysmsc
> ...
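
The split described above is mechanical, so it could be scripted. A minimal sketch, assuming each group is available as a list of "key=value" lines; split_tx_rx is a name I made up, not a kannel tool.

```python
def split_tx_rx(group_lines):
    """Return (tx_lines, rx_lines) for one smsc group given as 'key=value' lines.

    The TX copy drops receive-port, the RX copy drops port.
    Every other line is duplicated unchanged in both copies.
    """
    def key_of(line):
        return line.split("=", 1)[0].strip()

    tx = [line for line in group_lines if key_of(line) != "receive-port"]
    rx = [line for line in group_lines if key_of(line) != "port"]
    return tx, rx


combined = [
    "group=smsc",
    "smsc=smpp",
    "host=smpp.host.com",
    "port=2345",
    "receive-port=2345",
    "smsc-id=mysmsc",
]

tx, rx = split_tx_rx(combined)
print("# TX")
print("\n".join(tx))
print()
print("# RX")
print("\n".join(rx))
```

Both copies keep the same smsc-id, as in the example above.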
>
> Hope this helps,
>
> Thanks,
> Donald
> http://www.ddj.co.za
>
> 2009/5/15 shaded 4 <[email protected]>
>
>> Thanks very much for your reply, Abdulmnem!
>>
>> I don't think that's the problem in this case.
>> From what I can tell from the logs, kannel successfully
>> binds to both the transmitter and the receiver streams,
>> and it shows apparently successful enquire_link/enquire_link_resp
>> messages going in both directions, every 30 seconds for each stream.
>>
>> (E.g. enquire_links for the transmitter at 4:30:03, 4:30:33, 4:31:03, etc.
>>    and for the receiver at 4:30:11, 4:30:41, 4:31:11, etc.)
>>
>> Any other suggestions from anyone?
>>
>> Although I think the problem lies elsewhere, can
>> anyone else verify Abdulmnem's suggestion that this might be
>> some bug in v1.4.1 ?
>> Has it been fixed in the more recent versions?
>> From http://kannel.org/news.shtml, I can't tell if it lists
>> this problem. I know in our environment it's a lot of
>> work and bureaucracy to upgrade versions of software,
>> so I don't want to do it unnecessarily.
>>
>> Pardon my ignorance, but does 'transceiver mode' mean that
>> it would connect to both the transmitter/receiver streams
>> via a single connection?
>>
>> We have always been using separate transmitter and receiver
>> connections, as the SMSC support team has told us in the past to do it
>> this way and to not use the transceiver mode (I'm not sure why -
>> I have no experience with SMSC's directly).
>>
>> But we don't need the receiver on this particular SMSC, and
>> I have now disabled it. I'm not really confident that this
>> will fix the problem, but we'll see. In any case, it certainly
>> won't hurt.
>>
>>
>>
>> Monim Benaiad wrote:
>> --------------------------------------------------
>>
>> > Hi,
>> >
>> > Is there any known problem in kannel such that it sometimes
>> > refuses to send messages? (I'm talking about just transmitter
>> > functionality here, not receiver.)
>> > It seems to happen sometimes after connectivity to a SMSC is
>> > disrupted and reconnected.
>> >
>> > The only way that kannel eventually sends the messages
>> > is if I restart kannel, or after it again loses and regains
>> > connectivity the next time.
>> > ...etc.
>> >
>>
>> AFAIK, the "transceiver-mode" variable in the smsc group will solve this
>> problem.
>> I think it's a bug in that version, because Kannel reconnects the receiver
>> only.
>> Also remove the "receive-port".
>>
>> TIA
>>
>> Abdulmnem Benaiad
>> Almontaha
>> almontaha.com
>>
>> shaded4 wrote:
>> --------------------------------------------------
>>
>> Hi,
>>
>> Is there any known problem in kannel such that it sometimes
>> refuses to send messages? (I'm talking about just transmitter
>> functionality here, not receiver.)
>> It seems to happen sometimes after connectivity to a SMSC is
>> disrupted and reconnected.
>>
>> The only way that kannel eventually sends the messages
>> is if I restart kannel, or after it again loses and regains
>> connectivity the next time.
>>
>> We have been using kannel 1.4.1 for years. According to
>> http://kannel.org/news.shtml, it's a stable version which
>> has been out for some years before the very recent newer versions.
>>
>> I haven't seen this problem in production before (maybe because
>> connectivity to the SMSC's is rarely disrupted there), but
>> we have noticed it in our test environment where disruptions are more
>> common.
>>
>> What seems to happen:
>> - Kannel is working fine, sending messages to the correct SMSC's
>>     exactly as expected. So apparently we have it configured correctly.
>> - Connectivity to the SMSC's is temporarily lost for whatever reason.
>>     E.g. network connectivity slowness or outage, etc.
>> - When network connectivity is restored, kannel automatically reconnects
>>     (re-binds, if that's the correct terminology).
>> No problem so far.
>>
>> That's all ok, but sometimes any future messages targeted to some
>> particular SMSC (e.g. call it SMPP_SMSC1) now get stuck in kannel and
>> refuse to budge. Messages to other SMSC's still go through ok.
>>
>> This is what I observe when I try to subsequently send 17 messages to kannel
>> which in normal circumstances always correctly route to SMPP_SMSC1.
>> I have all of kannel's logs enabled to maximum verbosity, i.e. log-level=0.
>> - I see the messages go into kannel's store-file and not budging.
>> - The messages appear in smsbox's access-log and log-file, but they
>>     do NOT appear (yet) in the 'core' group's access-log .
>> - Kannel and the SMSC are constantly sending enquire_link and enquire_link_resp
>>     messages every 30 seconds, which show that the connection is
>>     supposedly constantly active during this time.
>> - Running kannel's status snapshot (i.e. http://127.0.0.1:13000/status)
>>     shows the general SMS section reporting that there are 17 messages
>>     in the queued store, but the SMPP_SMSC1 line shows 0 queued messages.
>>     (Shouldn't that say 17 as well??)
>>
>>     --------------------------------------------------------------------
>>     SMS: received 95 (0 queued), sent 12626 (17 queued), store size 17
>>     SMSC connections:
>>         SMPP_SMSC1    SMPP:x.x.x.x:n/n:username1:smpp (online 74534s, rcvd 0, sent 2615, failed 0, queued 0 msgs)
>>         SMPP_SMSC2    SMPP:x.x.x.x:n/n:username2:smpp (online 83104s, rcvd 5, sent 21, failed 0, queued 0 msgs)
>>         ....etc.
>>     --------------------------------------------------------------------
>> - The 'core' group's log-file shows kannel going through all of the
>>     17 messages every 30 seconds, but frustratingly I can't find anything
>>     in any log file telling me WHY it's not sending the messages.
>>
>>     Can anybody tell me what is the significance of the rightmost numbers on
>>     each line? E.g. 0x9a59e28 appears on each line - what does that mean?
>>     What does "0x9a96ba0 vs 0x9a59e28" mean?
>>     Are the messages having too much fun playing soccer games against each other?
>>
>>         2009-05-14 12:57:09 [29629] [23] DEBUG: sms_router: gwlist_len = 17
>>         2009-05-14 12:57:09 [29629] [23] DEBUG: sms_router: handling message (0x9a59e28 vs 0x9a59e28)
>>         2009-05-14 12:57:09 [29629] [23] DEBUG: sms_router: handling message (0x9a96ba0 vs 0x9a59e28)
>>         ....etc.
>>         2009-05-14 12:57:09 [29629] [23] DEBUG: sms_router: handling message (0x9a5c370 vs 0x9a59e28)
>>         2009-05-14 12:57:09 [29629] [23] DEBUG: sms_router: handling message (0x9a59e28 vs 0x9a59e28)
>>         2009-05-14 12:57:09 [29629] [23] DEBUG: sms_router: time to sleep 30.00 secs.
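
The status-page symptom described above (17 messages in the store, but 0 queued on every SMSC line) could be watched for automatically. A rough sketch, assuming the text status output has the same layout as the snapshot quoted in this thread; queue_mismatch is a made-up name, and fetching the page from the status URL is left out.

```python
import re


def queue_mismatch(status_text):
    """Return (globally queued outgoing messages, sum of per-SMSC queued counts)."""
    # Global line looks like: "sent 12626 (17 queued)"
    global_match = re.search(r"sent \d+ \((\d+) queued\)", status_text)
    global_queued = int(global_match.group(1)) if global_match else 0
    # Per-SMSC lines end with: "queued 0 msgs)"
    per_smsc = [int(n) for n in re.findall(r"queued (\d+) msgs", status_text)]
    return global_queued, sum(per_smsc)


snapshot = """
SMS: received 95 (0 queued), sent 12626 (17 queued), store size 17
SMSC connections:
    SMPP_SMSC1    SMPP:x.x.x.x:n/n:username1:smpp (online 74534s, rcvd 0, sent 2615, failed 0, queued 0 msgs)
    SMPP_SMSC2    SMPP:x.x.x.x:n/n:username2:smpp (online 83104s, rcvd 5, sent 21, failed 0, queued 0 msgs)
"""

global_queued, smsc_queued = queue_mismatch(snapshot)
if global_queued > 0 and smsc_queued == 0:
    print("messages are queued but no SMSC connection is holding them")
```

A single mismatch can be harmless, since messages may be in transit, so a real check would probably only alert if the mismatch persists across several polls.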
>>
>> The only thing that causes these messages to be eventually sent is:
>> - If I manually restart kannel.
>> - Or if there is another temporary disruption to the SMSC connectivity,
>>     and then kannel automatically reconnects (re-binds).
>>
>> As soon as that happens, kannel immediately sends all of the
>> stuck messages to the correct SMPP_SMSC1. It is also only at this time
>> that these messages are logged in the 'core' group's access-log .
>>
>> So why were the messages stuck in the first place?
>> If the enquire_link and enquire_link_resp messages show that the
>> link is supposedly constantly up during this time,
>> why is kannel holding on to these messages?
>>
>> Two things I can think of:
>> a) I found a bug in kannel, e.g. some component of kannel might think
>>     that the link is still down.
>> b) *Something* might be telling kannel that the link is not really up
>>     or not healthy or something?? I don't know.
>>     Should I be looking for something special in the
>>     enquire_link messages, or in the initial bind messages perhaps?
>>     I can post logs if that helps.
>>
>> What also puzzles me is that we have a lot of SMSC's
>> configured in our kannel, but the problem most often
>> seems to occur with messages that we want to route to SMPP_SMSC1.
>> I _think_ it may have also happened with some of the other SMSC's, but
>> I'm not sure.
>>
>> What's slightly special about SMPP_SMSC1 is that it is
>> our 'default' SMSC.
>>
>> All of our other SMSC's like SMPP_SMSC2, SMPP_SMSC3, etc.
>> have a config like the following. The allowed-smsc-id line
>> means that only messages specified with 'smsc=SMPP_SMSC2' will
>> go through SMPP_SMSC2 for example.
>>
>> group = smsc
>> smsc = smpp
>> smsc-id = SMPP_SMSC2
>> host = ....
>> port = ....
>> receive-port = ....
>> smsc-username = ....
>> smsc-password = ....
>> system-type = smpp
>> address-range = ""
>> keepalive = 0
>> log-file = ....
>> log-level = 0
>> msg-id-type = 0x01
>> throughput = 25
>> allowed-smsc-id = SMPP_SMSC2
>> alt-charset = ASCII
>>
>> But SMPP_SMSC1 is our default SMSC. So it will
>> accept messages which specify one of
>> - smsc=SMPP_SMSC1
>> - smsc=jfdkljgltjhgtjljkgflkj (any rubbish value)
>> - No smsc setting
>> The denied-smsc-id line ensures that other
>> messages can't pass through this SMSC.
>>
>> group = smsc
>> smsc = smpp
>> smsc-id = SMPP_SMSC1
>> host = ....
>> port = ....
>> receive-port = ....
>> smsc-username = ....
>> smsc-password = ....
>> system-type = smpp
>> address-range = ""
>> keepalive = 0
>> log-file = ....
>> log-level = 0
>> msg-id-type = 0x01
>> throughput = 25
>> denied-smsc-id = SMPP_SMSC2;SMPP_SMSC3;SMPP_SMSC4;SMPP_SMSC5;....
>> alt-charset = ASCII
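
To make the intended routing explicit, here is a toy model of the rules as described above. It only encodes my description of allowed-smsc-id and denied-smsc-id; it is not kannel's actual router logic, and the accepts function is invented for illustration.

```python
def accepts(smsc_conf, msg_smsc):
    """Toy check: would this smsc group accept a message?

    msg_smsc is the message's smsc value, or None if it has none.
    """
    allowed = smsc_conf.get("allowed-smsc-id")
    denied = smsc_conf.get("denied-smsc-id")
    if allowed is not None:
        # Only messages that name one of the allowed IDs get through.
        return msg_smsc in allowed.split(";")
    if denied is not None and msg_smsc in denied.split(";"):
        return False
    return True


smsc1 = {"smsc-id": "SMPP_SMSC1", "denied-smsc-id": "SMPP_SMSC2;SMPP_SMSC3"}
smsc2 = {"smsc-id": "SMPP_SMSC2", "allowed-smsc-id": "SMPP_SMSC2"}

for value in ("SMPP_SMSC1", "SMPP_SMSC2", "jfdkljgltjhgtjljkgflkj", None):
    targets = [c["smsc-id"] for c in (smsc1, smsc2) if accepts(c, value)]
    print(value, "->", targets)
```

In this model SMPP_SMSC1 takes everything except messages that name one of the denied IDs, which is the 'default' behaviour we rely on.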
>>
>> Yes, all this normally works exactly as I expect.
>> So the messages aren't getting stuck due to some misconfiguration
>> as far as I know, but they get stuck intermittently.
>>
>>
>
>
> --
> Donald Jackson
> http://www.ddj.co.za/
> donaldjster(a)gmail.com
>
>
