This is a bug in 1.4.1. I can't remember the exact details, but it's to do with one of the 2 binds dropping.
The solution to this problem is, if you have an smsc group with a tx/rx pair like:

# TX/RX pair
group=smsc
smsc=smpp
host=smpp.host.com
port=2345
receive-port=2345
smsc-id=mysmsc
...

You can split it into 2 SMSC's, like:

# TX
group=smsc
smsc=smpp
host=smpp.host.com
port=2345
smsc-id=mysmsc
...

# RX
group=smsc
smsc=smpp
host=smpp.host.com
receive-port=2345
smsc-id=mysmsc
...

Hope this helps,

Thanks,
Donald
http://www.ddj.co.za

2009/5/15 shaded 4 <[email protected]>

> Thanks very much for your reply, Abdulmnem!
>
> I don't think that's the problem in this case.
> From what I can tell from the logs, kannel successfully
> binds to both the transmitter and the receiver streams,
> and it shows apparently successful enquire_link/enquire_link_resp
> messages going in both directions, every 30 seconds for each stream.
>
> (E.g. enquire_links for the transmitter at 4:30:03, 4:30:33, 4:31:03, etc
> and for the receiver at 4:30:11, 4:30:41, 4:31:11, etc.)
>
> Any other suggestions from anyone?
>
> Although I think the problem lies elsewhere, can
> anyone else verify Abdulmnem's suggestion that this might be
> some bug in v1.4.1?
> Has it been fixed in the more recent versions?
> From http://kannel.org/news.shtml, I can't tell if it lists
> this problem. I know in our environment it's a lot of
> work and bureaucracy to upgrade versions of software,
> so I don't want to do it unnecessarily.
>
> Pardon my ignorance, but does 'transceiver mode' mean that
> it would connect to both the transmitter/receiver streams
> via a single connection?
>
> We have always been using separate transmitter and receiver
> connections, as the SMSC support team has told us in the past to do it
> this way and to not use the transceiver mode (I'm not sure why -
> I have no experience with SMSC's directly).
>
> But we don't need the receiver on this particular SMSC, and
> I have now disabled it. I'm not really confident that this
> will fix the problem, but we'll see.
> In any case, it certainly won't hurt.
>
> Monim Benaiad wrote:
> --------------------------------------------------
>
> > Hi,
> >
> > Is there any known problem in kannel such that it sometimes
> > refuses to send messages? (I'm talking about just transmitter
> > functionality here, not receiver.)
> > It seems to happen sometimes after connectivity to a SMSC is
> > disrupted and reconnected.
> >
> > The only way that kannel eventually sends the messages
> > is if I restart kannel, or after it again loses and regains
> > connectivity the next time.
> > ...etc.
> >
>
> AFAIK, The "transceiver-mode" variable in the smsc group will solve this
> problem.
> I think it's a bug in that version, because Kannel reconnect the receiver
> only.
> also remove the "receive-port".
>
> TIA
>
> Abdulmnem Benaiad
> Almontaha
> almontaha.com
>
> shaded4 wrote:
> --------------------------------------------------
>
> Hi,
>
> Is there any known problem in kannel such that it sometimes
> refuses to send messages? (I'm talking about just transmitter
> functionality here, not receiver.)
> It seems to happen sometimes after connectivity to a SMSC is
> disrupted and reconnected.
>
> The only way that kannel eventually sends the messages
> is if I restart kannel, or after it again loses and regains
> connectivity the next time.
>
> We have been using kannel 1.4.1 for years. According to
> http://kannel.org/news.shtml, it's a stable version which
> has been out for some years before the very recent newer versions.
>
> I haven't seen this problem in production before (maybe because
> connectivity to the SMSC's is rarely disrupted there), but
> we have noticed it in our test environment where disruptions are more
> common.
>
> What seems to happen:
> - Kannel is working fine, sending messages to the correct SMSC's
>   exactly as expected. So apparently we have it configured correctly.
> - Connectivity to the SMSC's is temporarily lost for whatever reason.
>   E.g.
>   network connectivity slowness or outage, etc.
> - When network connectivity is restored, kannel automatically reconnects
>   (re-binds, if that's the correct terminology).
>   No problem so far.
>
> That's all ok, but sometimes any future messages targeted to some
> particular SMSC (e.g. call it SMPP_SMSC1) now get stuck in kannel and
> refuse to budge. Messages to other SMSC's still go through ok.
>
> This is what I observe when I try to subsequently send 17 messages to kannel
> which in normal circumstances always correctly route to SMPP_SMSC1.
> I have all of kannel's logs enabled to maximum verbosity, i.e. log-level=0.
> - I see the messages go into kannel's store-file and not budging.
> - The messages appear in smsbox's access-log and log-file, but they
>   do NOT appear (yet) in the 'core' group's access-log.
> - Kannel and the SMSC are constantly sending enquire_link and enquire_link_resp
>   messages every 30 seconds, which show that the connection is
>   supposedly constantly active during this time.
> - Running kannel's status snapshot (i.e. http://127.0.0.1:13000/status)
>   shows the general SMS section reporting that there are 17 messages
>   in the queued store, but the SMPP_SMSC1 line shows 0 queued messages.
>   (Shouldn't that say 17 as well??)
>
>   --------------------------------------------------------------------
>   SMS: received 95 (0 queued), sent 12626 (17 queued), store size 17
>   SMSC connections:
>   SMPP_SMSC1 SMPP:x.x.x.x:n/n:username1:smpp (online 74534s, rcvd 0, sent 2615, failed 0, queued 0 msgs)
>   SMPP_SMSC2 SMPP:x.x.x.x:n/n:username2:smpp (online 83104s, rcvd 5, sent 21, failed 0, queued 0 msgs)
>   ....etc.
>   --------------------------------------------------------------------
> - The 'core' group's log-file shows kannel going through all of the
>   17 messages every 30 seconds, but frustratingly I can't find anything
>   in any log file telling me WHY it's not sending the messages.
>
> Can anybody tell me what is the significance of the rightmost numbers on
> each line. E.g. 0x9a59e28 appears on each line - what does that mean?
> What does "0x9a96ba0 vs 0x9a59e28" mean?
> Are the messages having too much fun playing soccer games against each
> other?
>
> 2009-05-14 12:57:09 [29629] [23] DEBUG: sms_router: gwlist_len = 17
> 2009-05-14 12:57:09 [29629] [23] DEBUG: sms_router: handling message (0x9a59e28 vs 0x9a59e28)
> 2009-05-14 12:57:09 [29629] [23] DEBUG: sms_router: handling message (0x9a96ba0 vs 0x9a59e28)
> ....etc.
> 2009-05-14 12:57:09 [29629] [23] DEBUG: sms_router: handling message (0x9a5c370 vs 0x9a59e28)
> 2009-05-14 12:57:09 [29629] [23] DEBUG: sms_router: handling message (0x9a59e28 vs 0x9a59e28)
> 2009-05-14 12:57:09 [29629] [23] DEBUG: sms_router: time to sleep 30.00 secs.
>
> The only thing that causes these messages to be eventually sent is:
> - If I manually restart kannel.
> - Or if there is another temporary disruption to the SMSC connectivity,
>   and then kannel automatically reconnects (re-binds).
>
> As soon as that happens, kannel immediately sends all of the
> stuck messages to the correct SMPP_SMSC1. It is also only at this time
> that these messages are logged in the 'core' group's access-log.
>
> So why were the messages stuck in the first place?
> If the enquire_link and enquire_link_resp messages show that
> the link is supposedly constantly up during this time,
> why is kannel holding on to these messages?
>
> Two things I can think of:
> a) I found a bug in kannel, e.g. some component of kannel might think
>    that the link is still down.
> b) *Something* might be telling kannel that the link is not really up
>    or not healthy or something?? I don't know.
>    Should I be looking for something special in the
>    enquire_link messages, or in the initial bind messages perhaps?
>    I can post logs if that helps.
>
> What also puzzles me is that we have a lot of SMSC's
> configured in our kannel, but the problem most often
> seems to occur with messages that we want to route to SMPP_SMSC1.
> I _think_ it may have also happened with some of the other SMSC's, but
> I'm not sure.
>
> What's slightly special about SMPP_SMSC1 is that it is
> our 'default' SMSC.
>
> All of our other SMSC's like SMPP_SMSC2, SMPP_SMSC3, etc.
> have a config like the following. The allowed-smsc-id line
> means that only messages specified with 'smsc=SMPP_SMSC2' will
> go through SMPP_SMSC2 for example.
>
> group = smsc
> smsc = smpp
> smsc-id = SMPP_SMSC2
> host = ....
> port = ....
> receive-port = ....
> smsc-username = ....
> smsc-password = ....
> system-type = smpp
> address-range = ""
> keepalive = 0
> log-file = ....
> log-level = 0
> msg-id-type = 0x01
> throughput = 25
> allowed-smsc-id = SMPP_SMSC2
> alt-charset = ASCII
>
> But SMPP_SMSC1 is our default SMSC. So it will
> accept messages which specify one of
> - smsc=SMPP_SMSC1
> - smsc=jfdkljgltjhgtjljkgflkj (any rubbish value)
> - No smsc setting
> The denied-smsc-id line ensures that other
> messages can't pass through this SMSC.
>
> group = smsc
> smsc = smpp
> smsc-id = SMPP_SMSC1
> host = ....
> port = ....
> receive-port = ....
> smsc-username = ....
> smsc-password = ....
> system-type = smpp
> address-range = ""
> keepalive = 0
> log-file = ....
> log-level = 0
> msg-id-type = 0x01
> throughput = 25
> denied-smsc-id = SMPP_SMSC2;SMPP_SMSC3;SMPP_SMSC4;SMPP_SMSC5;....
> alt-charset = ASCII
>
> Yes, all this normally works exactly as I expect.
> So the messages aren't getting stuck due to some misconfiguration
> as far as I know, but they get stuck intermittently.
>

--
Donald Jackson
http://www.ddj.co.za/
donaldjster(a)gmail.com
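
The symptom described in the thread (a non-empty store while every SMSC line reports 0 queued messages) could be watched for with a small script. The following is only a rough sketch, assuming the status text has the same shape as the snapshot quoted above. The function names `parse_status` and `looks_stuck` are made up for illustration and are not part of Kannel, and the regular expressions are guesses based on that one sample, so they may need adjusting for other versions or SMSC types.

```python
import re


def parse_status(text):
    """Return (store_size, {smsc_id: queued}) from Kannel status text.

    Assumes the layout of the snapshot quoted in this thread; this is
    not an official parser for the status page.
    """
    # Collapse whitespace so entries broken by mail wrapping become one line.
    flat = " ".join(text.split())

    store = re.search(r"store size (\d+)", flat)
    store_size = int(store.group(1)) if store else 0

    # Each SMSC entry looks like:
    #   <smsc-id> SMPP:<details> (online ..., queued N msgs)
    per_smsc = {
        m.group(1): int(m.group(2))
        for m in re.finditer(
            r"(\S+)\s+SMPP:\S+\s+\(online[^)]*?queued (\d+) msgs\)", flat
        )
    }
    return store_size, per_smsc


def looks_stuck(text):
    """True if the store holds messages but no SMSC reports any queued."""
    store_size, per_smsc = parse_status(text)
    return (
        store_size > 0
        and bool(per_smsc)
        and all(queued == 0 for queued in per_smsc.values())
    )


if __name__ == "__main__":
    sample = """
    SMS: received 95 (0 queued), sent 12626 (17 queued), store size 17
    SMSC connections:
    SMPP_SMSC1 SMPP:x.x.x.x:n/n:username1:smpp (online 74534s, rcvd 0, sent 2615, failed 0, queued 0 msgs)
    SMPP_SMSC2 SMPP:x.x.x.x:n/n:username2:smpp (online 83104s, rcvd 5, sent 21, failed 0, queued 0 msgs)
    """
    print(parse_status(sample))
    print(looks_stuck(sample))
```

The script only reads text that has already been fetched, so how the status page is retrieved (and any status password) is left out on purpose. A positive result only flags the mismatch seen in this thread; it does not say why the messages are being held.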
