Stuart's just left for the day, but I've picked out line 1811 in
function io_thread in gw/smsc/smsc_smpp.c from the source code we're using:
...
    /* unbind
     * Read so long as unbind_resp received or timeout passed. Otherwise
     * we have double delivered messages.
     */
    if (smpp->quitting) {
        send_unbind(smpp, conn);
        last_response = time(NULL);
        while (conn_wait(conn, 1.00) != -1 &&
               difftime(time(NULL), last_response) < SMPP_DEFAULT_SHUTDOWN_TIMEOUT &&
               smpp->conn->status != SMSCCONN_DISCONNECTED) {
            if (read_pdu(smpp, conn, &len, &pdu) == 1) {
                dump_pdu("Got PDU:", smpp->conn->id, pdu);
                handle_pdu(smpp, conn, pdu, &pending_submits);
                smpp_pdu_destroy(pdu);
            }
        }
        debug("bb.sms.smpp", 0, "SMPP[%s]: %s: break and shutting down",
              octstr_get_cstr(smpp->conn->id), __func__);
<<<<================== Line 1811
        break;
    }
    send_enquire_link(smpp, conn, &last_enquire_sent);
...
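
For what it's worth, the assertion in the panic (`ostr->len + 1 <= ostr->size')
is a sanity check on the string handed to octstr_get_cstr(), which at line 1811
is smpp->conn->id. Below is a rough standalone sketch of what a check like that
guards against. It is not the Kannel code: ToyOctstr, toy_create, toy_destroy
and toy_seems_valid are made-up names, and the struct layout is only my
assumption of a length/size/buffer string. One way such a check can fail is if
the struct's memory has been overwritten or reused, which the sketch imitates
by clobbering the size field.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Toy stand-in for an octet string. NOT Kannel's real Octstr layout. */
typedef struct {
    unsigned char *data; /* buffer, kept NUL-terminated */
    long len;            /* bytes in use, not counting the trailing NUL */
    long size;           /* bytes allocated for data */
} ToyOctstr;

/* Hypothetical helper: build a ToyOctstr from a C string. */
ToyOctstr *toy_create(const char *s)
{
    ToyOctstr *o;

    assert(s != NULL);
    o = malloc(sizeof *o);
    if (o == NULL)
        return NULL;
    o->len = (long) strlen(s);
    o->size = o->len + 1;
    o->data = malloc((size_t) o->size);
    if (o->data == NULL) {
        free(o);
        return NULL;
    }
    memcpy(o->data, s, (size_t) o->size); /* copies the NUL as well */
    return o;
}

/* Hypothetical helper: release a ToyOctstr. */
void toy_destroy(ToyOctstr *o)
{
    if (o != NULL) {
        free(o->data);
        free(o);
    }
}

/* Returns 1 if the header fields are consistent, 0 otherwise. */
int toy_seems_valid(const ToyOctstr *ostr)
{
    if (ostr == NULL || ostr->data == NULL)
        return 0;
    if (ostr->len < 0 || ostr->size < 0)
        return 0;
    /* Same condition as in the panic message: len bytes plus the NUL must fit. */
    if (!(ostr->len + 1 <= ostr->size))
        return 0;
    return ostr->data[ostr->len] == '\0';
}
```

A string fresh from toy_create() passes toy_seems_valid(); if its size field is
then overwritten with something smaller than len + 1, the check fails, which is
the same condition the real assertion tripped on.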
Hope this helps...
(Just a side note, and this may be related: later at night on the day this
crash occurred, we had an incident with the same carrier where one of our
transmit binds to them died - they were rejecting our bind enquiries and
then our bind requests, claiming we were already bound. When I tried to
restart the individual link using the stop-smsc and start-smsc HTTP
commands, Kannel hung: it stopped processing MOs/MTs and the admin page
stopped responding. I tried to stop Kannel in order to restart it, but
stopping it caused ~30 bearerbox processes to appear, each of which had to
be kill -9'd before I could restart Kannel. Not sure if this is related or
not, and I haven't had a chance to investigate further... just thought I'd
mention it in case...)
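
In case it helps when reading the logs: the 0x0000000f in the login rejections
is a standard SMPP v3.4 command_status value, and an "already bound" rejection
would normally be ESME_RALYBND (0x00000005), though I didn't capture the actual
code the carrier sent that night. A small lookup sketch for the bind-related
codes follows; SmppStatus, bind_statuses and smpp_status_lookup are my own
names for illustration, not anything from Kannel.

```c
#include <stddef.h>
#include <stdint.h>

/* One command_status entry: numeric code, spec name, spec description. */
typedef struct {
    uint32_t code;
    const char *name;
    const char *meaning;
} SmppStatus;

/* Bind-related command_status values as listed in the SMPP v3.4 spec. */
static const SmppStatus bind_statuses[] = {
    { 0x00000000, "ESME_ROK",       "No Error" },
    { 0x00000005, "ESME_RALYBND",   "ESME Already in Bound State" },
    { 0x0000000D, "ESME_RBINDFAIL", "Bind Failed" },
    { 0x0000000E, "ESME_RINVPASWD", "Invalid Password" },
    { 0x0000000F, "ESME_RINVSYSID", "Invalid System ID" },
};

/* Hypothetical helper: returns the matching entry, or NULL if not in the table. */
const SmppStatus *smpp_status_lookup(uint32_t code)
{
    size_t i;

    for (i = 0; i < sizeof bind_statuses / sizeof bind_statuses[0]; i++) {
        if (bind_statuses[i].code == code)
            return &bind_statuses[i];
    }
    return NULL;
}
```

Looking up 0x0000000f gives the ESME_RINVSYSID entry, matching the "Invalid
System ID" text Kannel printed in the log below.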
Thanks,
Andreas Fink wrote:
I was quickly looking through the CVS code of today and wonder if this
was maybe created by the fact that login is being rejected but enquire
link tries to send its message anyway and then crashes.
Could you tell me exactly where line 1811 is in your CVS 20060727
version?
On 24.08.2006, at 06:03, Mi Reflejo wrote:
Hi,
Which version do you use? If you are not using CVS HEAD please try it.
Regards,
Martin Conte.
On 8/23/06, Stuart Beck <[EMAIL PROTECTED]> wrote:
Hi All,
We recently had a situation where one of the carriers we connect to
(via 2 separate SMSCs) had some unspecified issue and reconnected.
This caused the following panic, as one of the SMSCs had been turned
off (on talking to the carrier, we found it has been switched off for
about a week; they are no longer supporting that service).
I would like to know if anyone can give me any indication as to why
the failure occurred and what can be done about it.
The Kannel/host details are as follows:
Kannel bearerbox version `cvs-20060727'.
Build `Aug 2 2006 12:31:01', compiler `3.2 20020903 (Red Hat Linux 8.0 3.2-7)'.
System Linux, release 2.4.20-19.8smp, version #1 SMP Tue Jul 15 15:01:43 EDT 2003, machine i686.
Libxml version 2.6.9. Using OpenSSL 0.9.6b [engine] 9 Jul 2001. Using native malloc.
> 2006-08-23 14:39:16 [32374] [38] ERROR: SMPP[hutchthree]: I/O error or other error. Re-connecting.
> 2006-08-23 14:39:16 [32374] [38] ERROR: SMPP[hutchthree]: Couldn't connect to SMS center (retrying in 10 seconds).
> 2006-08-23 14:39:16 [32448] [40] ERROR: SMPP[hutchorange]: I/O error or other error. Re-connecting.
> 2006-08-23 14:39:16 [32448] [40] ERROR: SMPP[hutchorange]: Couldn't connect to SMS center (retrying in 10 seconds).
> 2006-08-23 14:39:16 [32447] [39] ERROR: SMPP[hutchorange]: I/O error or other error. Re-connecting.
> 2006-08-23 14:39:16 [32447] [39] ERROR: SMPP[hutchorange]: Couldn't connect to SMS center (retrying in 10 seconds).
> 2006-08-23 14:39:16 [32373] [37] ERROR: SMPP[hutchthree]: I/O error or other error. Re-connecting.
> 2006-08-23 14:39:16 [32373] [37] ERROR: SMPP[hutchthree]: Couldn't connect to SMS center (retrying in 10 seconds).
> 2006-08-23 14:39:26 [32448] [40] WARNING: SMPP: PDU NULL terminated string has no NULL.
> 2006-08-23 14:39:26 [32448] [40] ERROR: SMPP[hutchorange]: SMSC rejected login to receive, code 0x0000000f (Invalid System ID).
> 2006-08-23 14:39:26 [32448] [40] ERROR: SMPP[hutchorange]: I/O error or other error. Re-connecting.
> 2006-08-23 14:39:26 [32447] [39] WARNING: SMPP: PDU NULL terminated string has no NULL.
> 2006-08-23 14:39:26 [32447] [39] ERROR: SMPP[hutchorange]: SMSC rejected login to transmit, code 0x0000000f (Invalid System ID).
> 2006-08-23 16:24:12 [4905] [3] INFO: HTTP: Re-starting smsc-id `hutchorange'
> 2006-08-23 16:24:12 [4905] [3] INFO: Set throughput to 13.000 for smsc id <hutchorange>
> 2006-08-23 16:24:12 [4905] [3] INFO: DLR rerouting for smsc id <hutchorange> disabled.
> 2006-08-23 16:24:13 [16991] [42] WARNING: SMPP: PDU NULL terminated string has no NULL.
> 2006-08-23 16:24:13 [16991] [42] ERROR: SMPP[hutchorange]: SMSC rejected login to receive, code 0x0000000f (Invalid System ID).
> 2006-08-23 16:24:13 [16991] [42] ERROR: SMPP[hutchorange]: I/O error or other error. Re-connecting.
> 2006-08-23 16:24:13 [16990] [41] WARNING: SMPP: PDU NULL terminated string has no NULL.
> 2006-08-23 16:24:13 [16990] [41] ERROR: SMPP[hutchorange]: SMSC rejected login to transmit, code 0x0000000f (Invalid System ID).
> 2006-08-23 16:24:38 [4905] [3] INFO: HTTP: Re-starting smsc-id `hutchorange'
> 2006-08-23 16:24:38 [4905] [3] INFO: Set throughput to 13.000 for smsc id <hutchorange>
> 2006-08-23 16:24:38 [4905] [3] INFO: DLR rerouting for smsc id <hutchorange> disabled.
> 2006-08-23 16:24:38 [17116] [44] WARNING: SMPP: PDU NULL terminated string has no NULL.
> 2006-08-23 16:24:38 [17116] [44] ERROR: SMPP[hutchorange]: SMSC rejected login to receive, code 0x0000000f (Invalid System ID).
> 2006-08-23 16:24:38 [17116] [44] ERROR: SMPP[hutchorange]: I/O error or other error. Re-connecting.
> 2006-08-23 16:24:38 [17115] [43] WARNING: SMPP: PDU NULL terminated string has no NULL.
> 2006-08-23 16:24:38 [17115] [43] ERROR: SMPP[hutchorange]: SMSC rejected login to transmit, code 0x0000000f (Invalid System ID).
> 2006-08-23 16:24:43 [16990] [41] PANIC: gwlib/octstr.c:2461: seems_valid_real: Assertion `ostr->len + 1 <= ostr->size' failed. (Called from gw/smsc/smsc_smpp.c:1811:io_thread.)
> 2006-08-23 16:24:43 [16990] [41] PANIC: /opt/kannel/sbin/bearerbox(gw_panic+0xfd) [0x80b794d]
> 2006-08-23 16:24:43 [16990] [41] PANIC: /opt/kannel/sbin/bearerbox [0x80bcddc]
> 2006-08-23 16:24:43 [16990] [41] PANIC: /opt/kannel/sbin/bearerbox(octstr_get_cstr_real+0x20) [0x80b8bf0]
> 2006-08-23 16:24:43 [16990] [41] PANIC: /opt/kannel/sbin/bearerbox [0x8080b5e]
> 2006-08-23 16:24:43 [16990] [41] PANIC: /opt/kannel/sbin/bearerbox [0x80af177]
> 2006-08-23 16:24:43 [16990] [41] PANIC: /lib/i686/libpthread.so.0 [0x400b1881]
> 2006-08-23 16:24:43 [16990] [41] PANIC: /lib/i686/libc.so.6(__clone+0x57) [0x420e40c7]
--
Stuart Beck
Systems Administrator
m.Net Corporation
Level 13, 99 Gawler Place
Adelaide SA 5000, Australia
--
Giulio Harding
Systems Administrator
m.Net Corporation
Level 13, 99 Gawler Place
Adelaide SA 5000, Australia
Tel: +61 8 8210 2041
Fax: +61 8 8211 9620
Mobile: 0432 876 733
MSN: [EMAIL PROTECTED]
http://www.mnetcorporation.com