Fwd: Bearerbox panic
I was quickly looking through today's CVS code and wonder whether this was caused by the login being rejected while enquire link tries to send its message anyway and then crashes. Could you tell me exactly where line 1811 is in your CVS 20060727 version?

On 24.08.2006, at 06:03, Mi Reflejo wrote:

Hi,

Which version do you use? If you are not using CVS HEAD please try it.

Regards,
Martin Conte.

On 8/23/06, Stuart Beck [EMAIL PROTECTED] wrote:

Hi All,

We recently had a situation where one of the carriers we connect to (2 separate SMSCs) had some unspecified issue and reconnected. This caused the following panic, as one of the SMSCs had been turned off (on communicating with the carrier we learned it has been switched off for about a week; they are no longer supporting that service).

I would like to know if anyone can give me any indication as to why the failure occurred and what can be done about it.

The Kannel/host details are as follows:

Kannel bearerbox version `cvs-20060727'.
Build `Aug 2 2006 12:31:01', compiler `3.2 20020903 (Red Hat Linux 8.0 3.2-7)'.
System Linux, release 2.4.20-19.8smp, version #1 SMP Tue Jul 15 15:01:43 EDT 2003, machine i686.
Libxml version 2.6.9.
Using OpenSSL 0.9.6b [engine] 9 Jul 2001.
Using native malloc.

2006-08-23 14:39:16 [32374] [38] ERROR: SMPP[hutchthree]: I/O error or other error. Re-connecting.
2006-08-23 14:39:16 [32374] [38] ERROR: SMPP[hutchthree]: Couldn't connect to SMS center (retrying in 10 seconds).
2006-08-23 14:39:16 [32448] [40] ERROR: SMPP[hutchorange]: I/O error or other error. Re-connecting.
2006-08-23 14:39:16 [32448] [40] ERROR: SMPP[hutchorange]: Couldn't connect to SMS center (retrying in 10 seconds).
2006-08-23 14:39:16 [32447] [39] ERROR: SMPP[hutchorange]: I/O error or other error. Re-connecting.
2006-08-23 14:39:16 [32447] [39] ERROR: SMPP[hutchorange]: Couldn't connect to SMS center (retrying in 10 seconds).
2006-08-23 14:39:16 [32373] [37] ERROR: SMPP[hutchthree]: I/O error or other error. Re-connecting.
2006-08-23 14:39:16 [32373] [37] ERROR: SMPP[hutchthree]: Couldn't connect to SMS center (retrying in 10 seconds).
2006-08-23 14:39:26 [32448] [40] WARNING: SMPP: PDU NULL terminated string has no NULL.
2006-08-23 14:39:26 [32448] [40] ERROR: SMPP[hutchorange]: SMSC rejected login to receive, code 0x000f (Invalid System ID).
2006-08-23 14:39:26 [32448] [40] ERROR: SMPP[hutchorange]: I/O error or other error. Re-connecting.
2006-08-23 14:39:26 [32447] [39] WARNING: SMPP: PDU NULL terminated string has no NULL.
2006-08-23 14:39:26 [32447] [39] ERROR: SMPP[hutchorange]: SMSC rejected login to transmit, code 0x000f (Invalid System ID).
2006-08-23 16:24:12 [4905] [3] INFO: HTTP: Re-starting smsc-id `hutchorange'
2006-08-23 16:24:12 [4905] [3] INFO: Set throughput to 13.000 for smsc id hutchorange
2006-08-23 16:24:12 [4905] [3] INFO: DLR rerouting for smsc id hutchorange disabled.
2006-08-23 16:24:13 [16991] [42] WARNING: SMPP: PDU NULL terminated string has no NULL.
2006-08-23 16:24:13 [16991] [42] ERROR: SMPP[hutchorange]: SMSC rejected login to receive, code 0x000f (Invalid System ID).
2006-08-23 16:24:13 [16991] [42] ERROR: SMPP[hutchorange]: I/O error or other error. Re-connecting.
2006-08-23 16:24:13 [16990] [41] WARNING: SMPP: PDU NULL terminated string has no NULL.
2006-08-23 16:24:13 [16990] [41] ERROR: SMPP[hutchorange]: SMSC rejected login to transmit, code 0x000f (Invalid System ID).
2006-08-23 16:24:38 [4905] [3] INFO: HTTP: Re-starting smsc-id `hutchorange'
2006-08-23 16:24:38 [4905] [3] INFO: Set throughput to 13.000 for smsc id hutchorange
2006-08-23 16:24:38 [4905] [3] INFO: DLR rerouting for smsc id hutchorange disabled.
2006-08-23 16:24:38 [17116] [44] WARNING: SMPP: PDU NULL terminated string has no NULL.
2006-08-23 16:24:38 [17116] [44] ERROR: SMPP[hutchorange]: SMSC rejected login to receive, code 0x000f (Invalid System ID).
2006-08-23 16:24:38 [17116] [44] ERROR: SMPP[hutchorange]: I/O error or other error. Re-connecting.
2006-08-23 16:24:38 [17115] [43] WARNING: SMPP: PDU NULL terminated string has no NULL.
2006-08-23 16:24:38 [17115] [43] ERROR: SMPP[hutchorange]: SMSC rejected login to transmit, code 0x000f (Invalid System ID).
2006-08-23 16:24:43 [16990] [41] PANIC: gwlib/octstr.c:2461: seems_valid_real: Assertion `ostr->len + 1 <= ostr->size' failed. (Called from gw/smsc/smsc_smpp.c:1811:io_thread.)
2006-08-23 16:24:43 [16990] [41] PANIC: /opt/kannel/sbin/bearerbox(gw_panic+0xfd) [0x80b794d]
2006-08-23 16:24:43 [16990] [41] PANIC: /opt/kannel/sbin/bearerbox [0x80bcddc]
2006-08-23 16:24:43 [16990] [41] PANIC: /opt/kannel/sbin/bearerbox(octstr_get_cstr_real+0x20) [0x80b8bf0]
2006-08-23 16:24:43 [16990] [41] PANIC: /opt/kannel/sbin/bearerbox [0x8080b5e]
2006-08-23 16:24:43 [16990] [41] PANIC: /opt/kannel/sbin/bearerbox [0x80af177]
2006-08-23 16:24:43 [16990] [41] PANIC:
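
The assertion in the panic line checks that a string's recorded length, plus one byte for a terminator, still fits inside its allocated size. A minimal sketch of that kind of consistency check follows. It is only an illustration: `struct counted_str` and `counted_str_seems_valid` are made-up names, not Kannel's Octstr code, and the exact set of conditions is an assumption. When a check like this fails on a string that was valid earlier, one plausible reading is that the object was overwritten or already destroyed by the time it was used.

```c
#include <stddef.h>

/* Hypothetical stand-in for a length-counted string; NOT Kannel's Octstr. */
struct counted_str {
    unsigned char *data; /* buffer, or NULL when size == 0 */
    long len;            /* bytes in use, excluding the terminator */
    long size;           /* bytes allocated */
};

/* Returns 1 when the bookkeeping is self-consistent, 0 otherwise. */
static int counted_str_seems_valid(const struct counted_str *s)
{
    if (s == NULL || s->len < 0 || s->size < 0)
        return 0;

    /* An empty string owns no buffer at all. */
    if (s->size == 0)
        return s->len == 0 && s->data == NULL;

    /* The payload plus one terminating NUL byte must fit in the buffer. */
    if (s->len + 1 > s->size || s->data == NULL)
        return 0;

    /* The byte just past the payload must be the terminator. */
    return s->data[s->len] == '\0';
}
```

Running a check of this shape at the start of every accessor is what turns silent memory corruption into an immediate, located failure like the one in the log.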
Re: Fwd: Bearerbox panic
Stuart's just left for the day, but I've picked out line 1811 in function io_thread in gw/smsc/smsc_smpp.c from the source code we're using:

...
        /* unbind
         * Read so long as unbind_resp received or timeout passed. Otherwise we have
         * double delivered messages.
         */
        if (smpp->quitting) {
            send_unbind(smpp, conn);
            last_response = time(NULL);
            while (conn_wait(conn, 1.00) != -1 &&
                   difftime(time(NULL), last_response) < SMPP_DEFAULT_SHUTDOWN_TIMEOUT &&
                   smpp->conn->status != SMSCCONN_DISCONNECTED) {
                if (read_pdu(smpp, conn, &len, &pdu) == 1) {
                    dump_pdu("Got PDU:", smpp->conn->id, pdu);
                    handle_pdu(smpp, conn, pdu, &pending_submits);
                    smpp_pdu_destroy(pdu);
                }
            }
            debug("bb.sms.smpp", 0, "SMPP[%s]: %s: break and shutting down",
                  octstr_get_cstr(smpp->conn->id), __func__);    <== Line 1811
            break;
        }

        send_enquire_link(smpp, conn, &last_enquire_sent);
...

Hope this helps...

(Just a side note, and this may be related: on the day this crash occurred, later at night, we had an incident with the same carrier where one of our transmit binds to them died. They were rejecting our bind enquiries and then our bind requests, claiming we were already bound. When I tried to restart the individual link using the stop-smsc and start-smsc HTTP commands, Kannel hung: it no longer processed MOs/MTs and the admin page stopped responding. I tried to stop Kannel in order to restart it, but stopping it caused ~30 bearerbox processes to appear, each of which had to be kill -9'd before I could restart Kannel. Not sure if this is related or not, but I haven't had a chance to investigate further... just thought I'd mention it in case...)

Thanks,

Andreas Fink wrote:

I was quickly looking through today's CVS code and wonder whether this was caused by the login being rejected while enquire link tries to send its message anyway and then crashes. Could you tell me exactly where line 1811 is in your CVS 20060727 version?
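
The unbind wait in the snippet follows a common pattern: keep polling the connection until a response arrives, the connection fails, or a deadline passes. A stripped-down sketch of that pattern is below, for anyone who wants to experiment with it outside of bearerbox. All names here (`poll_fn`, `wait_for_response`, `countdown_poll`) are hypothetical and none of this is Kannel code; the poll callback is assumed to block briefly on its own, the way a one-second connection wait would, so the loop does not spin.

```c
#include <time.h>

/* Hypothetical callback: returns 1 once the awaited response has arrived,
 * 0 if nothing has arrived yet, -1 on connection error. It is expected to
 * block for a short while itself so the caller does not busy-loop. */
typedef int (*poll_fn)(void *ctx);

/* Poll until a response arrives, the connection fails, or timeout_secs
 * have elapsed. Returns 1 on response, 0 on timeout, -1 on error. */
static int wait_for_response(poll_fn poll, void *ctx, double timeout_secs)
{
    time_t start = time(NULL);

    while (difftime(time(NULL), start) < timeout_secs) {
        int r = poll(ctx);
        if (r != 0)
            return r;   /* response (1) or error (-1) ends the wait */
    }
    return 0;           /* deadline passed without a response */
}

/* Demo poller: ctx points to an int countdown. Reports a response once the
 * countdown reaches zero and an error if it is negative. */
static int countdown_poll(void *ctx)
{
    int *remaining = ctx;

    if (*remaining < 0)
        return -1;
    if (*remaining == 0)
        return 1;
    (*remaining)--;
    return 0;
}
```

Calling `wait_for_response(countdown_poll, &n, 5.0)` with `n` set to 3 returns 1 after four polls. Whatever runs after such a loop has to cope with all three outcomes, including the case where the connection object is no longer in the state it was in before the wait.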