Fwd: Bearerbox panic

2006-08-24 Thread Andreas Fink
I was quickly looking through the CVS code of today and wonder if this
was maybe created by the fact that login is being rejected but enquire
link tries to send its message anyway and then crashes.

Could you tell me exactly on your CVS 20060727 version where is line
1811 in that version?

On 24.08.2006, at 06:03, Mi Reflejo wrote:

Hi,

Which version do you use? If you are not using CVS HEAD please try it.

Regards,
Martin Conte.

On 8/23/06, Stuart Beck [EMAIL PROTECTED] wrote:

Hi All,

We recently had a situation where one of the carriers we are
connecting to (2 separate SMSC's) had some unspecified issue and
reconnected, this however caused the following panic as one of the
SMSC's had been turned off (upon communicating with the carrier it's
been switched off for about a week, they are no longer supporting that
service)

I would like to know if anyone can give me any indication as to why
the failure occurred and what can be done about it.

The Kannel/host details are as follows

Kannel bearerbox version `cvs-20060727'.
Build `Aug 2 2006 12:31:01', compiler `3.2 20020903 (Red Hat Linux 8.0 3.2-7)'.
System Linux, release 2.4.20-19.8smp, version #1 SMP Tue Jul 15 15:01:43 EDT 2003, machine i686.
Libxml version 2.6.9. Using OpenSSL 0.9.6b [engine] 9 Jul 2001. Using native malloc.

2006-08-23 14:39:16 [32374] [38] ERROR: SMPP[hutchthree]: I/O error or other error. Re-connecting.
2006-08-23 14:39:16 [32374] [38] ERROR: SMPP[hutchthree]: Couldn't connect to SMS center (retrying in 10 seconds).
2006-08-23 14:39:16 [32448] [40] ERROR: SMPP[hutchorange]: I/O error or other error. Re-connecting.
2006-08-23 14:39:16 [32448] [40] ERROR: SMPP[hutchorange]: Couldn't connect to SMS center (retrying in 10 seconds).
2006-08-23 14:39:16 [32447] [39] ERROR: SMPP[hutchorange]: I/O error or other error. Re-connecting.
2006-08-23 14:39:16 [32447] [39] ERROR: SMPP[hutchorange]: Couldn't connect to SMS center (retrying in 10 seconds).
2006-08-23 14:39:16 [32373] [37] ERROR: SMPP[hutchthree]: I/O error or other error. Re-connecting.
2006-08-23 14:39:16 [32373] [37] ERROR: SMPP[hutchthree]: Couldn't connect to SMS center (retrying in 10 seconds).
2006-08-23 14:39:26 [32448] [40] WARNING: SMPP: PDU NULL terminated string has no NULL.
2006-08-23 14:39:26 [32448] [40] ERROR: SMPP[hutchorange]: SMSC rejected login to receive, code 0x000f (Invalid System ID).
2006-08-23 14:39:26 [32448] [40] ERROR: SMPP[hutchorange]: I/O error or other error. Re-connecting.
2006-08-23 14:39:26 [32447] [39] WARNING: SMPP: PDU NULL terminated string has no NULL.
2006-08-23 14:39:26 [32447] [39] ERROR: SMPP[hutchorange]: SMSC rejected login to transmit, code 0x000f (Invalid System ID).
2006-08-23 16:24:12 [4905] [3] INFO: HTTP: Re-starting smsc-id `hutchorange'
2006-08-23 16:24:12 [4905] [3] INFO: Set throughput to 13.000 for smsc id hutchorange
2006-08-23 16:24:12 [4905] [3] INFO: DLR rerouting for smsc id hutchorange disabled.
2006-08-23 16:24:13 [16991] [42] WARNING: SMPP: PDU NULL terminated string has no NULL.
2006-08-23 16:24:13 [16991] [42] ERROR: SMPP[hutchorange]: SMSC rejected login to receive, code 0x000f (Invalid System ID).
2006-08-23 16:24:13 [16991] [42] ERROR: SMPP[hutchorange]: I/O error or other error. Re-connecting.
2006-08-23 16:24:13 [16990] [41] WARNING: SMPP: PDU NULL terminated string has no NULL.
2006-08-23 16:24:13 [16990] [41] ERROR: SMPP[hutchorange]: SMSC rejected login to transmit, code 0x000f (Invalid System ID).
2006-08-23 16:24:38 [4905] [3] INFO: HTTP: Re-starting smsc-id `hutchorange'
2006-08-23 16:24:38 [4905] [3] INFO: Set throughput to 13.000 for smsc id hutchorange
2006-08-23 16:24:38 [4905] [3] INFO: DLR rerouting for smsc id hutchorange disabled.
2006-08-23 16:24:38 [17116] [44] WARNING: SMPP: PDU NULL terminated string has no NULL.
2006-08-23 16:24:38 [17116] [44] ERROR: SMPP[hutchorange]: SMSC rejected login to receive, code 0x000f (Invalid System ID).
2006-08-23 16:24:38 [17116] [44] ERROR: SMPP[hutchorange]: I/O error or other error. Re-connecting.
2006-08-23 16:24:38 [17115] [43] WARNING: SMPP: PDU NULL terminated string has no NULL.
2006-08-23 16:24:38 [17115] [43] ERROR: SMPP[hutchorange]: SMSC rejected login to transmit, code 0x000f (Invalid System ID).
2006-08-23 16:24:43 [16990] [41] PANIC: gwlib/octstr.c:2461: seems_valid_real: Assertion `ostr->len + 1 <= ostr->size' failed. (Called from gw/smsc/smsc_smpp.c:1811:io_thread.)
2006-08-23 16:24:43 [16990] [41] PANIC: /opt/kannel/sbin/bearerbox(gw_panic+0xfd) [0x80b794d]
2006-08-23 16:24:43 [16990] [41] PANIC: /opt/kannel/sbin/bearerbox [0x80bcddc]
2006-08-23 16:24:43 [16990] [41] PANIC: /opt/kannel/sbin/bearerbox(octstr_get_cstr_real+0x20) [0x80b8bf0]
2006-08-23 16:24:43 [16990] [41] PANIC: /opt/kannel/sbin/bearerbox [0x8080b5e]
2006-08-23 16:24:43 [16990] [41] PANIC: /opt/kannel/sbin/bearerbox [0x80af177]
2006-08-23 16:24:43 [16990] [41] PANIC:

Re: Fwd: Bearerbox panic

2006-08-24 Thread Giulio Harding
Stuart's just left for the day, but I've picked out line 1811 in 
function io_thread in gw/smsc/smsc_smpp.c from the source code we're using:


...

    /* unbind
     * Read so long as unbind_resp received or timeout passed. Otherwise
     * we have double delivered messages.
     */
    if (smpp->quitting) {
        send_unbind(smpp, conn);
        last_response = time(NULL);
        while (conn_wait(conn, 1.00) != -1 &&
               difftime(time(NULL), last_response) < SMPP_DEFAULT_SHUTDOWN_TIMEOUT &&
               smpp->conn->status != SMSCCONN_DISCONNECTED) {
            if (read_pdu(smpp, conn, &len, &pdu) == 1) {
                dump_pdu("Got PDU:", smpp->conn->id, pdu);
                handle_pdu(smpp, conn, pdu, &pending_submits);
                smpp_pdu_destroy(pdu);
            }
        }
        debug("bb.sms.smpp", 0, "SMPP[%s]: %s: break and shutting down",
              octstr_get_cstr(smpp->conn->id), __func__);   /* <== line 1811 */

        break;
    }

    send_enquire_link(smpp, conn, &last_enquire_sent);

...

Hope this helps...
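
For anyone not familiar with the assertion named in the panic, here is a
minimal sketch of the kind of invariant it checks. This is not Kannel's
code: SketchStr and the sketch_* functions are hypothetical names made
up for illustration, and the checks are simplified. The only part taken
from the log is the condition len + 1 <= size, which says the buffer
must have room for the terminating NUL. The accessor asserts that
invariant before handing out a C string, so a string whose fields no
longer agree (however that came about) aborts at the call site, which
is roughly what the trace shows at the octstr_get_cstr() call.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical, simplified counted string; NOT Kannel's real Octstr. */
typedef struct {
    unsigned char *data;  /* NUL-terminated buffer, or NULL when size is 0 */
    long len;             /* bytes in use, excluding the terminator */
    long size;            /* bytes allocated */
} SketchStr;

/* Returns true when the fields are mutually consistent. */
static bool sketch_seems_valid(const SketchStr *s)
{
    if (s == NULL || s->len < 0 || s->size < 0)
        return false;
    if (s->size == 0)
        return s->len == 0 && s->data == NULL;
    return s->len + 1 <= s->size        /* room for the terminator */
        && s->data != NULL
        && s->data[s->len] == '\0';     /* terminator is in place */
}

/* Allocates a string holding a copy of cstr; returns NULL on failure. */
static SketchStr *sketch_create(const char *cstr)
{
    SketchStr *s = malloc(sizeof *s);
    if (s == NULL)
        return NULL;
    s->len = (long) strlen(cstr);
    s->size = s->len + 1;
    s->data = malloc((size_t) s->size);
    if (s->data == NULL) {
        free(s);
        return NULL;
    }
    memcpy(s->data, cstr, (size_t) s->size);  /* copies the NUL too */
    return s;
}

static void sketch_destroy(SketchStr *s)
{
    if (s != NULL) {
        free(s->data);
        free(s);
    }
}

/* Asserting accessor: an inconsistent string aborts here. */
static const char *sketch_get_cstr(const SketchStr *s)
{
    assert(sketch_seems_valid(s));
    return s->size == 0 ? "" : (const char *) s->data;
}
```

With this sketch, a string made by sketch_create("hutchorange") passes
sketch_seems_valid() and sketch_get_cstr() returns its contents. If len
is then overwritten so that len + 1 exceeds size, sketch_seems_valid()
returns false and sketch_get_cstr() would abort on its assertion.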

(Just a side note, which may be related: later at night on the day this
crash occurred, we had an incident with the same carrier where one of
our transmit binds to them died. They were rejecting our bind enquiries
and then our bind requests, claiming we were already bound. When I tried
to restart the individual link using the stop-smsc and start-smsc HTTP
commands, Kannel hung: it was no longer processing MOs/MTs and the admin
page was not responding. I tried to stop Kannel in order to restart it,
but stopping it caused ~30 bearerbox processes to appear, each of which
had to be kill -9'd before I could restart Kannel. I'm not sure whether
this is related, and I haven't had a chance to investigate further; I
just thought I'd mention it in case.)


Thanks,

Andreas Fink wrote:

I was quickly looking through the CVS code of today and wonder if this
was maybe created by the fact that login is being rejected but enquire
link tries to send its message anyway and then crashes.

Could you tell me exactly on your CVS 20060727 version where is line
1811 in that version?

On 24.08.2006, at 06:03, Mi Reflejo wrote:

Hi,

Which version do you use? If you are not using CVS HEAD please try it.

Regards,
Martin Conte.

On 8/23/06, Stuart Beck [EMAIL PROTECTED] wrote:

Hi All,

We recently had a situation where one of the carriers we are
connecting to (2 separate SMSC's) had some unspecified issue and
reconnected, this however caused the following panic as one of the
SMSC's had been turned off (upon communicating with the carrier it's
been switched off for about a week, they are no longer supporting that
service)

I would like to know if anyone can give me any indication as to why
the failure occurred and what can be done about it.

The Kannel/host details are as follows

Kannel bearerbox version `cvs-20060727'.
Build `Aug 2 2006 12:31:01', compiler `3.2 20020903 (Red Hat Linux 8.0 3.2-7)'.
System Linux, release 2.4.20-19.8smp, version #1 SMP Tue Jul 15 15:01:43 EDT 2003, machine i686.
Libxml version 2.6.9. Using OpenSSL 0.9.6b [engine] 9 Jul 2001. Using native malloc.



2006-08-23 14:39:16 [32374] [38] ERROR: SMPP[hutchthree]: I/O error or other error. Re-connecting.
2006-08-23 14:39:16 [32374] [38] ERROR: SMPP[hutchthree]: Couldn't connect to SMS center (retrying in 10 seconds).
2006-08-23 14:39:16 [32448] [40] ERROR: SMPP[hutchorange]: I/O error or other error. Re-connecting.
2006-08-23 14:39:16 [32448] [40] ERROR: SMPP[hutchorange]: Couldn't connect to SMS center (retrying in 10 seconds).
2006-08-23 14:39:16 [32447] [39] ERROR: SMPP[hutchorange]: I/O error or other error. Re-connecting.
2006-08-23 14:39:16 [32447] [39] ERROR: SMPP[hutchorange]: Couldn't connect to SMS center (retrying in 10 seconds).
2006-08-23 14:39:16 [32373] [37] ERROR: SMPP[hutchthree]: I/O error or other error. Re-connecting.
2006-08-23 14:39:16 [32373] [37] ERROR: SMPP[hutchthree]: Couldn't connect to SMS center (retrying in 10 seconds).
2006-08-23 14:39:26 [32448] [40] WARNING: SMPP: PDU NULL terminated string has no NULL.
2006-08-23 14:39:26 [32448] [40] ERROR: SMPP[hutchorange]: SMSC rejected login to receive, code 0x000f (Invalid System ID).
2006-08-23 14:39:26 [32448] [40] ERROR: SMPP[hutchorange]: I/O error or other error. Re-connecting.
2006-08-23 14:39:26 [32447] [39] WARNING: SMPP: PDU NULL terminated string has no NULL.
2006-08-23 14:39:26 [32447] [39] ERROR: SMPP[hutchorange]: SMSC rejected login to transmit, code 0x000f (Invalid System ID).
2006-08-23 16:24:12 [4905] [3] INFO: HTTP: Re-starting smsc-id `hutchorange'
 2006-08-23 16:24:12 [4905] [3] INFO: Set throughput to 13.000 for