Yep, it is terminal for the connection, just like errors are meant to be. It means that the peer has abruptly hung up the connection. fd means file descriptor, i.e. the socket. You seem to have a lot of sockets open, and the peer is abnormally terminating them while Kannel is expecting to read input. You will get this error when Kannel, after establishing the initial connection, tries to read from or write to it and the peer has reset the connection (sent a TCP RST rather than closing it normally with a FIN).
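To see what "System error 104: Connection reset by peer" looks like from the reading side, here is a minimal sketch in Python. It is only an illustration, not Kannel code and not the Java application from this thread: the function name provoke_reset is made up, everything runs over the loopback interface, and the abrupt hangup is forced with SO_LINGER, which is one common way to make close() send an RST.

```python
import errno
import socket
import struct


def provoke_reset():
    """Return the errno a reader sees when its peer aborts the connection."""
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.bind(("127.0.0.1", 0))  # port 0: let the OS pick a free port
    listener.listen(1)

    client = socket.create_connection(listener.getsockname())
    client.settimeout(5)  # do not hang forever if nothing arrives
    server_side, _ = listener.accept()

    # SO_LINGER with a zero timeout makes close() abort the connection
    # (TCP RST) instead of going through the normal FIN shutdown.
    server_side.setsockopt(
        socket.SOL_SOCKET, socket.SO_LINGER, struct.pack("ii", 1, 0)
    )
    server_side.close()

    try:
        client.recv(1024)  # the read that fails, like Kannel's read on its fd
        return None        # an orderly close would land here instead
    except ConnectionResetError as exc:
        return exc.errno
    finally:
        client.close()
        listener.close()


seen = provoke_reset()
print("reader got:", errno.errorcode.get(seen), seen)
```

The reader did nothing wrong here; the error only reports that the other end dropped the connection, which is why the place to look is the peer.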
Usually, this is a peer issue, i.e. localhost:9090, aka your Java application. Kannel uses persistent connections by default, unless compiled with the option not to. That means that if connections are not closed properly, they accumulate and remain open. You can raise the limit on Linux with ulimit, but more correctly you should make sure that your Java application properly closes every socket it opens. Is every fd=socket() call matched by a close(fd) call? When servers reach their limit on open file descriptors, they react unpredictably. Solaris will freeze for ~5 minutes and then resume or reboot.

BR,
Nikos

----- Original Message -----
From: Marcelo Olivas
To: Nikos Balkanas
Cc: [email protected] ; Guillermo Rendon
Sent: Thursday, July 23, 2009 1:06 AM
Subject: Re: Error 500 in Kannel's HTTP

Hey Nikos, I've been taking a look at the logs, and unfortunately there is not much that I found. I do see a lot of "ERROR: reading from fd ##". Is that bad? Stupid question: what does "fd" mean?

2009-07-20 19:00:20 [7205] [1] ERROR: Error reading from fd 28:
2009-07-20 19:00:20 [7205] [1] ERROR: System error 104: Connection reset by peer
2009-07-20 19:07:34 [7205] [1] ERROR: Error reading from fd 32:
2009-07-20 19:07:34 [7205] [1] ERROR: System error 104: Connection reset by peer
2009-07-20 19:09:47 [7205] [1] ERROR: Error reading from fd 32:
2009-07-20 19:09:47 [7205] [1] ERROR: System error 104: Connection reset by peer
2009-07-20 19:18:59 [7205] [1] ERROR: Error reading from fd 31:
2009-07-20 19:18:59 [7205] [1] ERROR: System error 104: Connection reset by peer
2009-07-20 19:28:05 [7205] [1] ERROR: Error reading from fd 29:
2009-07-20 19:28:05 [7205] [1] ERROR: System error 104: Connection reset by peer
2009-07-20 20:37:52 [7205] [1] ERROR: Error reading from fd 30:
2009-07-20 20:37:52 [7205] [1] ERROR: System error 104: Connection reset by peer
2009-07-21 01:55:43 [7205] [8] ERROR: Couldn't fetch
<http://localhost:9090/midcgw/UpmobileSMSHandler?sender=9711078748&receiver=33123&text=2256+81365392151204318&binary=2256+81365392151204318&time=2009-07-21+05:51:32&smsc-id=TELCEL33123_MX&SMS-ID=4889c38e-e730-4bb6-acca-fc48f5283c29&DeliveryValue=-1&DeliveryReportReply=2256+81365392151204318&sendsms-user=default&message-coding=0&message-class-bits=-1&mwi=-1&message-charset=UTF8&udh=&billing=&account=&serviceid=%v&sessionid=%w&meta-data=%3Fsmpp%3F>
2009-07-20 19:59:12 [7210] [8] ERROR: Error reading from fd 30:
2009-07-20 19:59:12 [7210] [8] ERROR: System error 104: Connection reset by peer
2009-07-20 19:59:12 [7210] [8] ERROR: Couldn't fetch <http://10.10.20.10:9090/midcgw/dlr?uuid=3a7ceed5-4299-4771-a4f6-e4c9e724a46d&dlr-status=8&dlr-errcode=&dlr-tlvs=&registered_delivery=1>
2009-07-21 01:57:33 [7210] [8] ERROR: Couldn't fetch <http://localhost:9090/midcgw/UpmobileSMSHandler?sender=5527698584&receiver=55202&text=Sexy&binary=Sexy&time=2009-07-21+05:51:36&smsc-id=TELCEL5_MX&SMS-ID=79b4761c-23a4-4ba0-aa65-c2dc913ff35c&DeliveryValue=1&DeliveryReportReply=Sexy&sendsms-user=default&message-coding=0&message-class-bits=-1&mwi=-1&message-charset=UTF-8&udh=&billing=&account=&serviceid=%v&sessionid=%w&meta-data=%3Fsmpp%3F>
2009-07-21 01:57:33 [7210] [8] ERROR: Couldn't fetch <http://localhost:9090/midcgw/UpmobileSMSHandler?sender=%2B7876280600&receiver=%2B55225&text=Picante&binary=Picante+&time=2009-07-21+05:52:41&smsc-id=Centennial_PR&SMS-ID=1efdffdf-e361-4076-a4e6-8e7589fe9f0b&DeliveryValue=-1&DeliveryReportReply=Picante+&sendsms-user=default&message-coding=0&message-class-bits=-1&mwi=-1&message-charset=UTF-8&udh=&billing=&account=&serviceid=%v&sessionid=%w&meta-data=%3Fsmpp%3F>
2009-07-21 02:01:46 [7215] [8] ERROR: Couldn't fetch
<http://localhost:9090/midcgw/UpmobileSMSHandler?sender=6391107103&receiver=33123&text=Melate+2256&binary=Melate+2256&time=2009-07-21+05:57:44&smsc-id=TELCEL33123_MX&SMS-ID=6c398ecb-5d13-428a-9ec5-f7f2ed06d238&DeliveryValue=-1&DeliveryReportReply=Melate+2256&sendsms-user=default&message-coding=0&message-class-bits=-1&mwi=-1&message-charset=UTF-8&udh=&billing=&account=&serviceid=%v&sessionid=%w&meta-data=%3Fsmpp%3F>

Hi,

I think that Nagios may be messing with Kannel. Destroying the HTTP client is a standard action. When an HTTP request reaches Kannel, it creates an HTTP client, and when the request finishes it destroys the client to avoid memory leaks. You have to figure out what fd 190 is. Possibly your smsbox. Send the 10 lines before and after the error from your smsbox log. There should be more entries about the failure and the reason. I suspect your server might be running out of sockets (file descriptors).

BR,
Nikos

----- Original Message -----
From: Marcelo Olivas
To: Nikos Balkanas
Cc: [email protected] ; Tino Cuesta
Sent: Tuesday, July 21, 2009 11:09 PM
Subject: Re: Error 500 in Kannel's HTTP

Nikos, sorry for the confusion. Nagios is just a Linux monitoring application. It checks the status of my connections using the status.xml from Kannel's admin module. The peer for the bearerbox is a Java application running on Tomcat on port 9090. Both the Kannel and Tomcat applications are running on the same server. The weird thing is that I don't see any errors in Tomcat. At first I thought it was a network hiccup; however, this has happened more than 3 times now.
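On the suspicion raised in this thread that the server may be running out of file descriptors, the following is a small Python sketch of the two things worth checking: the per-process descriptor limit (the value ulimit -n reports) and whether every opened socket really gets closed. It is an illustration only; the application in this thread is Java, where try-with-resources plays the role of the with-block, and the function names fd_limits and open_and_close are made up for the example.

```python
import resource
import socket


def fd_limits():
    """Return the (soft, hard) per-process limit on open file descriptors."""
    return resource.getrlimit(resource.RLIMIT_NOFILE)


def open_and_close():
    """Open a TCP socket and make sure its descriptor is released again."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        fd_while_open = sock.fileno()  # a small non-negative integer
    # Leaving the with-block calls sock.close(), even if an exception was
    # raised inside it; a closed socket reports -1 as its descriptor.
    return fd_while_open, sock.fileno()


soft, hard = fd_limits()
fd_open, fd_closed = open_and_close()
print(f"soft limit={soft} hard limit={hard}")
print(f"fd while open={fd_open} fd after close={fd_closed}")
```

Raising the limit only postpones the problem if descriptors leak; pairing every open with a guaranteed close is what keeps the count flat.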
Below is my configuration for the BB:

-------------------------------------------------------------------------------------
group = core
admin-port = 13000
smsbox-port = 13001
admin-password = secret
status-password = scret2
log-file = "/opt/kannel/logs/bearerbox.log"
log-level = 0
access-log = "/opt/kannel/logs/access_bearerbox.log"
store-type = spool
store-location = "/opt/kannel/var/spool/bearerbox"
dlr-storage = mysql
black-list = "http://localhost:9090/midcgw/blacklist.txt"

#
# Include the bearerbox DLR storage type.
#
include = "/opt/kannel/etc/module.d/dlr-storage.conf"

#
# The upstream SMSC connection configurations we use.
#
include = "/opt/kannel/etc/smsc.d"

#
# A kludge smsbox group. Bearerbox at least needs to know
# that it should open the smsbox-port by detecting at least
# a smsbox group here.

group = smsbox
-------------------------------------------------------------------------------------

I followed the logs and this is what I noticed:

2009-07-21 02:01:52 [7125] [3] DEBUG: HTTP: Destroying HTTPClient area 0x8ac08888.
2009-07-21 02:01:52 [7125] [3] DEBUG: HTTP: Destroying HTTPClient for `xx.xx.xx.16'.
2009-07-21 02:01:52 [7125] [1] DEBUG: HTTP: Destroying HTTPClient area 0x8ac04680.
2009-07-21 02:01:52 [7125] [1] DEBUG: HTTP: Destroying HTTPClient for `xx.xx.xx.16'.
2009-07-21 02:01:52 [7125] [1] ERROR: Error writing 418 octets to fd 190:
2009-07-21 02:01:52 [7125] [1] ERROR: System error 32: Broken pipe
2009-07-21 02:01:52 [7125] [76] DEBUG: send_msg: sending msg to box: <127.0.0.1>
2009-07-21 02:01:52 [7125] [76] DEBUG: boxc_sender: sent message to <127.0.0.1>
2009-07-21 02:01:52 [7125] [3] DEBUG: HTTP: Destroying HTTPClient area 0x8ac1b4c8.
2009-07-21 02:01:52 [7125] [3] DEBUG: HTTP: Destroying HTTPClient for `xx.xx.xx.16'.
2009-07-21 02:01:52 [7125] [3] DEBUG: HTTP: Destroying HTTPClient area 0x8ac1b5e8.
2009-07-21 02:01:52 [7125] [3] DEBUG: HTTP: Destroying HTTPClient for `xx.xx.xx.16'.
As I mentioned, Nagios is configured so that every 5 minutes it uses the admin module to check the status of my connections. The Nagios IP is xx.xx.xx.16. What does "Destroying HTTPClient for xx.xx.xx.16" mean, and why is it doing it?

Thanks again guys!!

On Jul 21, 2009, at 3:01 PM, Nikos Balkanas wrote:

Hi,

I don't know Nagios. I am assuming from what you say that fd 190 is your Nagios connection. Broken pipe means that bb tried to write to a connection that its peer (Nagios?) had already dropped (abnormally, network issue?). What is your localhost:9090? It seems not to be responding either. First you get the error in the smsbox, and then the bb error follows. How about some configuration?

BR,
Nikos

----- Original Message -----
From: Marcelo Olivas
To: [email protected]
Cc: Tino Cuesta
Sent: Tuesday, July 21, 2009 7:25 PM
Subject: Error 500 in Kannel's HTTP

Hi gurus!!

I'm getting the following error in bearerbox:

2009-07-21 02:01:52 [7125] [1] ERROR: Error writing 418 octets to fd 190:
2009-07-21 02:01:52 [7125] [1] ERROR: System error 32: Broken pipe

The error gives me an alert in my Nagios saying that all my connections (SMPP) are down. I'm getting this error while using the status admin XML page. With the smsbox I'm getting:

2009-07-21 02:01:35 [7205] [8] ERROR: Couldn't fetch <http://localhost:9090/midcgw/UpmobileSMSHandler?sender=xxxxx&receiver=xxx&text=text&time=2009-07-21+05:56:37&smsc-id=TELCEL&SMS-ID=0498316d-046a-4a4b-bf3c-92edf6b74cce&DeliveryValue=-1&DeliveryReportReply=Mv+se%3For+toca+el+corazon+d+mi+bordito+y+regresa+sus+pasos+a+nuestro+hogar+para+siempre%2C+gracias&sendsms-user=default&message-coding=0&message-class-bits=-1&mwi=-1&message-charset=UTF-8&udh=&billing=&account=&serviceid=%v&sessionid=%w&meta-data=%3Fsmpp%3F>

The error always happens at this time. During this time I get the alert in my Nagios server, and in about 5-10 minutes everything seems fine. Can you help me??

Thanks,
Marcelo
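For the "System error 32: Broken pipe" in the original report, here is a minimal Python sketch of how a writer runs into it: the peer closes its end, and the writer keeps sending. This is an illustration over the loopback interface, not Kannel or Nagios code; the function name provoke_broken_pipe is made up, and the 418-byte payload merely mirrors the size in the log line above. Depending on the platform and timing the failing write may report a connection reset instead, so the sketch returns whichever errno it gets.

```python
import errno
import socket
import time


def provoke_broken_pipe():
    """Return the errno a writer sees after its peer has closed the socket."""
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.bind(("127.0.0.1", 0))  # port 0: let the OS pick a free port
    listener.listen(1)

    client = socket.create_connection(listener.getsockname())
    server_side, _ = listener.accept()
    server_side.close()  # the peer hangs up without reading anything
    listener.close()

    try:
        # The first write is usually accepted locally; the peer answers it
        # with a reset, and a later write then fails.
        for _ in range(50):
            client.sendall(b"x" * 418)
            time.sleep(0.05)
        return None
    except OSError as exc:
        return exc.errno
    finally:
        client.close()


seen = provoke_broken_pipe()
print("writer got:", errno.errorcode.get(seen), seen)
```

As with the reset case, the error shows up on the side that kept using the connection, while the cause is the side that went away: here, whatever was on the other end of fd 190.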
