Not the same problem I'm getting. Looks much worse though :( I'll try getting around to figuring out a sure way of reproducing the problem (if I can find the time).
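
For what it's worth, here is a minimal sketch of the kind of test described in the quoted message below: several clients hitting sendsms at the same time, one connection per request (no keep-alive). It is only an illustration, not something I have run against Kannel. The base URL, the query parameter names (`user`, `pass`, `to`, `text`) and the password are assumptions and must be adjusted to the local configuration; only the `/cgi-bin/sendsms` path, the user name `tester` and the recipient number are taken from the quoted log.

```python
import urllib.parse
import urllib.request
from concurrent.futures import ThreadPoolExecutor


def build_sendsms_url(base, user, password, to, text):
    """Build a sendsms-style GET URL. The parameter names are assumptions."""
    query = urllib.parse.urlencode(
        {"user": user, "pass": password, "to": to, "text": text}
    )
    return f"{base}/cgi-bin/sendsms?{query}"


def fetch_without_keepalive(url, timeout=10):
    """Issue one GET on its own connection and return the HTTP status code."""
    request = urllib.request.Request(url, headers={"Connection": "close"})
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.status


def hammer(base, clients=4, per_client=100, fetch=fetch_without_keepalive):
    """Run `clients` workers in parallel, each sending `per_client` requests."""

    def worker(client_id):
        results = []
        for n in range(per_client):
            # "secret" is a placeholder password, not taken from the thread.
            url = build_sendsms_url(
                base, "tester", "secret", "00447796440924", f"test({n})"
            )
            try:
                results.append(fetch(url))
            except OSError as exc:  # URLError and socket errors derive from OSError
                results.append(repr(exc))
        return results

    with ThreadPoolExecutor(max_workers=clients) as pool:
        return list(pool.map(worker, range(clients)))
```

Something like `hammer("http://localhost:13013")` would then start four clients sending 100 messages each (host and port are again assumptions). The `fetch` argument is there so the loop can be tried out with a stub before pointing it at a real smsbox.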
cheers,

Frederik Ammitzbøll
Unwire
Vestergade 12A, 3.
1456 København K

Tel.: +45 33 33 08 70
Mobile: +45 27 11 99 99
Fax : +45 33 33 09 70
Web: www.unwire.dk

> Ok, this is what I got:
> 4 clients, running simultaneously with no keep-alive, each delivering as
> fast as possible. After about 100 messages, I get this in the logs:
>
> 2002-01-28 14:45:31 [3] INFO: sendsms used by <tester>
> 2002-01-28 14:45:31 [3] INFO: sendsms sender:<tester:123> (192.168.1.220) to:<00447796440924> msg:<test(9)>
> 2002-01-28 14:45:31 [3] INFO: Mutex gwlib/list.c:76: 1 locks, 0 collisions.
> 2002-01-28 14:45:31 [3] INFO: Mutex gwlib/list.c:75: 3 locks, 0 collisions.
> 2002-01-28 14:45:31 [3] DEBUG: Status: 202 Answer: <Sent.>
> 2002-01-28 14:45:31 [3] INFO: Mutex gwlib/list.c:76: 1 locks, 0 collisions.
> 2002-01-28 14:45:31 [3] INFO: Mutex gwlib/list.c:75: 24 locks, 0 collisions.
> 2002-01-28 14:45:31 [3] INFO: Mutex gwlib/list.c:76: 1 locks, 0 collisions.
> 2002-01-28 14:45:31 [3] INFO: Mutex gwlib/list.c:75: 184 locks, 0 collisions.
> 2002-01-28 14:45:31 [3] DEBUG: HTTP: Resetting HTTPClient for `192.168.1.220'.
> 2002-01-28 14:45:31 [2] DEBUG: HTTP: Creating HTTPClient for `192.168.1.220'.
> 2002-01-28 14:45:31 [1] INFO: Mutex gwlib/list.c:76: 1 locks, 0 collisions.
> 2002-01-28 14:45:31 [1] INFO: Mutex gwlib/list.c:75: 7 locks, 0 collisions.
> 2002-01-28 14:45:31 [1] DEBUG: HTTP: Destroying HTTPClient area 0x80aaea8.
> 2002-01-28 14:45:31 [1] DEBUG: HTTP: Destroying HTTPClient for `192.168.1.220'.
> 2002-01-28 14:45:31 [1] INFO: Mutex gwlib/conn.c:434: 12 locks, 1 collisions.
> 2002-01-28 14:45:31 [1] INFO: Mutex gwlib/conn.c:435: 4 locks, 0 collisions.
> 2002-01-28 14:45:31 [3] ERROR: mutex_unlock: Mutex failure! called from file gwlib/conn.c at line 144
> 2002-01-28 14:45:31 [3] ERROR: System error 22: Invalid argument
> 2002-01-28 14:45:31 [3] INFO: smsbox: Got HTTP request </cgi-bin/sendsms> from <192.168.1.220>
> 2002-01-28 14:45:31 [3] INFO: sendsms used by <tester>
> 2002-01-28 14:45:31 [3] INFO: sendsms sender:<tester:123> (192.168.1.220) to:<00447796440924> msg:<test(70)>
>
> and then smsbox dies. I don't know what happens here.
>
> Oded Arbel
> m-Wise Inc.
> [EMAIL PROTECTED]
>
> --
> 34% of those who voted Republican in the last election believe Forrest
> Gump is a documentary.
> -- TV Nation Poll
>
> > -----Original Message-----
> > From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED]]
> > Sent: Monday, January 28, 2002 2:18 PM
> > To: KannelDevel
> > Subject: SV: [summary] HTTP/1.1 client requests causing CPU cycle bug/problem
> >
> > I'm not making this up you know ;)
> >
> > If you tested by sending one http-request at a time you won't see this
> > problem. It only occurs when multiple processes/threads send
> > simultaneously.
> >
> > cheers,
> >
> > Frederik Ammitzbøll
> > Unwire
> > Vestergade 12A, 3.
> > 1456 København K
> > Denmark
> >
> > Tel.: +45 33 33 08 70
> > Mobile: +45 27 11 99 99
> > Fax : +45 33 33 09 70
> > Web: www.unwire.dk
> >
> > > -----Original Message-----
> > > From: Oded Arbel [mailto:[EMAIL PROTECTED]]
> > > Sent: 28 January 2002 13:03
> > > To: Frederik Ammitzbøll
> > > Subject: RE: [summary] HTTP/1.1 client requests causing CPU cycle bug/problem
> > >
> > > Hi.
> > >
> > > I tried to reproduce the problem by submitting http messages to the
> > > smsbox at a high rate (about 200 messages a minute), and it did take up
> > > some CPU, but it didn't thrash, and as soon as I stopped with the
> > > messages everything went back to normal.
> > >
> > > We're running latest CVS, on Mandrake 7.2
> > >
> > > Oded Arbel
> > > m-Wise Inc.
> > > [EMAIL PROTECTED]
> > >
> > > --
> > > Any sufficiently advanced bug is indistinguishable from a feature.
> > > -- Kulawiec
> > >
> > > > -----Original Message-----
> > > > From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED]]
> > > > Sent: Monday, January 28, 2002 1:44 PM
> > > > To: KannelDevel
> > > > Subject: SV: [summary] HTTP/1.1 client requests causing CPU cycle bug/problem
> > > >
> > > > The problem of heavy CPU-usage still occurs even with HTTP/1.1 handling
> > > > turned off, though in a different manner than described by Stipe. Multiple
> > > > threads sending lots of simultaneous http-requests to Kannel can have the
> > > > exact same effect of sending the boxes into a power frenzy.
> > > >
> > > > I remember having seen one of our programmers testing the sendsms interface
> > > > with some very early Java code which had some (simple) problems with closing
> > > > an http-transaction. The code would do an initial http-request to
> > > > Kannel which wasn't closed properly for some reason, and the moment the code
> > > > did http-request no. 2, smsbox would eat all of the CPU. This we tried several
> > > > times - always with the same result! (Note that this was AFTER I turned 1.1
> > > > handling off.) It's too long ago for me to show any log files, I'm afraid.
> > > >
> > > > There seems to be a problem with Kannel handling multiple http-requests at
> > > > the same time. We always have to restart Kannel at least a couple of times a
> > > > week because of smsbox gone crazy, and it always occurs after periods of
> > > > heavy traffic, i.e. after lots and lots of http-requests have been sent to
> > > > Kannel.
> > > >
> > > > BTW we're running redhat 6.2.
> > > >
> > > > cheers,
> > > >
> > > > Frederik Ammitzbøll
> > > > Unwire
> > > > Vestergade 12A, 3.
> > > > 1456 København K
> > > >
> > > > Tel.: +45 33 33 08 70
> > > > Mobile: +45 27 11 99 99
> > > > Fax : +45 33 33 09 70
> > > > Web: www.unwire.dk
> > > >
> > > > > Inspired by Jörg's repost of the original problem reported by:
> > > > >
> > > > > Frederik Ammitzbøll <[EMAIL PROTECTED]>
> > > > > Tue, 17 Jul 2001 12:07:02 +0200
> > > > > Msg-ID: <[EMAIL PROTECTED]>
> > > > >
> > > > > I did some in-depth investigation on this and here is a draft of the
> > > > > current summary:
> > > > >
> > > > > As Frederik pointed out there is a *huge* impact when HTTP interfaces
> > > > > are used by HTTP/1.1 compliant user agents (especially IE). When *any*
> > > > > HTTP server running as a thread in *any* box is triggered using a
> > > > > HTTP/1.1 request by IE the request gets served and ~1 minute later the
> > > > > box runs up to 99% of CPU cycles.
> > > > >
> > > > > Keep in mind that this has impact on *any* box, which means:
> > > > >
> > > > > bearerbox: for admin HTTP interface
> > > > > smsbox: for sendsms/sendota HTTP interface
> > > > > wapbox: for PPG interface
> > > > >
> > > > > because they all incorporate Kannel's HTTP server implementation from
> > > > > gwlib/http.c.
> > > > >
> > > > > Here is a log to reproduce the effects seen:
> > > > >
> > > > > $ cvs update gateway
> > > > > $ cd gateway
> > > > > $ ./configure --with-defaults=speed
> > > > > $ make progs
> > > > > $ cd gw
> > > > > $ ./bearerbox wapkannel.conf
> > > > >
> > > > > now at least bearerbox is running and listening to port 13000 for
> > > > > administration commands.
> > > > >
> > > > > Now use URL http://<hostname>:13000/ to access the bearerbox from your
> > > > > MS Internet Explorer. (Unfortunately I could not reproduce the effect
> > > > > using HTTP/1.1 requests sent by hand using telnet, which indicates
> > > > > that IE produces some kind of effect here!)
> > > > >
> > > > > After ~1 minute you get the following effects on the platforms:
> > > > >
> > > > > i686-pc-linux-gnu (SuSE 7.x):
> > > > > 2002-01-26 02:39:45 [4] DEBUG: HTTP: Resetting HTTPClient for '1.2.3.4'.
> > > > > (~1 minute passes)
> > > > > (CPU cycle rate for bearerbox hits 99%)
> > > > >
> > > > > sparc-sun-solaris2.6 (E250):
> > > > > 2002-01-26 02:00:13 [3] DEBUG: HTTP: Resetting HTTPClient for '1.2.3.4'.
> > > > > 2002-01-26 02:01:13 [1] ERROR: Error reading from fd 20:
> > > > > 2002-01-26 02:01:13 [1] ERROR: System error 131: Connection reset by peer
> > > > > 2002-01-26 02:01:13 [1] DEBUG: HTTP: Destroying HTTPClient area 16f448.
> > > > > 2002-01-26 02:01:13 [1] DEBUG: HTTP: Destroying HTTPClient for '1.2.3.4'.
> > > > > (CPU cycle rate for bearerbox stays normal)
> > > > >
> > > > > i686-pc-cygwin (Cygwin 1.3.9):
> > > > > 2002-01-26 02:39:10 [3] DEBUG: HTTP: Resetting HTTPClient for '1.2.3.4'.
> > > > > 2002-01-26 02:40:10 [1] ERROR: Error reading from fd 39:
> > > > > 2002-01-26 02:40:10 [1] ERROR: System error 104: Connection reset by peer
> > > > > 2002-01-26 02:40:10 [1] ERROR: Error reading from fd 39:
> > > > > 2002-01-26 02:40:10 [1] ERROR: System error 104: Connection reset by peer
> > > > > 2002-01-26 02:40:10 [1] DEBUG: HTTP: Destroying HTTPClient area 0x100dacc0.
> > > > > 2002-01-26 02:40:10 [1] DEBUG: HTTP: Destroying HTTPClient for '1.2.3.4'.
> > > > > (CPU cycle rate for bearerbox stays normal)
> > > > >
> > > > > Note the 1 minute (!) difference in the Solaris *and* Cygwin log
> > > > > between the resetting of the HTTPClient and the error!
> > > > >
> > > > > So we have a different behaviour for the same situation on Linux and
> > > > > Solaris. Linux does not give any error message, but the thread seems
> > > > > to burn the CPU time. Solaris and Cygwin instead scream but leave
> > > > > the CPU ticks alone.
> > > > >
> > > > > I guess we have some problem with TCP communication from IE to the
> > > > > HTTP servers of Kannel here?!
> > > > >
> > > > > Any other testers for MacOS X and FreeBSD, who may run this scenario
> > > > > and report what their OS does?
> > > > >
> > > > > Stipe
> > > > >
> > > > > [EMAIL PROTECTED]
> > > > > -------------------------------------------------------------------
> > > > > Wapme Systems AG
> > > > >
> > > > > Münsterstr. 248
> > > > > 40470 Düsseldorf
> > > > >
> > > > > Tel: +49-211-74845-0
> > > > > Fax: +49-211-74845-299
> > > > >
> > > > > E-Mail: [EMAIL PROTECTED]
> > > > > Internet: http://www.wapme-systems.de
> > > > > -------------------------------------------------------------------
> > > > > wapme.net - wherever you are
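
On the Internet Explorer scenario quoted above: the Solaris and Cygwin logs show "Connection reset by peer" one minute after the request was served, so one way to test without a browser might be a small client that sends an HTTP/1.1 keep-alive request, leaves the connection idle, and then aborts it with a TCP reset. The sketch below does that in Python. It is an assumption that this resembles what IE does on the wire; the function name, its parameters and the 60-second default are made up for illustration, and the `SO_LINGER` packing assumes a POSIX system.

```python
import socket
import struct
import time


def request_then_reset(host, port, idle_seconds=60.0, path="/"):
    """Send one HTTP/1.1 keep-alive GET, idle, then abort the connection with RST."""
    sock = socket.create_connection((host, port), timeout=10)
    try:
        request = (
            f"GET {path} HTTP/1.1\r\n"
            f"Host: {host}:{port}\r\n"
            "Connection: Keep-Alive\r\n"
            "\r\n"
        )
        sock.sendall(request.encode("ascii"))
        reply = sock.recv(4096)  # the first chunk of the response is enough here
        time.sleep(idle_seconds)  # keep the connection open without sending anything
        # SO_LINGER enabled with a zero timeout makes close() send RST instead of FIN.
        # struct linger is packed as two ints here, which is the POSIX layout.
        sock.setsockopt(socket.SOL_SOCKET, socket.SO_LINGER, struct.pack("ii", 1, 0))
    finally:
        sock.close()
    # Return just the status line of whatever the server answered.
    return reply.split(b"\r\n", 1)[0].decode("latin-1")
```

Calling `request_then_reset("<hostname>", 13000)` against the bearerbox admin port from the quoted steps, and then watching the log and the CPU usage, would show whether a plain reset after an idle minute is enough to trigger the effect.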