I am trying to post this for the third time here. Let's hope it works.
I am using CMUCL 19c on an x86-64 server (running Red Hat Linux). We run a website that uses TBNL (on top of apache2/mod_lisp2) to generate dynamic pages, and we use trivial-http to send GET requests to another server.

The problem used to be that whenever we had 1000 connections open to that server, the GC would go into an infinite loop. Part of the problem was the http-get function in trivial-http (as highlighted by John Wiseman here: http://lemonodor.com/archives/001145.html):

    (defun http-get (url)
      (let* ((host (url-host url))
             (port (url-port url))
             (stream (open-stream host port)))
        (format stream "GET ~A HTTP/1.0~AHost: ~A~AUser-Agent: Trivial HTTP for Common Lisp~A~A"
                url +crlf+ host +crlf+ +crlf+ +crlf+)
        (force-output stream)
        ;; The open stream is returned to the caller; nothing here closes it on error.
        (list (response-read-code stream)
              (response-read-headers stream)
              stream)))

If no exception occurred while sending the request or reading the response (which was done by another function), the connection would close normally. However, if an exception occurred, the stream was never closed and the connection stayed in the CLOSE_WAIT state forever. Once the number of connections reached 1000, the site became completely unresponsive, and on connecting to the Lisp process we saw the GC running repeatedly (as if in an infinite loop), with Lisp's CPU usage at 99%. On quitting Lisp, all those connections were closed.

I have now fixed this by putting the whole request-sending and response-reading code inside an unwind-protect that always closes the stream. The total number of connections at any time is quite low now.

But I am not sure whether this was a bug in CMUCL or something else. Has this been observed before? If the number of connections goes up to 1000 (most of them should be in the ESTABLISHED state now), will the same thing happen again? Right now it's not possible for us to benchmark with 1000 concurrent requests, so we have no idea what will happen when the number of sessions on our site goes up to a few thousand.
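For reference, the unwind-protect fix mentioned above can be sketched roughly as follows. This is a minimal illustration and not the exact production code: the name call-with-stream and its opener/handler arguments are assumptions made up for this example. The demonstration at the bottom uses a string stream in place of a socket so that it runs without a network.

```lisp
;; Hypothetical helper (not part of trivial-http): run HANDLER on the stream
;; returned by OPENER and close that stream no matter how HANDLER exits.
(defun call-with-stream (opener handler)
  (let ((stream nil))
    (unwind-protect
         (progn
           ;; Open inside the protected form so a failure while opening
           ;; still reaches the cleanup clause (with STREAM left as NIL).
           (setf stream (funcall opener))
           (funcall handler stream))
      ;; Cleanup clause: runs on normal return and on non-local exit alike.
      (when stream
        (close stream)))))

;; Demonstration: the handler signals an error after reading one line,
;; yet the stream is closed afterwards.
(let ((s (make-string-input-stream "HTTP/1.0 200 OK")))
  (handler-case
      (call-with-stream (lambda () s)
                        (lambda (in)
                          (read-line in)
                          (error "simulated failure while reading")))
    (error () nil))
  (open-stream-p s)) ; false once the cleanup clause has run
```

With trivial-http, the opener would presumably be something like (lambda () (open-stream host port)), and the handler would do the format, force-output, and response reading that http-get and its caller do today. When the stream never needs to outlive the call, the standard with-open-stream macro gives the same guarantee with less code.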
Any pointers would be helpful. Thanks, Chaitanya