Happened again. This time I left varnishlog running, and got...
989 TxStatus c 503
989 TxResponse c Service Unavailable
989 TxHeader c Server: Varnish
989 TxHeader c Date: Wed, 09 Dec 2009 00:38:07 GMT
989 TxHeader c Age: 0
989 TxHeader c Connection: close
989 ReqEnd c 1933950257 1260319087.004358053 1260319087.004553556
0.67711 0.000121117 0.74387
989 SessionClose c error
989 StatSess c 189.165.32.95 1480 0 1 1 0 0 0 117 0
5 Debug- Too many open files when accept(2)ing. Sleeping.
EOF
That Too many... debug message appears 709 times. First one:
line number 14292239
1023 TxProtocol c HTTP/1.1
1023 TxStatus c 200
1023 TxResponse c OK
1023 TxHeader c Server: xxx
1023 TxHeader c Cache-Control: public, max-age=3600
1023 TxHeader c Content-Type: application/x-shockwave-flash
1023 TxHeader c Content-Length: 659
1023 TxHeader c Date: Wed, 09 Dec 2009 00:10:05 GMT
1023 TxHeader c Age: 972
1023 TxHeader c Connection: keep-alive
1023 ReqEnd c 1933931697 1260317405.392798424 1260317405.392909527
0.62466 0.48637 0.62466
706 Debugc herding
695 Debugc herding
1007 Debugc herding
1023 Debugc herding
5 Debug- Too many open files when accept(2)ing. Sleeping.
937 ReqStart c 190.13.3.173 51259 1933931698
937 RxRequestc GET
937 RxURLc xxx
937 RxProtocol c HTTP/1.1
937 RxHeader c Accept: */*
937 RxHeader c Referer: xxx
937 RxHeader c Accept-Language: es-co
937 RxHeader c UA-CPU: x86
937 RxHeader c Accept-Encoding: gzip, deflate
937 RxHeader c User-Agent: xxx
937 RxHeader c Host: xxx
937 RxHeader c Connection: Keep-Alive
937 RxHeader c Cookie: xxx=xxx
937 VCL_call c recv
937 VCL_return c lookup
937 VCL_call c hash
937 VCL_return c hash
937 VCL_call c miss
937 VCL_return c fetch
937 FetchError c no backend connection
937 VCL_call c error
937 VCL_return c deliver
Varnish account does not have limits (ulimit returns unlimited). Memory
usage (top): 403m/89m/81m.
# ps -AF | grep varnish
root 19223 1 0 21419 928 2 Dec08 ?00:00:00 varnishd -a
*:80 -f xxx.vcl -p obj_workspace 4096 -s malloc,100M -p thread_pools 4
nobody 19224 19223 0 103378 91244 3 Dec08 ?00:06:52 varnishd -a
*:80 -f xxx.vcl -p obj_workspace 4096 -s malloc,100M -p thread_pools 4
# kill 19224
# ps -AF | grep varnish
root 19223 1 0 21419 960 2 Dec08 ?00:00:00 varnishd -a
*:80 -f xxx.vcl -p obj_workspace 4096 -s malloc,100M -p thread_pools 4
nobody 31319 19223 1 94673 3532 3 12:33 ?00:00:00 varnishd -a
*:80 -f xxx.vcl -p obj_workspace 4096 -s malloc,100M -p thread_pools 4
That made it work again --as in processing requests--, except for varnishlog
that continues to show nothing.
- Original Message -
From: Fernando Papa fernando.p...@liveinteractive.net
To: varnish-misc@projects.linpro.no
Sent: Tuesday, December 08, 2009 2:01 PM
Subject: only 503s after a few hours (trunk-4384)
Hello,
After a few hours, Varnish trunk-4384 started replying with 503.
New work-load, first day of use. Previously used Varnish only with
internal
traffic (eg no crazy headers, just carefully crafted requests).
Facts:
* Xeon 2.4 GHz (quad-core, 512KBx4 cache) server. CentOS 5.4.
* Worked fine for ~9 hours, sending to backend 150+ K requests plus
Varnish
pings.
* Backend is an application server. Payload is 1 KB plus minimal
headers.
* I can still access backend bypassing Varnish.
* Varnish continues pinging backend, and receiving a timely 200 OK every
time (backend logs).
* req.backend.healthy = true. Tested with a codepath along the lines
(req.request == GET) (req.url == /) (req.backend.healthy), that
does an 'error 200 OK'.
* varnishlog does not output anything at all. Not even the backend pings.
* Tested every path within vcl_recv. Exiting with error works, but
lookup results in a 503.
* Very low CPU usage. Sorry, didn't check memory usage before restarting
it.
CFLAGS='-O2 -fomit-frame-pointer -march=pentium4 -mfpmath=sse'
./configure --prefix=$HOME/local --disable-largefile
$HOME/local/sbin/varnishd -a *:80 -f $HOME/local/etc/varnish/xxx.vcl -p
obj_workspace=4096 -s malloc,100M -p thread_pools=`grep -c 'model name'
/proc/cpuinfo`
backend fd1
{
.host=localhost;
.port=85;
.connect_timeout = 1s;
.first_byte_timeout = 20s;
.between_bytes_timeout =2s;
.probe = {
.request =
GET /xxx HTTP/1.1
Host: yyy
Connection: close;
.timeout = 0.5 s;
.window = 8;
.threshold = 3;
}
}
director fdcluster round-robin
{
{ .backend = fd1; }
}
Restarted Varnish, now it works again.
Any idea? Anything wrong in the command line?
Thanks,
Fernando