keeping varnishstat open will bring down server

Angelo Höngens Tue, 13 Apr 2010 06:10:12 -0700

Hey guys,

I've seen something I'd like to share with you; perhaps it could be seen as a 
bug in varnishstat.


Yesterday I opened ssh sessions to my 4 balancers, to run some scripts, and 
then I opened varnishstat to monitor them. A while later I had to leave in a 
rush and closed my laptop's lid, and in that process killed my vpn tunnel and 
ssh sessions. However, the varnishstat processes (apparently) kept running 
(FreeBSD 7.2 x64).

Just a few hours ago (so around 16 hours later), I had one balancer die on me 
(it became completely unresponsive and refused connections to port 80). I 
immediately restarted varnishd, and I also saw a varnishstat instance eating 
100% CPU, which I killed.

When I looked at the other balancers just now, I saw the varnishstat instance 
there using a lot of CPU as well (only one of the 4 cores, though):


last pid: 77863;  load averages:  1.40,  1.48,  1.47     up 105+00:24:26  14:56:40
166 processes: 2 running, 164 sleeping
CPU: 27.1% user,  0.0% nice,  4.2% system,  1.9% interrupt, 66.8% idle
Mem: 6430M Active, 550M Inact, 709M Wired, 189M Cache, 399M Buf, 32M Free
Swap: 4096M Total, 228M Used, 3868M Free, 5% Inuse

  PID USERNAME  THR PRI NICE   SIZE    RES STATE  C   TIME   WCPU COMMAND
69587 root        1 112    0 95640K  1044K CPU3   3  19.1H 77.20% varnishstat
76211 haproxy     1   4    0 48928K 18944K kqread 1  16:34  3.17% haproxy
68762 www       116  44    0  8756M  6412M select 0   0:01  0.39% varnishd
31203 root        1  44    0   176M  5476K select 2 439:16  0.00% snmpd
69527 root        1   8    0 94312K 83384K nanslp 0  11:59  0.00% varnishncsa
37934 root        1   4    0 66244K  3164K kqread 0   8:46  0.00% squid
 1912 root        1  44    0 10484K   724K select 0   7:50  0.00% ntpd
 2036 root        1  44    0 85732K  3528K select 1   4:12  0.00% httpd
56664 root        1  44    0  5692K   616K select 2   0:51  0.00% syslogd
 2056 root        1   8    0  6748K   392K nanslp 2   0:33  0.00% cron
 2023 root        1   4    0  5808K   428K kqread 0   0:23  0.00% master
 2031 postfix     1   4    0  5808K   408K kqread 0   0:22  0.00% qmgr
76181 www         1   4    0 85732K  3732K kqread 3   0:01  0.00% httpd
76182 www         1  20    0 85732K  3716K lockf  3   0:01  0.00% httpd
76185 www         1  20    0 85732K  3696K lockf  2   0:01  0.00% httpd
76298 www         1  20    0 85732K  3868K lockf  3   0:01  0.00% httpd


So it seems that when varnishstat is left running for a long time, it uses 
more and more resources, and in my case it may even have caused varnishd to 
fail somehow (it could be a coincidence, but I don't think so).

After killing varnishstat, load went back from 1.5 to 0.2, around the usual.
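
In case it helps anyone check their own boxes, here is a rough sketch of how 
one could spot such a runaway process from top output like the above. It is 
only an illustration: the function name find_busy and the 50% threshold are 
my own assumptions, and it assumes the 12-column FreeBSD top layout pasted in 
this mail (PID first, WCPU and COMMAND last).

```python
# Two rows copied from the top output in this mail, used as sample input.
SAMPLE = """\
  PID USERNAME  THR PRI NICE   SIZE    RES STATE  C   TIME   WCPU COMMAND
69587 root        1 112    0 95640K  1044K CPU3   3  19.1H 77.20% varnishstat
76211 haproxy     1   4    0 48928K 18944K kqread 1  16:34  3.17% haproxy
"""


def find_busy(top_text, command="varnishstat", min_wcpu=50.0):
    """Return (pid, wcpu) pairs for rows of `command` at or above min_wcpu.

    Hypothetical helper: assumes each process row has exactly 12
    whitespace-separated columns, as in the listing above.
    """
    hits = []
    for line in top_text.splitlines():
        fields = line.split()
        # Skip the header, summary lines and anything not shaped like a row.
        if len(fields) != 12 or not fields[0].isdigit():
            continue
        wcpu, name = fields[10], fields[11]
        if name != command or not wcpu.endswith("%"):
            continue
        percent = float(wcpu[:-1])
        if percent >= min_wcpu:
            hits.append((int(fields[0]), percent))
    return hits


if __name__ == "__main__":
    for pid, percent in find_busy(SAMPLE):
        print(f"varnishstat pid {pid} is using {percent}% CPU")
```

Feeding it the text of a batch-mode top run from cron would be one way to get 
a warning before a forgotten varnishstat has been spinning for hours; whether 
to kill the process automatically is a separate decision.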

-- 

 
With kind regards,
 
 
Angelo Höngens
 
Systems Administrator
 
------------------------------------------
NetMatch
tourism internet software solutions
 
Ringbaan Oost 2b
5013 CA Tilburg
T: +31 (0)13 5811088
F: +31 (0)13 5821239
 
mailto:[email protected]
http://www.netmatch.nl
------------------------------------------



_______________________________________________
varnish-misc mailing list
[email protected]
http://lists.varnish-cache.org/mailman/listinfo/varnish-misc
