Re: varnish crashes

2010-01-25 Thread Angelo Höngens
On 24-1-2010 20:31, Michael S. Fischer wrote:

 The other most common reason why the varnish supervisor can start
 killing off children is when they are blocked waiting on a page-in,
 which is usually due to VM overcommit (i.e., storage file size
 significantly exceeds RAM and you have a very large hot working set).  
 You can usually see that when iostat -x output shows the I/O busy % is
 close to 100, meaning the disk is saturated.   You can also see that in
 vmstat (look at the pi/po columns if you're using a file, or si/so if
 you're using malloc).


Well, my balancers have 8 GB of RAM, and were using a 350 GB backend file..

I saw that disk I/O was really high; 174% busy is kinda busy :)

extended device statistics
device     r/s   w/s    kr/s    kw/s wait svc_t  %b
ad4      100.3  85.2  4101.1  1355.9    2  37.5 174
ad6      102.5  85.2  4092.7  1355.9    3  29.1 150

So thanks for that! I now set the backend to an 8GB file, and I hope
that will be better..

Do you have any recommendations, except buying faster disks?

With Squid I was used to filling up the 300GB disks (we also serve large
images), but I guess Varnish does not work that way..
-- 


With kind regards,


Angelo Höngens
systems administrator

MCSE on Windows 2003
MCSE on Windows 2000
MS Small Business Specialist
--
NetMatch
tourism internet software solutions

Ringbaan Oost 2b
5013 CA Tilburg
+31 (0)13 5811088
+31 (0)13 5821239

a.hong...@netmatch.nl
www.netmatch.nl
--


___
varnish-misc mailing list
varnish-misc@projects.linpro.no
http://projects.linpro.no/mailman/listinfo/varnish-misc


Re: varnish crashes

2010-01-25 Thread Poul-Henning Kamp
In message 4b5d70b5.5080...@netmatch.nl, Angelo Höngens writes:

 How are your disks configured ?

2 cheap SATA disks in a gmirror (it's a simple Dell R300).

Hmm, that's going to hurt obviously...

You would probably have been better off, not mirroring and giving
Varnish a -sfile for each disk.
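
A minimal sketch of what one storage file per disk might look like on the command line, assuming the -s option is simply repeated once per disk. The mount points /cache1 and /cache2, the file name, and the 4G size are hypothetical placeholders, not values from this thread.

```python
# Build the "-s file,<path>,<size>" arguments for one storage file per disk.
# Mount points, file name, and size are hypothetical placeholders.

def storage_args(mount_points, size, filename="varnish_storage.bin"):
    """Return varnishd arguments declaring one file storage per mount point."""
    args = []
    for mount in mount_points:
        path = f"{mount.rstrip('/')}/{filename}"
        args += ["-s", f"file,{path},{size}"]
    return args


args = storage_args(["/cache1", "/cache2"], "4G")
print(" ".join(["varnishd"] + args))
```

The remaining varnishd options (listen address, VCL file and so on) would stay as they are in the existing setup.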

-- 
Poul-Henning Kamp   | UNIX since Zilog Zeus 3.20
p...@freebsd.org | TCP/IP since RFC 956
FreeBSD committer   | BSD since 4.3-tahoe
Never attribute to malice what can adequately be explained by incompetence.


Re: varnish crashes

2010-01-25 Thread Angelo Höngens
On 25-1-2010 11:24, Poul-Henning Kamp wrote:
 In message 4b5d70b5.5080...@netmatch.nl, Angelo Höngens writes:
 
 How are your disks configured ?

 2 cheap SATA disks in a gmirror (it's a simple Dell R300).
 
 Hmm, that's going to hurt obviously...
 
 You would probably have been better off, not mirroring and giving
 Varnish a -sfile for each disk.

I'll take it into consideration, but first I'm going to run with the
current configuration for a while to make sure varnish keeps responding.
The disks are now 1-3% busy, and everything seems to run nice..




Re: varnish crashes

2010-01-24 Thread Angelo Höngens
On 23-1-2010 20:57, Michael Fischer wrote:
 On Sat, Jan 23, 2010 at 2:20 AM, Angelo Höngens a.hong...@netmatch.nl wrote:
 
 
 (second try, I found out I was subscribed using a wrong email address)
 
 Hey,
 
 I am having some problems with Varnish. Unfortunately (depending on how
 you look at it), I had to replace our Squid cluster with Varnish in a
 day. Now we are finding out that we have some issues with it:
 sometimes Varnish just stops working.
 
 We have 4 balancers, each running FreeBSD 7.2 with 'device carp'
 compiled in. I haven't dared upgrade to 8.0 yet, because I had problems
 on my test machine earlier with IPv6 and carp interfaces on 8.0.
 
 [ang...@nmt-nlb-06 ~]$ uname -a
 FreeBSD nmt-nlb-06.netmatchcolo1.local 7.2-RELEASE FreeBSD 7.2-RELEASE
 #0: Mon Jun 15 19:25:03 CEST 2009
 r...@nmt-nlb-06.netmatchcolo1.local:/usr/obj/usr/src/sys/NMT-NLB-06
  amd64
 
 Here's an example of a varnishd crashing, this is in /var/log/messages:
 
 Jan 23 09:49:39 nmt-nlb-06 varnishd[47478]: Child (47479) not responding to ping, killing it.
 Jan 23 10:49:43 nmt-nlb-06 kernel: pid 47479 (varnishd), uid 80: exited on signal 3
 Jan 23 09:49:43 nmt-nlb-06 varnishd[47478]: Child (47479) not responding to ping, killing it.
 Jan 23 09:49:43 nmt-nlb-06 varnishd[47478]: Child (47479) not responding to ping, killing it.
 Jan 23 09:49:43 nmt-nlb-06 varnishd[47478]: child (54810) Started
 Jan 23 09:49:48 nmt-nlb-06 varnishd[47478]: Pushing vcls failed: CLI communication error
 Jan 23 09:49:48 nmt-nlb-06 varnishd[47478]: Child (54810) said Closed fds: 4 5 6 7 11 12 14 15
 Jan 23 09:49:48 nmt-nlb-06 varnishd[47478]: Child (54810) said Child starts
 Jan 23 09:51:15 nmt-nlb-06 varnishd[47478]: Child (54810) said managed to mmap 2319266349056 bytes of 2319266349056
 Jan 23 09:51:15 nmt-nlb-06 varnishd[47478]: Child (54810) said Ready
 
 Does anyone know what could cause this?
 
 
 What is thread_pool_max set to?  Have you tried lowering it?   We have
 found that on systems with very high cache-hit ratios, 16 threads per
 CPU is the sweet spot to avoid context-switch saturation.

[ang...@nmt-nlb-03 ~]$ varnishadm -T localhost:81 param.show | grep thread_pool

thread_pool_add_delay      20 [milliseconds]
thread_pool_add_threshold  2 [requests]
thread_pool_fail_delay     200 [milliseconds]
thread_pool_max            500 [threads]
thread_pool_min            5 [threads]
thread_pool_purge_delay    1000 [milliseconds]
thread_pool_timeout        300 [seconds]
thread_pools               2 [pools]

Thread_pool_max is set to 500 threads.. But I just increased it to 4000
(as per http://varnish.projects.linpro.no/wiki/Performance), as 'top'
shows me it's using around 480~490 threads now..

You suggest lowering it, what would be the effect of that? I would think
it would run out of threads or something? Well, we'll see what happens
with the increased threads..

I've also just increased thread_pools from 2 to 4.. (4 cores).



Re: varnish crashes

2010-01-24 Thread Michael S. Fischer
On Jan 24, 2010, at 7:23 AM, Angelo Höngens wrote:
 What is thread_pool_max set to?  Have you tried lowering it?   We have
 found that on systems with very high cache-hit ratios, 16 threads per
 CPU is the sweet spot to avoid context-switch saturation.
 
 [ang...@nmt-nlb-03 ~]$ varnishadm -T localhost:81 param.show | grep thread_pool
 
 thread_pool_add_delay      20 [milliseconds]
 thread_pool_add_threshold  2 [requests]
 thread_pool_fail_delay     200 [milliseconds]
 thread_pool_max            500 [threads]
 thread_pool_min            5 [threads]
 thread_pool_purge_delay    1000 [milliseconds]
 thread_pool_timeout        300 [seconds]
 thread_pools               2 [pools]
 
 Thread_pool_max is set to 500 threads.. But I just increased it to 4000
 (as per http://varnish.projects.linpro.no/wiki/Performance), as 'top'
 shows me it's using around 480~490 threads now..
 
 You suggest lowering it, what would be the effect of that? I would think
 it would run out of threads or something? Well, we'll see what happens
 with the increased threads..


Increasing concurrency is unlikely to solve the problem, although setting the 
number of thread pools to the number of CPUs is probably a good idea.

Assuming a high hit ratio and high CPU utilization (you haven't posted either), 
lowering concurrency (i.e. reducing thread_pool_max) can help reduce CPU 
contention incurred by context switching.  

If maximum concurrency is reached, incoming connections will be deferred to the 
TCP listen(2) backlog (the overflowed_requests counter in varnishstat increases 
when this happens).   When the request reaches the head of the queue, it will 
then be picked up by a processing thread.  The net effect is some additional 
latency, but probably not as much as you're experiencing if your CPU is swamped 
with context switches.

There are a few cases where increasing thread_pool_max can help, in particular, 
where you have a high cache-miss ratio and you have slow origin servers.  But 
if CPU is already high, it will only make the problem worse.

BTW, on FreeBSD you can view the current length of the listen(2) backlog via 
netstat -aL. By default, varnishd's listen(2) backlog is 512; as long as you 
don't see the length hit that value you should be OK.
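
As a rough illustration of the sizing arithmetic above, here is a sketch that applies the 16-threads-per-CPU rule of thumb and the one-pool-per-CPU suggestion to a 4-core box. The function name is made up for this example, and whether thread_pool_max counts threads per pool or in total depends on the Varnish version, so check the param.show description before applying any number.

```python
THREADS_PER_CPU = 16  # rule of thumb from this thread, for high hit ratios


def suggest_thread_params(cpus, threads_per_cpu=THREADS_PER_CPU):
    """Return a thread pool count and an overall worker thread budget."""
    return {
        "thread_pools": cpus,                     # one pool per CPU
        "total_threads": cpus * threads_per_cpu,  # budget across all pools
    }


params = suggest_thread_params(4)  # the balancers in this thread have 4 cores
print(params)
```

This only makes sense for the high-hit-ratio, CPU-bound case; with slow origin servers and many misses, a larger budget may be the better choice.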

--Michael



Re: varnish crashes

2010-01-24 Thread Michael S. Fischer
On Jan 24, 2010, at 10:40 AM, Angelo Höngens wrote:
 
 According to top, the CPU usage for the varnishd process is 0.0% at 400
 req/sec. The load over the past 15 minutes is 0.45, probably mostly
 because of haproxy running on the same machine. So I don't think load is
 a problem.. My problem is that varnish sometimes just crashes or stops
 responding.
 
 My cache hit ratio is not that high, around 80%, and the backend servers
 can be slow at times (quite complex .NET web apps). But I've changed
 some settings, and I am waiting for the next time Varnish stops
 responding. I'm beginning to think it's something that grows over time:
 after restarting the varnish process, things tend to run smoothly for a
 while. I'll just keep monitoring it.

The other most common reason why the varnish supervisor can start killing off 
children is when they are blocked waiting on a page-in, which is usually due to 
VM overcommit (i.e., storage file size significantly exceeds RAM and you have a 
very large hot working set).   You can usually see that when iostat -x output 
shows the I/O busy % is close to 100, meaning the disk is saturated.   You can 
also see that in vmstat (look at the pi/po columns if you're using a file, or 
si/so if you're using malloc).
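
A small sketch of the iostat check described above, assuming FreeBSD-style iostat -x output where %b is the last column. The sample rows are the figures posted elsewhere in this thread (with the wait column split out as a best guess); the 90% threshold and the function name are assumptions for illustration.

```python
# Flag saturated disks in FreeBSD "iostat -x" output (%b is the last column).
SAMPLE = """\
device     r/s   w/s    kr/s    kw/s wait svc_t  %b
ad4      100.3  85.2  4101.1  1355.9    2  37.5 174
ad6      102.5  85.2  4092.7  1355.9    3  29.1 150
"""


def saturated_disks(iostat_text, threshold=90.0):
    """Return {device: busy_percent} for devices at or above the threshold."""
    busy = {}
    for line in iostat_text.splitlines():
        fields = line.split()
        if len(fields) < 2 or fields[0] == "device":
            continue  # skip blank lines and the column header
        try:
            percent = float(fields[-1])
        except ValueError:
            continue  # not a data row (e.g. the "extended device statistics" banner)
        if percent >= threshold:
            busy[fields[0]] = percent
    return busy


hot = saturated_disks(SAMPLE)
print(hot)

# Overcommit: storage file size relative to RAM (350 GB file on an 8 GB box).
overcommit_ratio = 350 / 8
print(overcommit_ratio)
```

A ratio far above 1 combined with disks pinned near 100% busy matches the page-in stall pattern described above.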

--Michael



Re: varnish crashes

2010-01-23 Thread Poul-Henning Kamp
In message 4b5ad8b0.6090...@netmatch.nl, Angelo Höngens writes:

By the way: the balancers do a total of 2000 req/sec now, but when
stress testing I can easily get 9000 cache hits per second. So I don't
think it's hitting the upper limits of its performance.

At that level of load, make sure to kldload the http accept filter.


Your varnish-stat looks pretty OK.

Have you configured health-polling of all those backends ?




Re: Varnish crashes everyday and same time

2007-11-22 Thread Janis Putrams
You need to type "start" at the CLI prompt after starting varnishd in debug mode; with -d, the child process is not started until you do.

janis
a satisfied Varnishd user :)

On Thursday 22 November 2007 12:18, Erik wrote:
 Hi again,

 I logged the memory usage and it looks fine. I also turned off VCL
 trace, but that didn't solve it. The crash happened again today, but a few
 hours later than usual. I tried to start varnishd in debug mode but I can't
 get it to work. When I set it to -d or -d -d it starts, but no connection
 can be made against it. Any ideas?

 I forgot to mention that I'm running Varnish on Virtual Server 2005 with
 512 MB RAM (150 MB free) and 10 GB of disk space.

 / Erik



RE: Re: Varnish crashes everyday and same time

2007-11-22 Thread Erik
Hi,

This is what I do and what I get:
/etc/init.d/varnish start
Starting Varnish: varnishUsing old SHMFILE
rolling(2)...

It seems to me that varnish is running? But when
I try to connect, it doesn't work! Although when I run it
without -d -d or -d, it works!

I would really like to submit some log data from varnishd,
but since I can't get debug mode to work it has to wait :(

/ Erik



Varnish crashes everyday and same time

2007-11-21 Thread Erik
Hi,

Varnish crashes every day at the same time. This is what I got from the log files:

   12 SessionOpen  c xx.xx.xx.xx 31989
    0 Debug        Acceptor is epoll
    0 Error        CLI read 0 (errno=32)

I also found this thread in the mailing list archive from July:
http://projects.linpro.no/pipermail/varnish-misc/2007-July/000670.html

That is the last post on that subject; no answer was posted to Anup Shukla. I 
don't know if it is the same problem, but one of the posts shows the same error.

I'm running Varnish compiled from source on Debian 4.0 Etch.

I'm going to start logging memory usage to see if that is the problem.

/ Erik
