RE: Survey; how do you use Varnish?

2010-02-02 Thread Ross Brown
1) How many servers do you have running Varnish?

8 servers (2 sites x 4 servers), load balanced behind F5 GTM. We aim to be able 
to lose a site AND suffer a hardware failure and keep on truckin'. We could 
probably run on one or two servers at a push, but our backend would most likely 
explode before Varnish broke a sweat.
Each server is a quad core Xeon w/ 16G RAM. We have a fairly large working set.

(+1 Varnish server for Dev / test, which is a VM)

2) What sort of total load are you having? Mbit/s or hits per second are 
preferred metrics.

~900 req/sec at peak per Prod server / 60Mbps

3) What sort of site is it?
  *) Retail = online auctions

4) Do you use ESI?
No.

5) What features are you missing from Varnish?
Varnishlog filtering language (or other enhancements in this area)
Dynamic stats counters
Large dataset performance improvements


-Original Message-
From: varnish-misc-boun...@projects.linpro.no 
[mailto:varnish-misc-boun...@projects.linpro.no] On Behalf Of Martin Boer
Sent: Tuesday, 2 February 2010 10:23 p.m.
To: Per Andreas Buer
Cc: varnish-misc@projects.linpro.no
Subject: Re: Survey; how do you use Varnish?

1) One active server. We have another one as hot standby.

2) 50Mbit, 200 requests/second max. Most of the time it's 10Mbit, 40 
requests/second which isn't much.

3) Internet tour operator.

4) Nope

5) Automatic refreshing of cached data, without end users having to wait 
for the response.

The main reason we use Varnish is that our website makes complex, 
time-consuming queries to backend systems. The answers to these queries 
vary several times per day but are still cacheable. Of course Varnish 
also helps to bring down the load on those backend systems, but the main 
benefit is that Varnish gives end users a lightning-fast prerendered 
interactive experience - which is a paradox. We like working paradoxes.

Something like 'refresh pages after object.prefetch seconds if at least 
someone requested that object in the last object.ttl seconds', where 
object.ttl is larger than object.prefetch. So an object might be 
prefetched a couple of times without anyone being interested, but it 
will be removed from the cache eventually, once object.ttl has expired.
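The policy described above can be sketched as a small model (purely 
illustrative; object.prefetch and object.ttl here are the hypothetical 
per-object fields from the paragraph above, not real Varnish parameters):

```python
def should_prefetch(now, last_fetch, last_request, prefetch, ttl):
    """Refresh in the background if the cached copy is at least `prefetch`
    seconds old AND someone requested it within the last `ttl` seconds."""
    return (now - last_fetch >= prefetch) and (now - last_request < ttl)

def should_evict(now, last_request, ttl):
    """Drop the object once nobody has asked for it for `ttl` seconds."""
    return now - last_request >= ttl
```

With ttl larger than prefetch, an unpopular object stops being refreshed 
once its last request falls out of the ttl window, and is evicted shortly 
after - exactly the trade-off Martin describes.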

Regards,
Martin Boer



Per Andreas Buer wrote:
 Hi list.

 I'm working for Redpill Linpro; you might have heard of us - we're the main 
 sponsor of Varnish development. We're a bit curious about how Varnish is 
 used, which features are used, and what is missing. What does a typical 
 installation look like? Any information you choose to share will be 
 aggregated and then deleted, and I promise I won't use it for any sales 
 activities or harass you in any way. We will publish the results on this 
 list if the feedback is significant. If you have the time and would like to 
 help us, please answer the questions in a direct mail to me. Thanks.

 1) How many servers do you have running Varnish?

 2) What sort of total load are you having? Mbit/s or hits per second are 
 preferred metrics.

 3) What sort of site is it?
  *) Online media
  *) Corporate website (ibm.com or similar)
  *) Retail
  *) Educational
  *) Social website

 4) Do you use ESI?

 5) What features are you missing from Varnish? Max three features, 
 prioritized. Please refer to 
 http://varnish-cache.org/wiki/PostTwoShoppingList for features. 




___
varnish-misc mailing list
varnish-misc@projects.linpro.no
http://projects.linpro.no/mailman/listinfo/varnish-misc


RE: Strategies for splitting load across varnish instances? And avoiding single-point-of-failure?

2010-01-17 Thread Ross Brown
 So it is possible to start your Varnish with one VCL program, and have
 a small script change to another one some minutes later.

What would this small script look like? 

Sorry if it's a dumb question :)




RE: Strategies for splitting load across varnish instances? And avoiding single-point-of-failure?

2010-01-17 Thread Ross Brown
I hadn't used varnishadm before. Looks useful.

Thanks!

-Original Message-
From: p...@critter.freebsd.dk [mailto:p...@critter.freebsd.dk] On Behalf Of 
Poul-Henning Kamp
Sent: Monday, 18 January 2010 9:38 a.m.
To: Ross Brown
Cc: varnish-misc@projects.linpro.no
Subject: Re: Strategies for splitting load across varnish instances? And 
avoiding single-point-of-failure? 

In message 1ff67d7369ed1a45832180c7c1109bca13e23e7...@tmmail0.trademe.local, 
Ross Brown writes:
 So it is possible to start your Varnish with one VCL program, and have
 a small script change to another one some minutes later.

What would this small script look like?

sleep 600
varnishadm vcl.load real_thing /usr/local/etc/varnish/real.vcl
varnishadm vcl.use real_thing

-- 
Poul-Henning Kamp   | UNIX since Zilog Zeus 3.20
p...@freebsd.org | TCP/IP since RFC 956
FreeBSD committer   | BSD since 4.3-tahoe
Never attribute to malice what can adequately be explained by incompetence.


Segfault in libvarnishcompat.so.1.0.0, after upgrading to build 4131

2009-07-06 Thread Ross Brown
After upgrading to trunk (build 4131) last week, we are seeing an issue when 
the object cache (using malloc) becomes full. We are running a server with 16GB 
of RAM with the following startup options:

-s malloc,12G 
-a 0.0.0.0:80 
-T 0.0.0.0:8021 
-f /usr/local/etc/current.vcl 
-t 86400 
-h classic,42013 
-P /var/run/varnish.pid 
-p obj_workspace=4096 
-p sess_workspace=262144 
-p lru_interval=60 
-p sess_timeout=10 
-p shm_workspace=32768 
-p ping_interval=1 
-p thread_pools=4 
-p thread_pool_min=50 
-p thread_pool_max=4000 
-p cli_timeout=20

VCL is pretty basic: we normalise requests and accept only GET and HEAD. 

Plotting usage using Cacti, we see varnishd crash and restart when the object 
cache is full.

Example of an error occurring :
Jul  3 11:04:50 tmcache2 kernel: [68325.150385] varnishd[15155]: segfault at ff 
ip 7f1df03a4d06 sp 7f1dd44b6120 error 4 in 
libvarnishcompat.so.1.0.0[7f1df039e000+e000]
Jul  3 11:04:52 tmcache2 varnishd[2594]: Child (15130) not responding to ping, 
killing it.
Jul  3 11:04:52 tmcache2 varnishd[2594]: Child (15130) not responding to ping, 
killing it.
Jul  3 11:04:52 tmcache2 varnishd[2594]: Child (15130) died signal=11
Jul  3 11:04:52 tmcache2 varnishd[2594]: Child cleanup complete
Jul  3 11:04:52 tmcache2 varnishd[2594]: child (5066) Started
Jul  3 11:04:52 tmcache2 varnishd[2594]: Child (5066) said Closed fds: 3 4 5 8 
9 11 12
Jul  3 11:04:52 tmcache2 varnishd[2594]: Child (5066) said Child starts
Jul  3 11:04:52 tmcache2 varnishd[2594]: Child (5066) said Ready

This bug only occurs in build 4131, prior to this we were using build 4019 and 
didn't have this issue. 

Ross Brown
Trade Me Limited



Varnish hangs / requests time out

2009-03-03 Thread Ross Brown
Hi all

We are hoping to use Varnish for serving image content on our reasonably busy 
auction site here in New Zealand, but are having an interesting problem during 
testing.

We are using latest Varnish (2.0.3) on Ubuntu 8.10 server (64-bit) and have 
built two servers for testing - both are located in the same datacentre and 
situated behind an F5 hardware load balancer. We want to keep all images cached 
in RAM and are using Varnish with jemalloc to achieve this. For the most part, 
Varnish is working well for us and performance is great.

However, we have seen both our Varnish servers lock up at precisely the same 
time and stop processing incoming HTTP requests until Varnishd is manually 
restarted. This has happened twice and seems to occur at random - the last time 
was after 5 days of uptime and a significant amount of processed traffic (1TB).

When this problem happens, the backend is still reachable and happily serving 
images. It is not a particularly busy period for us (600 requests/sec/Varnish 
server - approx 350Mbps outbound each - we got up to nearly 3 times that level 
without incident previously) but for some reason unknown to us, the servers 
just suddenly stop processing requests and worker processes increase 
dramatically. 

After the lockup happened last time, I tried firing up varnishlog and hitting 
the server directly - my requests were not showing up at all. The *only* 
entries in the varnish log were related to worker processes being killed over 
time - no PINGs, PONGs, load balancer healthchecks or anything related to 
'normal' varnish activity. It's as if varnishd has completely locked up, but we 
can't understand what causes both our varnish servers to exhibit this behaviour 
at exactly the same time, nor why varnish does not detect it and attempt a 
restart. After a restart, varnish is fine and behaves itself.

There is nothing to indicate an error with the backend, nor anything in syslog 
to indicate a Varnish problem. Pointers of any kind would be appreciated :)

Best regards

Ross Brown
Trade Me
www.trademe.co.nz

*** Startup Options (as per hints in wiki for caching millions of objects):
-a 0.0.0.0:80 -f /usr/local/etc/default.net.vcl -T 0.0.0.0:8021 -t 86400 -h 
classic,127 -p thread_pool_max=4000 -p thread_pools=4 -p listen_depth=4096 
-p lru_interval=3600 -p obj_workspace=4096 -s malloc,10G

*** Running VCL:
backend default { 
    .host = "10.10.10.10";
    .port = "80";
} 

sub vcl_recv {
    # Don't cache objects requested with a query string in the URI.
    # Needed for newsletter headers (open rate) and health checks.
    if (req.url ~ "\?.*") { 
        pass;
    }

    # Force lookup if the request is a no-cache request from the client.
    if (req.http.Cache-Control ~ "no-cache") {
        unset req.http.Cache-Control;
        lookup;
    }

    # By default, Varnish will not serve requests that come with a cookie 
    # from its cache.
    unset req.http.cookie;
    unset req.http.authenticate;

    # No action here; continue into default vcl_recv{}
}


***Stats
       458887  Client connections accepted
    170714631  Client requests received
    133012763  Cache hits
         3715  Cache hits for pass
     27646213  Cache misses
     37700868  Backend connections success
            0  Backend connections not attempted
            0  Backend connections too many
           40  Backend connections failures
     37512808  Backend connections reuses
     37514682  Backend connections recycles
            0  Backend connections unused
         1339  N struct srcaddr
           16  N active struct srcaddr
          756  N struct sess_mem
           12  N struct sess
       761152  N struct object
       761243  N struct objecthead
            0  N struct smf
            0  N small free smf
            0  N large free smf
          322  N struct vbe_conn
          345  N struct bereq
           20  N worker threads
         2331  N worker threads created
            0  N worker threads not created
            0  N worker threads limited
            0  N queued work requests
        35249  N overflowed work requests
            0  N dropped work requests
            1  N backends
           44  N expired objects
     26886639  N LRU nuked objects
            0  N LRU saved objects
     15847787  N LRU moved objects
            0  N objects on deathrow
            3  HTTP header overflows
            0  Objects sent with sendfile
    164595318  Objects sent with write
            0  Objects overflowing workspace
       458886  Total Sessions
    170715215  Total Requests
          306  Total pipe
     10054413  Total pass
     37700586  Total fetch
  49458782160  Total header bytes
1151144727614  Total body bytes
        89464  Session Closed
            0  Session Pipeline
            0  Session Read Ahead
            0  Session Linger
    170622902  Session herd
   7875546129  SHM records
    380705819  SHM writes
          138  SHM flushes due