RE: Survey; how do you use Varnish?
1) How many servers do you have running Varnish?

8 servers (2 sites x 4 servers), load balanced behind F5 GTM. We aim to be able to lose a site AND suffer a hardware failure and keep on truckin'. We could probably run on one or two servers at a push, but our backend would most likely explode before Varnish broke a sweat. Each server is a quad core Xeon w/ 16G RAM. We have a fairly large working set. (Plus one Varnish server for dev/test, which is a VM.)

2) What sort of total load are you having? Mbit/s or hits per second are preferred metrics.

~900 req/sec at peak per prod server / 60 Mbps.

3) What sort of site is it?

*) Retail = online auctions

4) Do you use ESI?

No.

5) What features are you missing from Varnish?

- Varnishlog filtering language (or other enhancements in this area)
- Dynamic stats counters
- Large dataset performance improvements

-----Original Message-----
From: varnish-misc-boun...@projects.linpro.no [mailto:varnish-misc-boun...@projects.linpro.no] On Behalf Of Martin Boer
Sent: Tuesday, 2 February 2010 10:23 p.m.
To: Per Andreas Buer
Cc: varnish-misc@projects.linpro.no
Subject: Re: Survey; how do you use Varnish?

1) One active server. We have another one as hot standby.

2) 50 Mbit, 200 requests/second max. Most of the time it's 10 Mbit, 40 requests/second, which isn't much.

3) Internet tour operator.

4) Nope.

5) Automatic refreshing of data, without the end users having to wait for the response. The main reason we use Varnish is that our website makes complex, time-consuming queries to backend systems. The answers to these queries do vary several times per day but are still cacheable. Of course Varnish also helps to bring down the load on those backend systems, but the main benefit is that Varnish gives the end users a lightning-fast, prerendered, interactive experience, which is a paradox. We like working paradoxes.
Something like "refresh pages after object.prefetch seconds if at least someone requested that object in the last object.ttl seconds", where object.ttl is larger than object.prefetch. So an object might be prefetched a couple of times without anyone being interested, but will eventually be removed from the cache after object.ttl has expired.

Regards,
Martin Boer

Per Andreas Buer wrote:

> Hi list. I'm working for Redpill Linpro; you might have heard of us - we're the main sponsor of Varnish development. We're a bit curious about how Varnish is used, what features are used and what is missing. What does a typical installation look like?
>
> The information you choose to reveal to me will be aggregated and then deleted, and I promise I won't use it for any sales activities or harass you in any way. We will publish the result on this list if the feedback is significant. If you have the time and would like to help us, please take some time and answer the questions in a direct mail to me. Thanks.
>
> 1) How many servers do you have running Varnish?
> 2) What sort of total load are you having? Mbit/s or hits per second are preferred metrics.
> 3) What sort of site is it?
>    *) Online media
>    *) Corporate website (ibm.com or similar)
>    *) Retail
>    *) Educational
>    *) Social website
> 4) Do you use ESI?
> 5) What features are you missing from Varnish? Max three features, prioritized. Please refer to http://varnish-cache.org/wiki/PostTwoShoppingList for features.

_______________________________________________
varnish-misc mailing list
varnish-misc@projects.linpro.no
http://projects.linpro.no/mailman/listinfo/varnish-misc
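The prefetch behaviour Martin wishes for is not a Varnish feature, but it can be roughly approximated from outside with a cron-driven warmer that re-requests expensive URLs before their TTL runs out. A minimal sketch, assuming a vcl_recv that forces a refresh on Cache-Control: no-cache (as in the VCL quoted elsewhere in this thread); the URL paths and helper name are made up for illustration:

```shell
#!/bin/sh
# Hypothetical cache-warmer sketch, not a Varnish feature: periodically
# re-request expensive pages so end users always get a cached copy.
# The no-cache header only forces a refresh if vcl_recv is written to
# honour it; the paths below are made-up examples.

# Build the curl command for one URL path (echoed so it can be
# inspected, or piped to sh from a cron job to actually run).
prefetch_cmd() {
    echo "curl -s -o /dev/null -H 'Cache-Control: no-cache' 'http://127.0.0.1$1'"
}

# Example: refresh two expensive pages.
prefetch_cmd /search/expensive-query
prefetch_cmd /frontpage
```

Run from cron at an interval shorter than the objects' TTL; anything no longer worth warming is simply dropped from the list and falls out of the cache when its TTL expires, which is close to the behaviour Martin describes.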
RE: Strategies for splitting load across varnish instances? And avoiding single-point-of-failure?
So it is possible to start your Varnish with one VCL program, and have a small script change to another one some minutes later. What would this small script look like? Sorry if it's a dumb question :)
RE: Strategies for splitting load across varnish instances? And avoiding single-point-of-failure?
I hadn't used varnishadm before. Looks useful. Thanks!

-----Original Message-----
From: p...@critter.freebsd.dk [mailto:p...@critter.freebsd.dk] On Behalf Of Poul-Henning Kamp
Sent: Monday, 18 January 2010 9:38 a.m.
To: Ross Brown
Cc: varnish-misc@projects.linpro.no
Subject: Re: Strategies for splitting load across varnish instances? And avoiding single-point-of-failure?

In message 1ff67d7369ed1a45832180c7c1109bca13e23e7...@tmmail0.trademe.local, Ross Brown writes:

> So it is possible to start your Varnish with one VCL program, and have a small script change to another one some minutes later. What would this small script look like?

sleep 600
varnishadm vcl.load real_thing /usr/local/etc/varnish/real.vcl
varnishadm vcl.use real_thing

-- 
Poul-Henning Kamp   | UNIX since Zilog Zeus 3.20
p...@freebsd.org    | TCP/IP since RFC 956
FreeBSD committer   | BSD since 4.3-tahoe
Never attribute to malice what can adequately be explained by incompetence.
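Expanded slightly, PHK's two-liner could look like this as a standalone script. The names and paths are the ones from his example; the -T address is an assumption that must match whatever varnishd was actually started with, and the DRY_RUN echoing is added here purely so the commands can be inspected without a running daemon:

```shell
#!/bin/sh
# Sketch of the VCL-switching script from PHK's reply. Assumes varnishd
# was started with -T localhost:8021 (management interface).
ADMIN="localhost:8021"
NAME="real_thing"
FILE="/usr/local/etc/varnish/real.vcl"

# With DRY_RUN=1 (the default here) the commands are only printed.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "varnishadm -T $ADMIN $*"
    else
        varnishadm -T "$ADMIN" "$@"
    fi
}

# sleep 600                    # serve the warm-up VCL for ten minutes first
run vcl.load "$NAME" "$FILE"   # compile and load the new VCL
run vcl.use "$NAME"            # make it the active configuration
```

Set DRY_RUN=0 to actually talk to the daemon; vcl.load will refuse the switch if the new VCL does not compile, so the old configuration stays active on error.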
Segfault in libvarnishcompat.so.1.0.0, after upgrading to build 4131
After upgrading to trunk (build 4131) last week, we are seeing an issue when the object cache (using malloc) becomes full. We are running a server with 16GB of RAM with the following startup options:

-s malloc,12G -a 0.0.0.0:80 -T 0.0.0.0:8021 -f /usr/local/etc/current.vcl -t 86400 -h classic,42013 -P /var/run/varnish.pid -p obj_workspace=4096 -p sess_workspace=262144 -p lru_interval=60 -p sess_timeout=10 -p shm_workspace=32768 -p ping_interval=1 -p thread_pools=4 -p thread_pool_min=50 -p thread_pool_max=4000 -p cli_timeout=20

The VCL is pretty basic; we normalise and only accept GET and HEAD requests. Plotting usage using Cacti, we see varnishd crash and restart when the object cache is full. Example of an error occurring:

Jul 3 11:04:50 tmcache2 kernel: [68325.150385] varnishd[15155]: segfault at ff ip 7f1df03a4d06 sp 7f1dd44b6120 error 4 in libvarnishcompat.so.1.0.0[7f1df039e000+e000]
Jul 3 11:04:52 tmcache2 varnishd[2594]: Child (15130) not responding to ping, killing it.
Jul 3 11:04:52 tmcache2 varnishd[2594]: Child (15130) not responding to ping, killing it.
Jul 3 11:04:52 tmcache2 varnishd[2594]: Child (15130) died signal=11
Jul 3 11:04:52 tmcache2 varnishd[2594]: Child cleanup complete
Jul 3 11:04:52 tmcache2 varnishd[2594]: child (5066) Started
Jul 3 11:04:52 tmcache2 varnishd[2594]: Child (5066) said Closed fds: 3 4 5 8 9 11 12
Jul 3 11:04:52 tmcache2 varnishd[2594]: Child (5066) said Child starts
Jul 3 11:04:52 tmcache2 varnishd[2594]: Child (5066) said Ready

This bug only occurs in build 4131; prior to this we were using build 4019 and didn't have this issue.

Ross Brown
Trade Me Limited
Varnish hangs / requests time out
Hi all,

We are hoping to use Varnish for serving image content on our reasonably busy auction site here in New Zealand, but are having an interesting problem during testing.

We are using the latest Varnish (2.0.3) on Ubuntu 8.10 server (64-bit) and have built two servers for testing. Both are located in the same datacentre and sit behind an F5 hardware load balancer. We want to keep all images cached in RAM and are using Varnish with jemalloc to achieve this.

For the most part, Varnish is working well for us and performance is great. However, we have seen both our Varnish servers lock up at precisely the same time and stop processing incoming HTTP requests until varnishd is manually restarted. This has happened twice and seems to occur at random; the last time was after 5 days of uptime and a significant amount of processed traffic (1TB). When this problem happens, the backend is still reachable and happily serving images. It is not a particularly busy period for us (600 requests/sec per Varnish server, approx 350 Mbps outbound each; we got up to nearly 3 times that level without incident previously), but for some reason unknown to us the servers just suddenly stop processing requests and the number of worker processes increases dramatically.

After the lockup happened last time, I tried firing up varnishlog and hitting the server directly; my requests were not showing up at all. The *only* entries in the varnish log were related to worker processes being killed over time: no PINGs, PONGs, load balancer health checks or anything related to "normal" varnish activity. It's as if varnishd has completely locked up, but we can't understand what causes both our varnish servers to exhibit this behaviour at exactly the same time, nor why varnish does not detect it and attempt a restart. After a restart, varnish is fine and behaves itself. There is nothing to indicate an error with the backend, nor anything in syslog to indicate a Varnish problem.
Pointers of any kind would be appreciated :)

Best regards,
Ross Brown
Trade Me
www.trademe.co.nz

*** Startup options (as per hints in the wiki for caching millions of objects):

-a 0.0.0.0:80 -f /usr/local/etc/default.net.vcl -T 0.0.0.0:8021 -t 86400 -h classic,127 -p thread_pool_max=4000 -p thread_pools=4 -p listen_depth=4096 -p lru_interval=3600 -p obj_workspace=4096 -s malloc,10G

*** Running VCL (string quoting restored here; the archive stripped the double quotes):

backend default {
    .host = "10.10.10.10";
    .port = "80";
}

sub vcl_recv {
    # Don't cache objects requested with a query string in the URI.
    # Needed for newsletter headers (open rate) and health checks.
    if (req.url ~ "\?.*") {
        pass;
    }

    # Force lookup if the request is a no-cache request from the client.
    if (req.http.Cache-Control ~ "no-cache") {
        unset req.http.Cache-Control;
        lookup;
    }

    # By default, Varnish will not serve requests that come with a
    # cookie from its cache.
    unset req.http.cookie;
    unset req.http.authenticate;

    # No action here; continue into default vcl_recv{}
}

*** Stats:

458887 Client connections accepted
170714631 Client requests received
133012763 Cache hits
3715 Cache hits for pass
27646213 Cache misses
37700868 Backend connections success
0 Backend connections not attempted
0 Backend connections too many
40 Backend connections failures
37512808 Backend connections reuses
37514682 Backend connections recycles
0 Backend connections unused
1339 N struct srcaddr
16 N active struct srcaddr
756 N struct sess_mem
12 N struct sess
761152 N struct object
761243 N struct objecthead
0 N struct smf
0 N small free smf
0 N large free smf
322 N struct vbe_conn
345 N struct bereq
20 N worker threads
2331 N worker threads created
0 N worker threads not created
0 N worker threads limited
0 N queued work requests
35249 N overflowed work requests
0 N dropped work requests
1 N backends
44 N expired objects
26886639 N LRU nuked objects
0 N LRU saved objects
15847787 N LRU moved objects
0 N objects on deathrow
3 HTTP header overflows
0 Objects sent with sendfile
164595318 Objects sent with write
0 Objects overflowing workspace
458886 Total Sessions
170715215 Total Requests
306 Total pipe
10054413 Total pass
37700586 Total fetch
49458782160 Total header bytes
1151144727614 Total body bytes
89464 Session Closed
0 Session Pipeline
0 Session Read Ahead
0 Session Linger
170622902 Session herd
7875546129 SHM records
380705819 SHM writes
138 SHM flushes due