I too have encountered varnish hanging many times .

On Wed, Mar 4, 2009 at 10:46 AM, Ross Brown <[email protected]> wrote:
> Hi all
>
> We are hoping to use Varnish for serving image content on our reasonably busy 
> auction site here in New Zealand, but are having an interesting problem 
> during testing.
>
> We are using latest Varnish (2.0.3) on Ubuntu 8.10 server (64-bit) and have 
> built two servers for testing - both are located in the same datacentre and 
> situated behind an F5 hardware load balancer. We want to keep all images 
> cached in RAM and are using Varnish with jemalloc to achieve this. For the 
> most part, Varnish is working well for us and performance is great.
>
> However, we have seen both our Varnish servers lock up at precisely the same 
> time and stop processing incoming HTTP requests until Varnishd is manually 
> restarted. This has happened twice and seems to occur at random - the last 
> time was after 5 days of uptime and a significant amount of processed traffic 
> (<1TB).
>
> When this problem happens, the backend is still reachable and happily serving 
> images. It is not a particularly busy period for us (600 requests/sec/Varnish 
> server - approx 350Mbps outbound each - we got up to nearly 3 times that 
> level without incident previously) but for some reason unknown to us, the 
> servers just suddenly stop processing requests and worker processes increase 
> dramatically.
>
> After the lockup happened last time, I tried firing up varnishlog and hitting 
> the server directly - my requests were not showing up at all. The *only* 
> entries in the varnish log were related to worker processes being killed over 
> time - no PINGs, PONGs, load balancer healthchecks or anything related to 
> 'normal' varnish activity. It's as if varnishd has completely locked up, but 
> we can't understand what causes both our varnish servers to exhibit this 
> behaviour at exactly the same time, nor why varnish does not detect it and 
> attempt a restart. After a restart, varnish is fine and behaves itself.
>
> There is nothing to indicate an error with the backend, nor anything in 
> syslog to indicate a Varnish problem. Pointers of any kind would be 
> appreciated :)
>
> Best regards
>
> Ross Brown
> Trade Me
> www.trademe.co.nz
>
> *** Startup Options (as per hints in wiki for caching millions of objects):
> -a 0.0.0.0:80 -f /usr/local/etc/default.net.vcl -T 0.0.0.0:8021 -t 86400 -h 
> classic,1200007 -p thread_pool_max=4000 -p thread_pools=4 -p 
> listen_depth=4096 -p lru_interval=3600 -p obj_workspace=4096 -s malloc,10G
>
> *** Running VCL:
> backend default {
>        .host = "10.10.10.10";
>        .port = "80";
> }
>
> sub vcl_recv {
>        # Don't cache objects requested with query string in URI.
>        # Needed for newsletter headers (openrate) and health checks.
>        if (req.url ~ "\?.*") {
>                pass;
>        }
>
>        # Force lookup if the request is a no-cache request from the client.
>        if (req.http.Cache-Control ~ "no-cache") {
>                unset req.http.Cache-Control;
>                lookup;
>        }
>
>        # By default, Varnish will not serve requests that come with a cookie 
> from its cache.
>        unset req.http.cookie;
>        unset req.http.authenticate;
>
>        # No action here, continue into default vcl_recv{}
> }
>
>
> ***Stats
>      458887  Client connections accepted
>   170714631  Client requests received
>   133012763  Cache hits
>        3715  Cache hits for pass
>    27646213  Cache misses
>    37700868  Backend connections success
>           0  Backend connections not attempted
>           0  Backend connections too many
>          40  Backend connections failures
>    37512808  Backend connections reuses
>    37514682  Backend connections recycles
>           0  Backend connections unused
>        1339  N struct srcaddr
>          16  N active struct srcaddr
>         756  N struct sess_mem
>          12  N struct sess
>      761152  N struct object
>      761243  N struct objecthead
>           0  N struct smf
>           0  N small free smf
>           0  N large free smf
>         322  N struct vbe_conn
>         345  N struct bereq
>          20  N worker threads
>        2331  N worker threads created
>           0  N worker threads not created
>           0  N worker threads limited
>           0  N queued work requests
>       35249  N overflowed work requests
>           0  N dropped work requests
>           1  N backends
>          44  N expired objects
>    26886639  N LRU nuked objects
>           0  N LRU saved objects
>    15847787  N LRU moved objects
>           0  N objects on deathrow
>           3  HTTP header overflows
>           0  Objects sent with sendfile
>   164595318  Objects sent with write
>           0  Objects overflowing workspace
>      458886  Total Sessions
>   170715215  Total Requests
>         306  Total pipe
>    10054413  Total pass
>    37700586  Total fetch
>  49458782160  Total header bytes
> 1151144727614  Total body bytes
>       89464  Session Closed
>           0  Session Pipeline
>           0  Session Read Ahead
>           0  Session Linger
>   170622902  Session herd
>  7875546129  SHM records
>   380705819  SHM writes
>         138  SHM flushes due to overflow
>      763205  SHM MTX contention
>        2889  SHM cycles through buffer
>           0  allocator requests
>           0  outstanding allocations
>           0  bytes allocated
>           0  bytes free
>   101839895  SMA allocator requests
>     1519005  SMA outstanding allocations
>  10736616112  SMA outstanding bytes
> 562900737623  SMA bytes allocated
> 552164121511  SMA bytes free
>          56  SMS allocator requests
>           0  SMS outstanding allocations
>           0  SMS outstanding bytes
>       25712  SMS bytes allocated
>       25712  SMS bytes freed
>    37700490  Backend requests made
>           3  N vcl total
>           3  N vcl available
>           0  N vcl discarded
>           1  N total active purges
>           1  N new purges added
>           0  N old purges deleted
>           0  N objects tested
>           0  N regexps tested against
>           0  N duplicate purges removed
>           0  HCB Lookups without lock
>           0  HCB Lookups with lock
>           0  HCB Inserts
>           0  Objects ESI parsed (unlock)
>           0  ESI parse errors (unlock)
>
>
>
> _______________________________________________
> varnish-misc mailing list
> [email protected]
> http://projects.linpro.no/mailman/listinfo/varnish-misc
>
_______________________________________________
varnish-misc mailing list
[email protected]
http://projects.linpro.no/mailman/listinfo/varnish-misc

Reply via email to