We have an environment that serves lots of small image files generated 
dynamically by the backend. The total dataset is about 2TB, but we're not 
looking to cache all of it, just to ease the load on the backend machines. We 
get about 2000-2500 hits/s in total today, and we are running three Apache 
servers with mod_caucho as frontends.

We have installed Varnish on the same servers as the Apache frontends and 
configured each instance to use the local Apache as its backend. The machines 
are dual Opterons with dual cores, so 4 cores per server, with 16GB of RAM, and 
we're running RHEL 4.2.

This is our varnish setup:

user                 varnish (201)
group                varnish (201)
default_ttl          3600 [seconds]
thread_pools         1 [pools]
thread_pool_max      1000 [threads]
thread_pool_min      128 [threads]
thread_pool_timeout  60 [seconds]
overflow_max         100 [%]
rush_exponent        3 [requests per request]
sess_workspace       8192 [bytes]
obj_workspace        8192 [bytes]
sess_timeout         5 [seconds]
pipe_timeout         60 [seconds]
send_timeout         600 [seconds]
auto_restart         on [bool]
fetch_chunksize      128 [kilobytes]
vcl_trace            off [bool]
listen_address       ":80"
listen_depth         1024 [connections]
srcaddr_hash         1049 [buckets]
srcaddr_ttl          30 [seconds]
backend_http11       off [bool]
client_http11        off [bool]
cli_timeout          5 [seconds]
ping_interval        3 [seconds]
lru_interval         3600 [seconds]
cc_command           exec cc -fpic -shared -Wl,-x -o %o %s
max_restarts         4 [restarts]
max_esi_includes     5 [restarts]
cache_vbe_conns      off [bool]
cli_buffer           8192 [bytes]
diag_bitmap          0x0 [bitmap]

This is our startup command:

/opt/varnish/sbin/varnishd -a :80 -p lru_interval 3600 \
    -f /opt/varnish/conf/default.vcl -T 127.0.0.1:6082 -t 3600 \
    -w 128,1000,60 -u varnish -g varnish \
    -s file,/srv/varnish/varnish_storage.bin,30G -P /var/run/varnish.pid

Varnish looks fine until it has served about 1.5 million requests. Then we see 
kswapd0 and kswapd1 start working, the load average rises to about 200, and the 
machine becomes totally unresponsive. top shows a lot of CPU time being spent 
in I/O wait, and the Varnish child process sometimes restarts. In the best case 
the process restarts and the server starts behaving again within 5 minutes, but 
sometimes Varnish dies completely. One thing we have noticed is that the memory 
reserved for Varnish keeps rising, and when it crashes it is usually around 14G.
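
To put numbers on that memory growth, something like the following could 
sample the Varnish child's memory from /proc at intervals. This is only a 
sketch: the helper names parse_status_kb and read_status_kb are made up for 
illustration, and it assumes the usual Linux /proc/<pid>/status layout with 
VmSize and VmRSS lines reported in kB.

```python
import re


def parse_status_kb(status_text):
    """Return {field: kB} for the Vm* lines of a /proc/<pid>/status dump.

    Assumes lines of the form 'VmRSS:   123456 kB'; other lines are ignored.
    """
    fields = {}
    for line in status_text.splitlines():
        match = re.match(r"^(Vm\w+):\s+(\d+)\s+kB$", line.strip())
        if match:
            fields[match.group(1)] = int(match.group(2))
    return fields


def read_status_kb(pid):
    """Read and parse /proc/<pid>/status for a running process (Linux only)."""
    with open("/proc/%d/status" % pid) as handle:
        return parse_status_kb(handle.read())
```

Calling read_status_kb() with the pid of the Varnish child once a minute and 
logging VmSize and VmRSS would show whether the growth is in resident memory 
or only in mapped address space.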

The Varnish storage file is on the same physical disk as the system and the 
swap; could that be the problem? Should Varnish really allocate so much memory 
that the system starts to swap to disk?
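
For reference, here is the arithmetic behind that question, using only the 
figures from this post (a 30G storage file and 16GB of RAM). It is a rough 
sketch that assumes the storage file is memory-mapped, so that its touched 
pages compete with everything else for physical RAM; the function name 
storage_overcommit is made up for illustration.

```python
GIB = 1024 ** 3


def storage_overcommit(storage_bytes, ram_bytes):
    """Return (excess_bytes, ratio) of the storage size relative to RAM."""
    excess = max(0, storage_bytes - ram_bytes)
    return excess, storage_bytes / ram_bytes


# Figures from this post: -s file,...,30G on a machine with 16GB of RAM.
excess, ratio = storage_overcommit(30 * GIB, 16 * GIB)
print("storage exceeds RAM by %d GiB (%.3fx RAM)" % (excess // GIB, ratio))
```

With these numbers, a completely filled storage file would be 14 GiB larger 
than physical memory, before counting Apache and the rest of the system.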

Any suggestions or comments are welcome.

Regards
Calle Korjus