Varnish User Group meeting 2 [VUG2] 29+30/03/2010
Hi all,

On March 29th and 30th the second Varnish User Group meeting will be held at the Bay / Marktplaats.nl offices in Amsterdam. See http://varnish-cache.org/wiki/VUG2 for all information and how to sign up. We have a maximum of 15 open seats, so please sign up only if you will actually attend. Be quick about it, though: the seats might be taken fast.

Regards,

Marco

--
Terantula - Industrial Strength Open Source
phone: +31 64 3232 400 / www: http://www.terantula.com / pgp key: E7EE7A46
pgp fingerprint: F2EE 122D 964C DE68 7380 6F95 3710 7719 E7EE 7A46

_______________________________________________
varnish-misc mailing list
varnish-misc@projects.linpro.no
http://projects.linpro.no/mailman/listinfo/varnish-misc
Re: Varnish restarts when all memory is allocated
On Tue, May 26, 2009 at 11:29:08PM +0200, Marco Walraven wrote:

> Hi,
>
> We are testing Varnish in our production environment with a 500 GB storage
> file and 32 GB of RAM. Performance is excellent as long as not all of the
> 32 GB is allocated yet. The rates I am seeing here are around 40-60 Mbit/s,
> with roughly 2.2M objects in cache and a hit ratio of ~0.65; even then
> Varnish handles it easily. However, it is still warming up, since we have
> a lot of objects that need to be cached. The problem I am facing is that
> as soon as RAM is exhausted, Varnish restarts itself.

In the meantime I have been doing some tests tweaking the VM system under Linux, especially vm.min_free_kbytes, to leave some memory for pdflush and kswapd. The results are slightly better, but varnishd still starts to hog the CPUs and restarts.

Alternatively, we disabled swap and ran without it; we also tested with a 16 GB swap file on a different disk. Again slightly better results, but the same effect in the end.

We also ran Varnish without the file storage type, with just 8 GB assigned via malloc. This ran longer than the other tests we did: varnishd did not crash, but CPU usage spiked to around 700%, and it recovered from that after a minute or two.
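[Editor's note: for readers wanting to reproduce the VM tweaks described above, this is a minimal sketch; the vm.min_free_kbytes value is the one from the sysctl.conf quoted later in this thread, and whether these settings help is exactly what is under discussion here.]

```shell
# Reserve ~4 GB for the kernel (pdflush/kswapd) so varnishd's dirty pages
# don't starve the VM; apply at runtime, persist via /etc/sysctl.conf.
sysctl -w vm.min_free_kbytes=4194304

# One of the variants tested above: drop swap entirely (run as root).
swapoff -a
```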
The Linux system runs the following kernel with this sysctl config applied:

Linux varnish001 2.6.18-6-amd64 #1 SMP Tue May 5 08:01:28 UTC 2009 x86_64 GNU/Linux

/etc/sysctl.conf:

net.ipv4.ip_local_port_range = 1024 65536
net.core.rmem_max = 16777216
net.core.wmem_max = 16777216
net.ipv4.tcp_rmem = 4096 87380 16777216
net.ipv4.tcp_wmem = 4096 65536 16777216
net.ipv4.tcp_fin_timeout = 3
net.ipv4.tcp_tw_recycle = 1
net.core.netdev_max_backlog = 3
net.ipv4.tcp_no_metrics_save = 1
net.core.somaxconn = 262144
net.ipv4.tcp_syncookies = 0
net.ipv4.tcp_max_orphans = 262144
net.ipv4.tcp_max_syn_backlog = 262144
net.ipv4.tcp_synack_retries = 2
net.ipv4.tcp_syn_retries = 2
vm.swappiness = 0
vm.min_free_kbytes = 4194304
vm.dirty_background_ratio = 25
vm.dirty_expire_centisecs = 1000
vm.dirty_writeback_centisecs = 100

So, yesterday I installed FreeBSD 7.2-STABLE with the latest CVSup on the second Varnish box and ran Varnish with the exact same config as on the Linux box: the same setup, a 16 GB swap file, the same arguments for varnishd, the same VCL, the same amount of traffic and connections, etc. I did apply the performance tuning described on the wiki.

Both systems ran OK until the moment there was little RAM left. Linux showed the exact same behaviour as before: high load, varnishd with high CPU usage, and in the end varnishd restarted with an empty cache. FreeBSD kept going as I expected it to, with a higher load, but still serving images at 60 Mbit/s. I did see that it sometimes needed to recover, meaning it accepted no connections for a few moments and then carried on, but often enough to notice.

Both systems run varnishd as follows. I changed the number of hash buckets to 450011, as opposed to the previous tests I ran; the same goes for the lru_interval, which was 60 and maybe too low.
/usr/sbin/varnishd -P /var/run/varnishd.pid -a :80 \
    -f /etc/varnish/default.vcl -T 127.0.0.1:6082 \
    -t 3600 -w 400,4000,60 -s file,500G \
    -p obj_workspace 8192 -p sess_workspace 262144 \
    -p lru_interval 600 -h classic,450011 \
    -p sess_timeout 2 -p listen_depth 8192 \
    -p log_hashstring off -p shm_workspace 32768 \
    -p ping_interval 10 -p srcaddr_ttl 0 -p esi_syntax 1

Below is some output from Linux when it started to hog the CPUs, and output from the FreeBSD system 15 minutes later when it was still going.

So, is this kind of setup actually possible? And if so, how do I get it running smoothly? So far FreeBSD comes pretty close, but is not quite there yet.

Thanks for the help,

Marco

Linux:

Hitrate ratio:       10      100     1000
Hitrate avg:     0.6914   0.6843   0.6656

        5739         0.00         0.60 Client connections accepted
     5197130         0.00       542.72 Client requests received
     2929389         0.00       305.91 Cache hits
           0         0.00         0.00 Cache hits for pass
     2267089         0.00       236.75 Cache misses
     2267116         0.00       236.75 Backend connections success
           0         0.00         0.00 Backend connections failures
     2246281         0.00       234.57 Backend connections reuses
     2246300         1.00       234.58 Backend connections recycles
         120          .            .   N struct sess_mem
          57          .            .   N struct sess
     2224133          .            .   N struct object
         930          .            .   N struct objecthead
     4448208          .            .   N struct smf
           0          .            .   N small free smf
           0          .            .   N large free smf
          33          .            .   N struct vbe_conn
         114          .            .   N struct bereq
         400          .            .   N worker threads
         400
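[Editor's note: as a quick sanity check on the -h classic,450011 choice above, the average hash-chain length works out to only a handful of objects per bucket. A sketch, using the N struct object figure from the stats:]

```shell
# Average objects per classic-hash bucket, with the numbers reported above:
# ~2.22M cached objects spread over 450011 buckets.
objects=2224133
buckets=450011
echo $(( objects / buckets ))   # prints 4 (average chain length just under 5)
```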
Re: Varnish restarts when all memory is allocated
On Wed, May 27, 2009 at 10:31:30AM +0200, Kristian Lyngstol wrote:

> Can you post the arguments you use to start varnish?

Sure; Varnish currently runs as follows:

 5117 ?  Ss  0:00 /usr/sbin/varnishd -P /var/run/varnishd.pid -a :80 \
    -f /etc/varnish/default.vcl -T 127.0.0.1:6082 -t 3600 -w 400,4000,60 \
    -s file,/data/varnish/mp-varnish001/varnish_storage.bin,500G \
    -p obj_workspace 4096 -p sess_workspace 262144 -p lru_interval 60 \
    -h classic,4550111 -p listen_depth 8192 -p log_hashstring off \
    -p sess_timeout 10 -p shm_workspace 32768 -p ping_interval 1 \
    -p thread_pools 4 -p thread_pool_min 100 -p thread_pool_max 4000 \
    -p srcaddr_ttl 0 -p esi_syntax 1
 5118 ?  Sl  0:01 /usr/sbin/varnishd -P /var/run/varnishd.pid -a :80 \
    -f /etc/varnish/default.vcl -T 127.0.0.1:6082 -t 3600 -w 400,4000,60 \
    -s file,/data/varnish/mp-varnish001/varnish_storage.bin,500G \
    -p obj_workspace 4096 -p sess_workspace 262144 -p lru_interval 60 \
    -h classic,4550111 -p listen_depth 8192 -p log_hashstring off \
    -p sess_timeout 10 -p shm_workspace 32768 -p ping_interval 1 \
    -p thread_pools 4 -p thread_pool_min 100 -p thread_pool_max 4000 \
    -p srcaddr_ttl 0 -p esi_syntax 1

I have to note that I have also been running with an lru_interval of 3600, which had the same effect, i.e. restarting when it hits the memory limit.

Marco
Varnish restarts when all memory is allocated
Hi,

We are testing Varnish in our production environment with a 500 GB storage file and 32 GB of RAM. Performance is excellent as long as not all of the 32 GB is allocated yet. The rates I am seeing here are around 40-60 Mbit/s, with roughly 2.2M objects in cache and a hit ratio of ~0.65; even then Varnish handles it easily. However, it is still warming up, since we have a lot of objects that need to be cached. The problem I am facing is that as soon as RAM is exhausted, Varnish restarts itself.

Since this looked like an IO problem, we dropped ext2 in favour of xfs, with much better results on writing to disk. However, varnishd still stops working after it reaches the 32 GB RAM limit. Note that I don't see any IO until just before it hits 97% of RAM usage.

So we thought to combine the file storage type with malloc and limit the amount of memory Varnish is allowed to allocate, first to 5 GB, to see how that would work out. It turned out that the limit was not enforced, and from reading some posts it seems it is not needed. I have seen some posts on running large caches of the same kind, but no real approach to a solution. What is the best way to get around this issue?

Below are the init script and the output of both varnishstat and top.

Hitrate ratio:        3        3        3
Hitrate avg:     0.6008   0.6008   0.6008

       10871         1.00         1.17 Client connections accepted
     5278218       273.99       566.76 Client requests received
     2864011       172.99       307.53 Cache hits
     2413896       101.00       259.20 Cache misses
     2413920       101.00       259.20 Backend connections success
     2391749        99.00       256.82 Backend connections reuses
     2391795        99.00       256.82 Backend connections recycles
         148          .            .   N struct sess_mem
          29          .            .   N struct sess
     2366595          .            .   N struct object
     2364206          .            .   N struct objecthead
     4733079          .            .   N struct smf
           0          .            .   N small free smf
           1          .            .   N large free smf
          10          .            .   N struct vbe_conn
          96          .            .   N struct bereq
         400          .            .   N worker threads
         400         0.00         0.04 N worker threads created
           2          .            .   N backends
       47353          .            .   N expired objects
     2090535          .            .   N LRU moved objects
     5086915       265.99       546.22 Objects sent with write
       10867         1.00         1.17 Total Sessions
     5278227       273.99       566.76 Total Requests
          12         0.00         0.00 Total pipe
          13         0.00         0.00 Total pass
     2413900       101.00       259.20 Total fetch
  1865669893     97172.71    200329.64 Total header bytes
 22763257823   1297006.09   2444245.44 Total body bytes
        3335         0.00         0.36 Session Closed
     5275957       273.99       566.52 Session herd
   292178030     14367.51     31373.14 SHM records
     7758036       382.99       833.03 SHM writes
        6264         2.00         0.67 SHM flushes due to overflow
         239         0.00         0.03 SHM MTX contention
         125         0.00         0.01 SHM cycles through buffer
     4828098       201.99       518.43 allocator requests
     4733078          .            .   outstanding allocations
 30790995968          .            .   bytes allocated
506079916032          .            .   bytes free
         303         0.00         0.03 SMS allocator requests
      130986          .            .   SMS bytes allocated
      130986          .            .   SMS bytes freed
     2413909       101.00       259.20 Backend requests made
           1         0.00         0.00 N vcl total
           1         0.00         0.00 N vcl available
           1          .            .   N total active purges
           1         0.00         0.00 N new purges added

top - 15:13:40 up 7 days, 33 min,  2 users,  load average: 0.14, 0.71, 0.75
Tasks: 116 total,   1 running, 115 sleeping,   0 stopped,   0 zombie
Cpu0  :  3.0%us,  1.0%sy,  0.0%ni, 93.0%id,  1.7%wa,  0.0%hi,  1.3%si,  0.0%st
Cpu1  :  0.0%us,  0.0%sy,  0.0%ni,100.0%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
Cpu2  :  0.0%us,  0.0%sy,  0.0%ni,100.0%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
Cpu3  :  0.0%us,  0.0%sy,  0.0%ni,100.0%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
Cpu4  :  0.0%us,  0.0%sy,  0.0%ni,100.0%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
Cpu5  :  0.0%us,  0.0%sy,  0.0%ni,100.0%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
Cpu6  :  0.0%us,  0.0%sy,  0.0%ni,100.0%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
Cpu7  :  0.0%us,  0.0%sy,  0.0%ni,100.0%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
Mem:  32942712k total, 32777060k used,   165652k free,     2164k buffers
Swap:   506008k total,    25664k
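[Editor's note: the malloc-limited variant discussed in this thread would look roughly like the sketch below. This is only the malloc-only form (capping the cache in RAM instead of backing it with the 500G file); the 5G figure is the one from the text, the remaining arguments mirror the command lines above, and Varnish 2.x does not accept combining them into a RAM cap for file storage.]

```shell
# Hypothetical malloc-only run: cache capped at 5G of RAM, no storage file.
/usr/sbin/varnishd -P /var/run/varnishd.pid -a :80 \
    -f /etc/varnish/default.vcl -T 127.0.0.1:6082 \
    -t 3600 -w 400,4000,60 \
    -s malloc,5G
```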