On Tue, May 26, 2009 at 11:29:08PM +0200, Marco Walraven wrote: > Hi, > > We are testing a Varnish Cache in our production environment with a 500Gb > storage file and > 32Gb of RAM. Varnish performance is excellent when all of the 32Gb is not > allocated yet. > The rates I am seeing here are around 40-60Mbit/s, with roughly 2.2M objects > in cache and > hitting a ratio of ~0.65, even then Varnish can handle it easily. However it > is still > warming up since we have a lot of objects that need to be cached. > > The problem I am facing is that as soon as RAM is exhausted Varnish restarts > itself.
In the meantime I have been doing some tests tweaking the VM system under Linux, especially vm.min_free_kbytes, leaving some memory for pdflush and kswapd. The results are slightly better, but still varnishd starts to hog the CPU's and restarts. Alternatively we disabled swap, running without it. But also tested with a swap file of 16Gb on a different disk. Again slightly better results but still the same effect in the end. We also ran Varnish without the file storage type having just 8Gb assigned with malloc, this ran longer than the other tests we did. Varnishd did not crash but got extremely high CPU usages 700% and recoverd from that after a minute or 2. The linux system run with the following sysctl config applied: Linux varnish001 2.6.18-6-amd64 #1 SMP Tue May 5 08:01:28 UTC 2009 x86_64 GNU/Linux /etc/systctl.conf net.ipv4.ip_local_port_range = 1024 65536 net.core.rmem_max=16777216 net.core.wmem_max=16777216 net.ipv4.tcp_rmem=4096 87380 16777216 net.ipv4.tcp_wmem=4096 65536 16777216 net.ipv4.tcp_fin_timeout = 3 net.ipv4.tcp_tw_recycle = 1 net.core.netdev_max_backlog = 30000 net.ipv4.tcp_no_metrics_save=1 net.core.somaxconn = 262144 net.ipv4.tcp_syncookies = 0 net.ipv4.tcp_max_orphans = 262144 net.ipv4.tcp_max_syn_backlog = 262144 net.ipv4.tcp_synack_retries = 2 net.ipv4.tcp_syn_retries = 2 vm.swappiness = 0 vm.min_free_kbytes = 4194304 vm.dirty_background_ratio = 25 vm.dirty_expire_centisecs = 1000 vm.dirty_writeback_centisecs = 100 So, yesterday I installed FreeBSD 7.2 STABLE with the lastest CVSup on the second Varnish box and ran Varnish with the exact same config as on the Linux box. Exact same setup, 16 Gb swap file, same arguments for varnishd, same vcl, same amount of traffic, connections etc etc. I did apply perfmance tuning as described on the wiki. Both systems ran Ok until the moment there was little RAM left. Linux showed the exact same behaviour as before, high CPU load, varnishd with high amounts of CPU usages and in the end it varnishd restarted with an ampty cache. FreeBSD kept on going as I expected it to work; however with a higher load but still serving images at 60Mbit/s. I did see that it sometimes needed to recover. Meaning accepting no connections for a few moments and then starting to go on again, but enough to notice. Both systems run varnishd as followed, I changed the amount of buckets to 450011 as opposed to the previous tests I ran. Same for the lru_interval which was 60 and maybe too low. /usr/sbin/varnishd -P /var/run/varnishd.pid -a :80 -f /etc/varnish/default.vcl -T 127.0.0.1:6082 -t 3600 -w 400,4000,60 -s file,500G -p obj_workspace 8192 -p sess_workspace 262144 -p lru_interval 600 -h classic,450011 -p sess_timeout 2 -p listen_depth 8192 -p log_hashstring off -p shm_workspace 32768 -p ping_interval 10 -p srcaddr_ttl 0 -p esi_syntax 1 Below some output of Linux when it started to hog the CPU and output of the FreeBSD system 15 minutes later when it was still going. So is this kind of setup actually possible ? And if so how to get it running smoothly ? So far FreeBSD comes pretty close but not yet there. Thanks for the help, Marco Linux: Hitrate ratio: 10 100 1000 Hitrate avg: 0.6914 0.6843 0.6656 5739 0.00 0.60 Client connections accepted 5197130 0.00 542.72 Client requests received 2929389 0.00 305.91 Cache hits 0 0.00 0.00 Cache hits for pass 2267089 0.00 236.75 Cache misses 2267116 0.00 236.75 Backend connections success 0 0.00 0.00 Backend connections failures 2246281 0.00 234.57 Backend connections reuses 2246300 1.00 234.58 Backend connections recycles 120 . . N struct sess_mem 57 . . N struct sess 2224133 . . N struct object 2222930 . . N struct objecthead 4448208 . . N struct smf 0 . . N small free smf 0 . . N large free smf 33 . . N struct vbe_conn 114 . . N struct bereq 400 . . N worker threads 400 0.00 0.04 N worker threads created 1601 0.00 0.17 N worker threads limited 0 0.00 0.00 N queued work requests 0 0.00 0.00 N overflowed work requests 2 . . N backends 42991 . . N expired objects 1347513 . . N LRU moved objects 5081074 1.00 530.61 Objects sent with write 5725 0.00 0.60 Total Sessions 5197083 1.00 542.72 Total Requests 6 0.00 0.00 Total pipe 21 0.00 0.00 Total pass 2267086 1.00 236.75 Total fetch 1859166711 366.91 194148.57 Total header bytes 21933375988 802.81 2290452.80 Total body bytes 2363 1.00 0.25 Session Closed 5194871 0.00 542.49 Session herd 283299668 45.99 29584.34 SHM records 7496210 3.00 782.81 SHM writes 5065 0.00 0.53 SHM flushes due to overflow 142 0.00 0.01 SHM MTX contention 122 0.00 0.01 SHM cycles through buffer 4534821 0.00 473.56 allocator requests 4448207 . . outstanding allocations 37913182208 . . bytes allocated 498957729792 . . bytes free 605 0.00 0.06 SMS allocator requests 261420 . . SMS bytes allocated 261420 . . SMS bytes freed 2267110 0.00 236.75 Backend requests made 1 0.00 0.00 N vcl total 1 0.00 0.00 N vcl available 1 . . N total active purges 1 0.00 0.00 N new purges added top - 18:56:21 up 1 day, 1:41, 3 users, load average: 10.72, 2.88, 1.34 Tasks: 121 total, 5 running, 116 sleeping, 0 stopped, 0 zombie Cpu0 : 0.7%us, 87.0%sy, 0.0%ni, 0.0%id, 0.0%wa, 0.0%hi, 12.3%si, 0.0%st Cpu1 : 0.3%us, 99.3%sy, 0.0%ni, 0.0%id, 0.0%wa, 0.0%hi, 0.3%si, 0.0%st Cpu2 : 0.0%us,100.0%sy, 0.0%ni, 0.0%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st Cpu3 : 0.0%us,100.0%sy, 0.0%ni, 0.0%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st Cpu4 : 0.3%us, 99.7%sy, 0.0%ni, 0.0%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st Cpu5 : 0.0%us,100.0%sy, 0.0%ni, 0.0%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st Cpu6 : 0.0%us,100.0%sy, 0.0%ni, 0.0%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st Cpu7 : 0.0%us,100.0%sy, 0.0%ni, 0.0%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st Mem: 32942712k total, 30712396k used, 2230316k free, 308k buffers Swap: 16777208k total, 10692k used, 16766516k free, 28230576k cached PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND 11740 nobody 18 0 505g 28g 26g S 753 91.8 17:36.41 varnishd 249 root 20 -5 0 0 0 R 46 0.0 13:32.07 kswapd0 12564 root 15 0 6628 1208 864 R 0 0.0 0:04.45 top 1 root 15 0 6120 80 48 S 0 0.0 0:14.35 init 2 root RT 0 0 0 0 S 0 0.0 0:00.01 migration/0 3 root 34 19 0 0 0 S 0 0.0 0:00.00 ksoftirqd/0 FreeBSD: last pid: 12964; load averages: 2.11, 1.63, 0.94 up 0+04:15:40 17:09:45 487 processes: 9 running, 462 sleeping, 16 waiting CPU 0: 2.0% user, 0.0% nice, 25.5% system, 0.7% interrupt, 71.8% idle CPU 1: 0.7% user, 0.0% nice, 27.5% system, 0.0% interrupt, 71.8% idle CPU 2: 1.3% user, 0.0% nice, 65.1% system, 0.0% interrupt, 33.6% idle CPU 3: 1.3% user, 0.0% nice, 36.9% system, 0.0% interrupt, 61.7% idle CPU 4: 0.7% user, 0.0% nice, 12.8% system, 0.7% interrupt, 85.8% idle CPU 5: 0.0% user, 0.0% nice, 0.7% system, 0.0% interrupt, 99.3% idle CPU 6: 0.0% user, 0.0% nice, 2.7% system, 0.0% interrupt, 97.3% idle CPU 7: 0.7% user, 0.0% nice, 0.7% system, 0.0% interrupt, 98.6% idle Mem: 27G Active, 2865M Inact, 727M Wired, 884M Cache, 399M Buf, 120K Free Swap: 16G Total, 1476K Used, 16G Free PID UID THR PRI NICE SIZE RES STATE C TIME WCPU COMMAND 12 0 1 171 ki31 0K 16K CPU6 6 252:38 89.99% idle: cpu6 11 0 1 171 ki31 0K 16K CPU7 7 252:34 88.57% idle: cpu7 14 0 1 171 ki31 0K 16K CPU4 4 252:21 88.18% idle: cpu4 13 0 1 171 ki31 0K 16K CPU5 5 252:30 83.06% idle: cpu5 45 0 1 -16 - 0K 16K psleep 1 5:18 81.88% pagedaemon 15 0 1 171 ki31 0K 16K CPU3 3 250:19 75.88% idle: cpu3 18 0 1 171 ki31 0K 16K CPU0 0 242:32 73.00% idle: cpu0 16 0 1 171 ki31 0K 16K RUN 2 245:26 70.90% idle: cpu2 17 0 1 171 ki31 0K 16K CPU1 1 243:08 60.89% idle: cpu1 12033 80 406 45 0 502G 24339M ucond 4 0:08 13.96% varnishd Hitrate ratio: 10 100 1000 Hitrate avg: 0.6868 0.6891 0.6749 15694 2.00 1.51 Client connections accepted 5611010 651.39 540.72 Client requests received 3211214 451.89 309.45 Cache hits 1 0.00 0.00 Cache hits for pass 2399137 198.51 231.20 Cache misses 2399143 198.51 231.20 Backend connections success 2377367 198.51 229.10 Backend connections reuses 2377408 197.51 229.10 Backend connections recycles 0 . . N struct srcaddr 0 . . N active struct srcaddr 316 . . N struct sess_mem 48 . . N struct sess 2349612 . . N struct object 2348460 . . N struct objecthead 4699173 . . N struct smf 0 . . N small free smf 0 . . N large free smf 40 . . N struct vbe_conn 115 . . N struct bereq 400 . . N worker threads 400 0.00 0.04 N worker threads created 0 0.00 0.00 N worker threads limited 0 0.00 0.00 N queued work requests 0 0.00 0.00 N overflowed work requests 2 . . N backends 49564 . . N expired objects 1516192 . . N LRU moved objects 5486698 643.41 528.74 Objects sent with write 15693 1.00 1.51 Total Sessions 5611037 648.40 540.72 Total Requests 8 0.00 0.00 Total pipe 20 0.00 0.00 Total pass 2399127 198.51 231.20 Total fetch 2007812574 234091.74 193486.80 Total header bytes 23594716103 2781420.12 2273751.19 Total body bytes 2576 0.00 0.25 Session Closed 5608555 647.40 540.48 Session herd 304052678 32276.41 29300.63 SHM records 8074243 868.86 778.09 SHM writes 6122 2.99 0.59 SHM flushes due to overflow 11697 5.99 1.13 SHM MTX contention 131 0.00 0.01 SHM cycles through buffer 4798894 396.02 462.45 allocator requests 4699172 . . outstanding allocations 20 0.00 0.00 Total pass 2399127 198.51 231.20 Total fetch 2007812574 234091.74 193486.80 Total header bytes 23594716103 2781420.12 2273751.19 Total body bytes 2576 0.00 0.25 Session Closed 5608555 647.40 540.48 Session herd 304052678 32276.41 29300.63 SHM records 8074243 868.86 778.09 SHM writes 6122 2.99 0.59 SHM flushes due to overflow 11697 5.99 1.13 SHM MTX contention 131 0.00 0.01 SHM cycles through buffer 4798894 396.02 462.45 allocator requests 4699172 . . outstanding allocations 40178368512 . . bytes allocated 496692543488 . . bytes free 575 0.00 0.06 SMS allocator requests 0 . . SMS outstanding allocations 249035 . . SMS bytes allocated 249035 . . SMS bytes freed 2399128 198.51 231.20 Backend requests made 1 0.00 0.00 N vcl total 1 0.00 0.00 N vcl available 1 . . N total active purges 1 0.00 0.00 N new purges added -- Terantula - Industrial Strength Open Source phone:+31 64 3232 400 / www: http://www.terantula.com / pgpkey: E7EE7A46 pgp fingerprint: F2EE 122D 964C DE68 7380 6F95 3710 7719 E7EE 7A46 apt-get install freebsd _______________________________________________ varnish-misc mailing list varnish-misc@projects.linpro.no http://projects.linpro.no/mailman/listinfo/varnish-misc