Re: strange restart, taking 15 minutes
On Saturday, 7 March 2009 18:52:04, Sascha Ottolski wrote:
> I've just seen a strange restart that happened this morning, after only about 25 h of runtime.

Please ignore that posting; it turned out that the server has a broken hdd. Thanks for listening anyway,

Sascha

___
varnish-misc mailing list
varnish-misc@projects.linpro.no
http://projects.linpro.no/mailman/listinfo/varnish-misc
Re: Panic message: Assert error in exp_timer(), cache_expire.c line 303
On Wednesday, 28 January 2009 10:52:47, Poul-Henning Kamp wrote:
> In message 200901260917.45223.ottol...@web.de, Sascha Ottolski writes:
> > Assert error in exp_timer(), cache_expire.c line 303:
> >   Condition(oe2->timer_when = oe->timer_when) not true.
> >   thread = (cache-timeout)
> > [...]
> > It all happens with trunk, r3497.
>
> Fixed in r3547

Thanks very much. I upgraded to r3563, and now had another crash (after approx. 54 hours of runtime):

  Feb  1 19:54:21 localhost varnishd[3146]: Child (3147) not responding to ping, killing it.
  Feb  1 19:54:22 localhost varnishd[3146]: Child (3147) died signal=6
  Feb  1 19:54:22 localhost varnishd[3146]: Child (3147) Panic message:
    Assert error in EXP_Rearm(), cache_expire.c line 255:
      Condition(oe->timer_idx != BINHEAP_NOIDX) not true.
    thread = (cache-worker)
    sp = 0x2b1b405ac008 {
      fd = 3797, id = 3797, xid = 2021543483,
      client = 217.234.68.98:1869,
      step = STP_HIT,
      handling = error,
      err_code = 200, err_reason = (null),
      ws = 0x2b1b405ac078 {
        id = sess,
        {s,f,r,e} = {0x2b1b405ac808,,+938,(nil),+16384},
      },
      worker = 0x59831c80 {
      },
      vcl = {
        srcname = { input, Default, },
      },
      obj = 0x2ab1ab8b8000 {
        refcnt = 2, xid = 1987221953,
        ws = 0x2ab1ab8b8028 { id = obj, {s,f,r,e} = {0x2ab1ab8b8358,,+241,(nil),+3240}, },
        http = {
          ws = 0x2ab1ab8b8028 { id = obj, {s,f,r,e} = {0x2ab1ab8b8358,,+241,(nil),+3240}, },
          hd = {
            Date: Fri, 30 Jan 2009 14:22:59 GMT,
            Server: Apache,
            X-SF-Stats: stf-img7,
            Last-Modified: Thu,
  Feb  1 19:54:22 localhost varnishd[3146]: Child cleanup complete

Please let me know if you need additional information.

Cheers, Sascha
Re: flushing cache doesn't seem to make varnish know about less
On Monday, 12 January 2009 05:35:16, Timothy Ball wrote:
> A programming error caused varnish to think there were billions of pages it had to know about. The bug is quashed, but varnish doesn't seem to know.
>
> # this is a line from top
> 3596 0.2 28.3g 28g 3808 256 57m 1916 120 S 20 00 varnishd
>
> # this is a line from ps auxwww
> nobody 3596 0.1 0.1 29681936 3968 ? Sl 04:24 0:00 /opt/varnish/sbin/varnishd -a 10.13.37.1:80 -h classic -f /etc/varnish/default.vcl -T 127.0.0.1:6082 -t 120 -w 1,500,20
>
> I tried doing url.purge .* and url.purge . to no avail. How do I make this thing forget?
>
> --timball

Why not just restart? It's so quick that it shouldn't hurt your service very much.

Cheers, Sascha
uptime not available to nagios plugin?
Hi,

bug or feature? The nagios plugin seems not to be able to extract the uptime:

  img-proxy1:~# /usr/local/libexec/check_varnish -p uptime
  Unknown parameter 'uptime'
  VARNISH UNKNOWN: (null) (3)|uptime=3

but it probably should, shouldn't it?

  img-proxy1:~# varnishstat -l 21|grep uptime
  uptime Child uptime

Happens with both trunk and 2.0.2.

Thanks, and a good party tonight to everyone,

Sascha
next release?
Hi folks,

I'm curious when the next stable (minor) release is planned. I'm especially interested in:

  Date: 2008-11-14 01:19:33 +0100 (Fri, 14 Nov 2008)
  New Revision: 3390
  Modified: trunk/varnish-cache/lib/libvarnish/binary_heap.c
  Log: Rework the binary heap, we use for expiry processing, to deal more gracefully with large number of objects.

which seems to be in trunk, but didn't make it into 2.0.2. If no release is planned, which trunk revision could be recommended for a production environment?

Thanks, Sascha
varnish 2.0.1 truncates pre-created cache file?
Hi,

bug or feature? With the new version, my cache file gets shrunk. I made it with dd with a size of ~259 GB, but after starting varnishd, only about 119 GB are left. I don't give a size parameter when starting, as in

  -s file,/var/cache/varnish/store.bin

which worked as expected with older releases.

Thanks, Sascha
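For anyone reproducing this setup: a pre-created cache file is just a large (typically sparse) file. Below is a minimal sketch of the same idea in Python, not a varnish-specific tool; the path and the small demo size are made up. Note that `-s file` also accepts an explicit size (e.g. `-s file,/var/cache/varnish/store.bin,259G`), which avoids relying on the file's on-disk size at all.

```python
import os
import tempfile

def preallocate(path, size_bytes):
    """Create a sparse file of the requested size, similar in effect to
    `dd if=/dev/zero of=path bs=1 count=0 seek=SIZE`."""
    with open(path, "wb") as f:
        f.truncate(size_bytes)   # extends the file without writing data
    return os.path.getsize(path)

# Example: a 1 MiB stand-in for the ~259 GB cache file
path = os.path.join(tempfile.gettempdir(), "store-demo.bin")
size = preallocate(path, 1 << 20)
os.remove(path)
```

Passing the size explicitly on the varnishd command line would make the truncation question moot, since the storage size no longer depends on how the file was created.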
Re: varnish 2.0.1 truncates pre-created cache file?
On Monday, 20 October 2008 16:59:37, Sascha Ottolski wrote:
> Hi, bug or feature? With the new version, my cache file gets shrunk. I made it with dd with a size of ~259 GB, but after starting varnishd, only about 119 GB are left. I don't give a size parameter when starting, as in -s file,/var/cache/varnish/store.bin, which worked as expected with older releases.

Weird, after another reboot/restart, the cache file now has 189 GB?!

Cheers, Sascha
on linux: only one cpu in use?
On a 4-core machine running varnish exclusively, all the I/O goes to one core only:

  # cat /proc/interrupts
             CPU0       CPU1   CPU2   CPU3
    0:  603869381          0      0      0   IO-APIC-edge   timer
    6:          3          0      0      0   IO-APIC-edge   floppy
    8:          0          0      0      0   IO-APIC-edge   rtc
    9:          0          0      0      0   IO-APIC-level  acpi
   14:         63          0      0      0   IO-APIC-edge   ide0
   50:          0          0      0      0   IO-APIC-level  libata
   58:  451845938          0      0      0   IO-APIC-level  aacraid
   66: 1457978435          0      0      0   PCI-MSI-X      eth0
   74: 1082456599          0      0      0   PCI-MSI-X      eth0
   82:  242944858          0      0      0   PCI-MSI-X      eth0
  217:          0          0      0      0   IO-APIC-level  ohci_hcd:usb1, libata
  225:        131          0      0      0   IO-APIC-level  ehci_hcd:usb2
  233:          0          0      0      0   IO-APIC-level  libata
  NMI:      75579      30102   9582   7646
  LOC:  603897625  603897604  603897576  603897556
  ERR: 0
  MIS: 0

It's Debian etch with the stock 2.6.18 kernel. Performance is OK, besides the fact that the load climbs when the cache is about 40-50 % full (with a 517 GB cache file); below 40 % the load stays below 3, but above that level it rises to over 7 and even 10 at peak times. The response times are still good, though.

We just installed irqbalance to see if this makes a difference. After some minutes (without restarting varnish), the picture according to top is that 3 out of 4 cores do most of the I/O:

  Tasks:  89 total,   1 running,  88 sleeping,   0 stopped,   0 zombie
  Cpu0 :  0.8%us,  0.8%sy,  0.0%ni, 36.7%id, 60.8%wa,  0.8%hi,  0.0%si,  0.0%st
  Cpu1 :  0.8%us,  1.7%sy,  0.0%ni,  0.0%id, 96.6%wa,  0.0%hi,  0.8%si,  0.0%st
  Cpu2 :  0.0%us,  0.0%sy,  0.0%ni, 97.5%id,  0.8%wa,  0.0%hi,  1.7%si,  0.0%st
  Cpu3 :  0.0%us,  0.8%sy,  0.0%ni,  4.2%id, 95.0%wa,  0.0%hi,  0.0%si,  0.0%st

I'm posting mostly out of curiosity whether this is expected behavior.

Thanks, Sascha
Re: want to allow or deny an URL based on a timestamp
On Wednesday, 13 August 2008 00:02:07, Darryl Dixon - Winterhouse Consulting wrote:
> 4) Object expires and Varnish goes to fetch it from the backend, which of course returns 404 or whatever as the URL has expired.

Darryl, thanks for your reply. This would of course be a straightforward approach, but in my case it's vital that the objects live almost forever in the cache.

Cheers, Sascha
Re: Strategy for large cache sets
On Tuesday, 1 July 2008 20:16:15, Skye Poier Nott wrote:
> I want to deploy Varnish with very large cache sizes (200GB or more) for large, long-lived file sets. Is it more efficient to use large swap or large mmap in this scenario? According to the FreeBSD lists, even 20GB of swap requires 200MB of kern.maxswzone just to keep track of it, so it doesn't seem like that will scale too well. Is one or the other method better for many small files vs. fewer big files?
>
> Thanks again... when I'm a Varnish expert I'll help the newbs :)
>
> Skye

I'm administrating three varnish instances for static images; the cache file is set up as 517 GB (pre-allocated with dd); currently, after 35 days of uptime, it's filled more than half:

  305543155712 bytes allocated
  248759787520 bytes free

It's running on Debian etch, 2x dual-core AMD Opteron, 32 GB RAM, 30 GB swap:

  Mem:  32969244k total, 32870944k used,    98300k free,   120672k buffers
  Swap: 29045480k total,  6096612k used, 22948868k free, 25752752k cached

The varnish process looks like this in top:

  PID  USER     PR NI VIRT RES SHR S %CPU %MEM  TIME+     P COMMAND
  3221 username 15  0  522g 27g 21g S    1 87.0 539:33.89 2 varnishd

The hit rate is 98 % for the recent short term; the overall average is 95 %. The load of the machines is between 1 and 2 at medium-traffic times, and only seldom goes up to 3.

But the request rate may be relatively low compared to what others have reported on the list. I don't have hard numbers, unfortunately; the average is 80.49 according to varnishstat. At peaks the rate may be in the order of maybe 400 req/sec for a single instance.

We had issues with 1.1.2 crashing, but since running on trunk (r2640), everything runs smoothly. Response time according to a nagios http response check is between 0.5 and 1 seconds, almost never over 1 second, even at peak times.

Hope it's useful for someone; let me know if you need more details.

Cheers, Sascha
Re: Performance options for trunk
On Friday, 30 May 2008 14:01:35, Audun Ytterdal wrote:
> I run trunk in front of a site. I have 3 varnish servers, all with 32 GB of RAM, serving only small pictures, thumbnails and profile pictures. The cache set is pretty large (1.5 TB) and changes much over time. And before you all ask why I don't just serve partitioned data from several apache/nginx/lighttpd servers: it's because we're not there yet. The varnishes all fetch their content from one lighttpd server.
>
> I run into the thread pileup problem. I've set the thread limit to 1500, and it usually lies between 80 and 700. While restarting, it hits the 1500 limit and stays there for a few minutes. Then it gradually manages to control traffic and ends up around 80 threads. It usually grows a bit, but not over 700-1000 ish. But suddenly, under high traffic, it goes up to the limit, be it 1500 or 4000 or whatever I set it to. Then it stays there and usually never recovers without a restart. I guess it's because the backend at some point answers slowly. But is there an easier way to get out of this situation?
>
> Running varnish like this (redhat 4.6, 32 GB RAM):
>
> /usr/sbin/varnishd -a :80 -f /etc/varnish/nettby.vcl -T 127.0.0.1:82 -t 120 -w 2,2000,30 -u varnish -g varnish -p client_http11 on -p thread_pools 4 -p thread_pool_max 4000 -p listen_depth 4096 -p lru_interval 3600 -h classic,59 -s file,/var/varnish/varnish_storage.bin,30G -P /var/run/varnish.pid
>
> and for testing purposes (redhat 5.1, 32 GB RAM):
>
> varnish 22959 17.1 45.2 20187068 14938024 ? Sl May29 160:40 /usr/sbin/varnishd -a :80 -f /etc/varnish/nettby.vcl -T 127.0.0.1:82 -t 120 -u varnish -g varnish -p thread_pools 4 -p thread_pool_max 2000 -p client_http11 on -p listen_depth 4096 -p lru_interval 3600 -h classic,59 -s malloc,60G -P /var/run/varnish.pid
>
> Each varnish handles about 3000 req/s before it caves in. Any suggestions? Are the parameters sane?
>
> -- Audun

I'm not really sure if I can help, but at least I can tell you that we run a similar setup. However, our images are never changed, only deleted, and when that happens, a PURGE request clears them off the proxies. Therefore, we go a slightly different route: a huge file-based cache (over 500 GB) and a huge default_ttl (one year; only 404 errors have a smaller ttl of 3 hours).

We also have 32 GB installed, set thread_pool_max to 8000, and -h classic,259 according to a hint in the wiki. I have not looked at the thread count for a longer time, and don't really know if 8000 is any good or bad. Since everything works nicely, I just keep it as it is.

Having said all this, the request pattern is not really like yours. The cache file is only ~60 % used after 50 days, and peak traffic for each of our three proxies is only about 300 req/s. One would probably cope well with our traffic; we run three as a simple HA measure.

Cheers, Sascha
Re: please help on compiling nagios plugin
On Thursday, 29 May 2008 17:06:41, Sascha Ottolski wrote:
> Hi, I asked this some weeks ago, but got no answer :-( When running configure, I see this error:
>
>   ./configure: line 19292: syntax error near unexpected token `VARNISHAPI,'
>   ./configure: line 19292: `PKG_CHECK_MODULES(VARNISHAPI, varnishapi)'
>
> The line is:
>
>   # Checks for libraries.
>   PKG_CHECK_MODULES(VARNISHAPI, varnishapi)

I finally managed to compile the nagios plugin that comes with the svn checkout; maybe this helps someone else. Apparently, the debian packages I created and installed before, using the debian/ directory that comes with the checkout, do not install the pkg-config file, and a link is missing:

  apt-get install pkg-config
  cd /root/varnish-svn-trunk/varnish-tools/nagios/
  cp /root/varnish-svn-trunk/varnish-cache/varnishapi.pc /usr/lib/pkgconfig/
  cd /usr/lib/
  ln -s libvarnishapi.so.0 libvarnishapi.so
  ./autogen.sh
  ./configure
  make
  make install

This will create the plugin as /usr/local/libexec/check_varnish.

Cheers, Sascha
Re: make varnish still respond if backend dead
On Wednesday, 23 April 2008 19:34:14, Max Clark wrote:
> Did you find a solution to this?

Not really. I hope to have found a workaround by adding this to my vcl:

  sub vcl_fetch {
      remove obj.http.X-Varnish-Host;
      set obj.http.X-Varnish-Host = myhostname;
      if (obj.status == 404) {
          set obj.ttl = 7200s;
      }
  }

So 404s may still happen, but they are cached for a shorter time than my default.

Cheers, Sascha

On Fri, Apr 4, 2008 at 12:51 AM, Sascha Ottolski [EMAIL PROTECTED] wrote:
> Hi, sorry if this is FAQ: what can I do to make varnish respond to requests if its backend is dead? It should return cache hits, of course, and a proxy error or something for a miss. And how can I prevent varnish from caching 404s for objects it couldn't fetch due to a dead backend? At least I think that is what happened, as varnish reported 404 for URLs that definitely exist; the dead backend seems to be the only logical explanation why varnish could think they don't.
>
> Oh, and is there a way to put the local hostname in a header? I have two proxies, load-balanced by LVS, so using server.ip reports the same IP on both nodes.
>
> Thanks, Sascha
Re: mass purge causes high load?
On Monday, 14 April 2008 14:19:11, Dag-Erling Smørgrav wrote:
> Sascha Ottolski [EMAIL PROTECTED] writes:
> > Dag-Erling Smørgrav [EMAIL PROTECTED] writes:
> > > No, the semantics are completely different. With HTTP PURGE, you do a direct cache lookup, and set the object's TTL to 0 if it exists. With url.purge, you add an entry to a ban list, and every time an object is looked up in the cache, it is checked against all ban list entries that have arrived since the last time. This is the only way to implement regexp purging efficiently.
> >
> > thanks very much for the clarification. I guess the ban list gets smaller every time an object has been purged?
>
> Each ban list entry has a sequence number, and each object has a generation number. When a new object is inserted into the cache, its generation number is set to the sequence number of the newest ban list entry. For every cache hit, the object's generation number is compared to the sequence number of the last ban list entry. If they don't match, the object is checked against every ban list entry that has a sequence number higher than the object's generation number. If the object matches one of these entries, it is discarded, and processing continues as if the object had never been in cache. If it doesn't, its generation number is set to the sequence number of the last entry it was matched against.
>
> The only alternative to this algorithm would be to lock the cache and inspect every item, which would stop all request processing for several seconds or minutes, depending on the size of your cache and how much of it is resident; and even then, it would only work for hash.purge, not url.purge, as only the hash string is actually stored in the cache.
>
> DES

Dag, thanks again. If I get it right, the ban list never shrinks, so I probably have 17,000 ban list entries hanging around. Can I purge this list somehow, other than restarting the proxy? I suppose even if the list is no longer used, just comparing the generation and sequence numbers for each request adds a bit of overhead, doesn't it?

Cheers, Sascha
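The lazy checking DES describes can be modelled in a few lines (a toy sketch in Python; all names are invented, this is not varnish code). It makes the per-request cost visible: an object that is caught up costs one integer comparison, while one that lags only walks the bans newer than its own generation number:

```python
import re

class BanList:
    """Toy model of varnish's ban list for url.purge."""
    def __init__(self):
        self.bans = []   # list of (sequence number, compiled regex)
        self.seq = 0     # sequence number of the newest ban

    def add_ban(self, pattern):
        self.seq += 1
        self.bans.append((self.seq, re.compile(pattern)))

class Cache:
    """Toy cache: url -> generation number (payload omitted)."""
    def __init__(self, banlist):
        self.banlist = banlist
        self.store = {}

    def insert(self, url):
        # New objects start at the newest ban's sequence number
        self.store[url] = self.banlist.seq

    def lookup(self, url):
        if url not in self.store:
            return "miss"
        gen = self.store[url]
        # Check only bans newer than the object's generation
        for seq, rx in self.banlist.bans:
            if seq > gen and rx.search(url):
                del self.store[url]   # discard as if never cached
                return "miss"
        self.store[url] = self.banlist.seq   # object is now caught up
        return "hit"
```

In this model the list indeed never shrinks, which matches the concern above: 17,000 one-URL purges leave 17,000 regexes that every not-yet-caught-up object must be matched against on its next lookup.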
mass purge causes high load?
Hi,

I just needed to get rid of about 27,000 stale URLs that were cached as 404 or 302 due to a configuration error on the backends. So I did a url.purge in a loop, sleeping 0.1 seconds after each URL:

  for i in `cat notfound.txt.sorted` ; do varnishadm -T:81 url.purge $i; sleep 0.1; done

However, after about half of it I needed to stop, because the varnish servers have a high load (about 30, dropping only very slowly), and the response time is bad (more or less 20 seconds per request :-() May the purge be the cause? I stopped the purge 45 min ago, and still have the high load and slow responses. Any way to see what is going on inside? Sometimes the load even goes up; I see a 50 now :-(

The traffic that comes in is normal for this time of day; usually the load stays below 3 and the response time is way under a second. I don't like the idea of performing a restart, as I don't want to lose the cache...

Thanks a lot, Sascha
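One way to avoid piling up one ban per URL: since url.purge takes a regular expression (as discussed elsewhere in this thread), the whole list could be collapsed into a single alternation and issued as one purge. This is an untested sketch in Python; whether a 27,000-way alternation matches fast enough is its own question, so batching into chunks of a few hundred may be the practical middle ground:

```python
import re

def combined_purge_regex(urls):
    """Build one anchored alternation from many literal URLs, so a single
    url.purge adds one ban entry instead of one per URL."""
    return "^(" + "|".join(re.escape(u) for u in urls) + ")$"

# One `varnishadm -T:81 url.purge <regex>` call instead of thousands:
rx = combined_purge_regex(["/img/1.jpg", "/img/2.jpg"])
```

re.escape keeps URLs containing regex metacharacters (dots, plus signs) from matching more than intended.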
recommendation for swap space?
Hi,

now that my varnish processes start to reach the RAM size, I'm wondering what dimension of swap would be wise? I currently have about 30 GB of swap space for 32 GB of RAM, but am wondering if it could even make sense to have no swap at all? My cache file is 517 GB in size.

BTW, the trunk seems to run stable, for 2.5 days now!

Cheers, Sascha
Re: cache empties itself?
On Friday, 4 April 2008 01:32:28, DHF wrote:
> Sascha Ottolski wrote:
> > however, my main problem is currently that the varnish childs keep restarting, and that this empties the cache, which effectively renders the whole setup useless for me :-( if the cache has filled up, it works great; if it restarts empty, obviously it doesn't. is there anything I can do to prevent such restarts?
>
> Varnish doesn't just restart on its own. Check to make sure you aren't sending a kill signal if you are running logrotate through a cronjob. I'm not sure if a HUP will empty the cache or not.
>
> --Dave

I definitely did nothing like this; I've observed restarts out of the blue. I'm now giving the trunk a try, hopefully there's an improvement in that matter. What I did once in a while is a vcl.load and vcl.use. Will this force a restart of the child, thus flushing the cache?

Thanks again, Sascha
Re: cache empties itself?
On Friday, 4 April 2008 04:37:44, Ricardo Newbery wrote:
> sub vcl_fetch {
>     if (obj.ttl < 120s) {
>         set obj.ttl = 120s;
>     }
> }
>
> Or you can invent your own header... let's call it X-Varnish-1day:
>
> sub vcl_fetch {
>     if (obj.http.X-Varnish-1day) {
>         set obj.ttl = 86400s;
>     }
> }

So it seems like I'm on the right track, thanks for clarifying. Now, is the ttl information local to varnish, or will it set headers as well (if I look into the headers of my varnishes' responses, it doesn't appear so)?

What really confuses me: the man pages state slightly different semantics for default_ttl. In man varnishd:

  -t ttl        Specifies a hard minimum time to live for cached documents. This is a shortcut for specifying the default_ttl run-time parameter.

  default_ttl   The default time-to-live assigned to objects if neither the backend nor the configuration assign one. Note that changes to this parameter are not applied retroactively. The default is 120 seconds.

"hard minimum" sounds to me as if it would override any setting the backend has given. However, in man vcl it's explained that default_ttl only affects documents without a backend-given TTL:

  The following snippet demonstrates how to force a minimum TTL for all documents. Note that this is not the same as setting the default_ttl run-time parameter, as that only affects documents for which the backend did not specify a TTL.

  sub vcl_fetch {
      if (obj.ttl < 120s) {
          set obj.ttl = 120s;
      }
  }

The examples have a unit (s) appended, as in the example of the man page; that suggests I could also append things like m, h, d (for minutes, hours, days)?

BTW, in the trunk version, the examples for a backend definition still have the old syntax:

  backend www {
      set backend.host = www.example.com;
      set backend.port = 80;
  }

instead of:

  backend www {
      .host = www.example.com;
      .port = 80;
  }

Thanks a lot, Sascha
make varnish still respond if backend dead
Hi,

sorry if this is FAQ: what can I do to make varnish respond to requests if its backend is dead? It should return cache hits, of course, and a proxy error or something for a miss. And how can I prevent varnish from caching 404s for objects it couldn't fetch due to a dead backend? At least I think that is what happened, as varnish reported 404 for URLs that definitely exist; the dead backend seems to be the only logical explanation why varnish could think they don't.

Oh, and is there a way to put the local hostname in a header? I have two proxies, load-balanced by LVS, so using server.ip reports the same IP on both nodes.

Thanks, Sascha
Re: cache empties itself?
On Friday, 4 April 2008 10:11:52, Stig Sandbeck Mathisen wrote:
> On Fri, 4 Apr 2008 09:01:57 +0200, Sascha Ottolski [EMAIL PROTECTED] said:
> > I definitely did nothing like this; I've observed restarts out of the blue. I'm now giving the trunk a try, hopefully there's an improvement in that matter.
>
> If the varnish caching process dies for some reason, the parent varnish process will start a new one to keep the service running. This new one will not re-use the cache of the previous one.
>
> With all that said, the varnish caching process should not die in this way; that is undesirable behaviour. If you'd like to help debug this issue, take a look at http://varnish.projects.linpro.no/wiki/DebuggingVarnish

I already started my proxies with the latest trunk and coredumps enabled, and am crossing my fingers. So far it's been running for about 11 hours... BTW, if I have 32 GB of RAM and a 517 GB cache file, how large will the core dump be?

> Note that if you run a released version, your issue may have been fixed already in a later release, the related branch, or in trunk.
>
> > what I did once in a while is a vcl.load and vcl.use. will this force a restart of the child, thus flushing the cache?
>
> No. The reason Varnish has vcl.load and vcl.use is to make sure you don't have to restart anything, thus losing your cached data.

Excellent. The numbers shown next to the configs in vcl.list, are they the numbers of connections that (still) use them?

Thanks again, Sascha
unable to compile nagios module from trunk
After checking out and running autogen.sh, configure stops with this error:

  ./configure: line 19308: syntax error near unexpected token `VARNISHAPI,'
  ./configure: line 19308: `PKG_CHECK_MODULES(VARNISHAPI, varnishapi)'

Cheers, Sascha
Re: cache empties itself?
On Friday, 4 April 2008 18:11:23, Michael S. Fischer wrote:
> On Fri, Apr 4, 2008 at 3:20 AM, Sascha Ottolski [EMAIL PROTECTED] wrote:
> > you are right, _if_ the working set is small. in my case, we're talking 20+ million small images (5-50 KB each), 400+ GB in total size, and it's growing every day. access is very random, but there still is a good amount of hot objects. and to be ready for a larger set it cannot reside on the webserver, but lives on a central storage. access performance to the (network) storage is relatively slow, and our experiences with mod_cache from apache were bad; that's why I started testing varnish.
>
> Ah, I see. The problem is that you're basically trying to compensate for a congenital defect in your design: the network storage (I assume NFS) backend. NFS read requests are not cacheable by the kernel because another client may have altered the file since the last read took place.
>
> If your working set is as large as you say it is, eventually you will end up with a low cache hit ratio on your Varnish server(s) and you'll be back to square one again. The way to fix this problem in the long term is to split your file library into shards and put them on local storage. Didn't we discuss this a couple of weeks ago?

Exactly :-) What can I say, I did analyze the logfiles, and learned that despite the fact that a lot of the accesses are truly random, there is still a good amount of the requests concentrated on a smaller set of the images. Of course, the set changes over time, but that's exactly what a cache handles perfectly.

And my experience seems to prove the theory: varnish has now kept running like this for about 18 hours *knock on wood*, and the cache hit rate is close to 80 %! And that takes so much pressure off the backend that the overall performance is just awesome.

Putting the files on local storage just doesn't scale well. I'm more thinking about splitting the proxies like discussed on the list before: a loadbalancer could distribute the URLs in a way that each cache holds its own share of the objects.

Cheers, Sascha
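The URL-distribution idea can be sketched in a few lines (Python; the node names are invented for illustration): hash the URL and take it modulo the number of proxies, so the same URL always lands on the same cache and the caches' contents don't overlap:

```python
import hashlib

def pick_cache(url, nodes):
    """Deterministically map a URL to one cache node, so each proxy
    holds (and heats up) only its own share of the objects."""
    h = int(hashlib.md5(url.encode("utf-8")).hexdigest(), 16)
    return nodes[h % len(nodes)]

nodes = ["img-proxy1", "img-proxy2", "img-proxy3"]
node = pick_cache("/images/12345.jpg", nodes)
```

One caveat of plain modulo hashing: adding or removing a node remaps almost every URL, emptying the caches in effect; consistent hashing limits that churn if the pool is expected to change.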
Re: cache empties itself?
On Thursday, 3 April 2008 18:07:53, DHF wrote:
> Sascha Ottolski wrote:
> > how can this be? My varnish runs for about 36 hours now. yesterday evening, the resident memory size was like 10 GB, which is still way below the available 32. later that evening, I stopped letting requests to the proxy over night. now I came back, let the requests back in, and am wondering that I see a low cache hit rate. looking a bit closer, it appears as if the cache got smaller over night; now the process consumes less than 1 GB of resident memory, which fits the reported bytes allocated in the stats.
> >
> > can I somehow find out why my cached objects were expired? I have a varnishlog -w running all the time, so the information might be there. but, what to look for, and even more important, how can I prevent that expiration? I started the daemon with -p default_ttl=31104000 to make it cache very aggressively...
>
> There could be a lot of factors. Is apache setting a max-age on the items? As it says in the man page:
>
>   default_ttl   The default time-to-live assigned to objects if neither the backend nor the configuration assign one. Note that changes to this parameter are not applied retroactively.
>
> Is this running on a test machine in a lab where you can control the requests this box gets? If so, you should run some tests to make sure that you really are caching objects. Run wireshark on the apache server listening on port 80, and using curl send two requests for the same object, and make sure that only one request hits the apache box. If that's working like you expect, and the Age header is incrementing, then you need to run some tests using a typical workload that your apache server expects to see. Are you setting cookies on this site?
>
> I think what is happening is that you are setting a max-age on objects from apache (which you can verify using curl, netcat, telnet, whatever you like), and varnish is honoring that setting and expiring items as instructed. I'm not awesome with varnishtop and varnishlog yet, so I'm probably not the one to ask about getting those to show you an object's attributes; anyone care to assist on that front?
>
> --Dave

Dave, thanks a lot. I may have confused it with varnishd -t, which doesn't seem to be the same as -p default_ttl? Hmm, but then again, the manual says it's a shortcut, yet the semantics sound different than the above:

  -t ttl   Specifies a hard minimum time to live for cached documents. This is a shortcut for specifying the default_ttl run-time parameter.

Cheers, Sascha
Re: cache empties itself?
On Thursday, 3 April 2008 18:07:53, DHF wrote:
> > how can this be? My varnish runs for about 36 hours now. yesterday evening, the resident memory size was like 10 GB, which is still way below the available 32. later that evening, I stopped letting requests to the proxy over night. now I came back, let the requests back in, and am wondering that I see a low cache hit rate. looking a bit closer, it appears as if the cache got smaller over night; now the process consumes less than 1 GB of resident memory, which fits the reported bytes allocated in the stats.

Now I just had to learn the hard way...

1. If I stop and start the child via the management port, the cache is empty afterwards. I did this because I couldn't manage to change the config; at least to me it looked that way:

  vcl.list
  200 52
  0 boot
  44085 default
  *7735 default2

  vcl.discard default
  106 37
  No configuration named default known.
  200 0

  vcl.list
  200 52
  0 boot
  44085 default
  *7750 default2

  vcl.show default
  106 37
  No configuration named default known.

2. But even worse, I've just seen that a new child was born somehow, which also emptied the cache. Is this some regular behaviour, or was there a crash?

All this with 1.1.2. It's vital to my setup to cache as many objects as possible, for a long time, and that they really stay in the cache. Is there anything I could do to prevent the cache being emptied? Maybe I've been bitten by a bug and should give the trunk a shot?

BTW, after starting to play around with varnish, I'm really impressed. It's a bit frustrating sometimes to understand everything, but the outcome is very impressive. Thanks for such a nice piece of software!

Thanks a lot, Sascha
Re: cache empties itself?
On Thursday, 3 April 2008 19:30:25, Michael S. Fischer wrote:
> On Thu, Apr 3, 2008 at 10:26 AM, Sascha Ottolski [EMAIL PROTECTED] wrote:
> > All this with 1.1.2. It's vital to my setup to cache as many objects as possible, for a long time, and that they really stay in the cache. Is there anything I could do to prevent the cache being emptied? Maybe I've been bitten by a bug and should give the trunk a shot?
>
> Just set the Expires: headers on the origin (backend) server responses to now + 10 years or something.
>
> --Michael

Thanks for the hint; unfortunately, it's not quite what I want. I want varnish to keep the objects very long, so that it does not have to ask the backend too often. Therefore, it's important that the cache keeps growing, instead of vanishing once in a while. And I don't want upstream caches or browsers to cache that long, only varnish, so setting headers doesn't seem to fit.

Cheers, Sascha
Re: the most basic config
On Tuesday 01 April 2008 21:15:11, Sascha Ottolski wrote: Thanks very much, this was very helpful. Now, could anyone give me a hint how to interpret the output of varnishstat? I'm seeing

    client_req      15054   123.39  Client requests received
    cache_hit        1632    13.38  Cache hits
    cache_hitpass       0     0.00  Cache hits for pass
    cache_miss       1024     8.39  Cache misses
    backend_req     12347   101.20  Backend requests made

OK, I'm getting a bit further, without claiming that I understand it fully. I just took the example from the wiki:

    backend default {
        set backend.host = "192.168.1.2";
        set backend.port = "http";
    }

    sub vcl_recv {
        if (req.request == "GET" && req.url ~ "\.(jpg|jpeg|gif|png)$") {
            lookup;
        }
        if (req.request == "HEAD" && req.url ~ "\.(jpg|jpeg|gif|png)$") {
            lookup;
        }
    }

and now I'm seeing as many misses as backend requests, so I guess it's really caching most of what I want. Now, could someone help me interpret the hitrate ratio and avg?

    Hitrate ratio:     10    100    360
    Hitrate avg:   0.3366 0.3837 0.4636

Thanks, Sascha (very impressed so far :-))
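The two Hitrate lines in varnishstat belong together: the "ratio" line gives the window lengths in seconds (here 10, 100, and 360), and the "avg" line gives the hit rate averaged over each of those windows. From the cumulative counters the same kind of figure can be derived by hand:

```shell
#!/bin/sh
# Hit ratio as hits / (hits + misses).
# Counter values are copied from the varnishstat output quoted above;
# the windowed "Hitrate avg" figures can differ from this cumulative one,
# since they only cover the last 10/100/360 seconds.
hits=1632
misses=1024
awk -v h="$hits" -v m="$misses" \
    'BEGIN { printf "cumulative hit ratio: %.4f\n", h / (h + m) }'
```
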
the most basic config
Hi, I'm a bit puzzled by the examples and the explanation of the default VCL config presented in the man page. If I want to make my first steps with a reverse proxy for static images only, one that basically caches everything indefinitely (as long as cache space is available), what would be the minimum config I need? Of course I need to define a backend, and maybe increase the TTL for objects. A pointer to some kind of beginners guide would be nice, if such a thing exists. Thanks a lot, Sascha
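For what it's worth, a minimal config for an image-only accelerator might look something like this. A hedged sketch in the 1.x-era VCL dialect used elsewhere in this thread; the backend address and the TTL are placeholders:

```
backend default {
    set backend.host = "192.168.1.2";
    set backend.port = "http";
}

sub vcl_recv {
    if (req.request == "GET" && req.url ~ "\.(jpg|jpeg|gif|png)$") {
        lookup;
    }
}

sub vcl_fetch {
    # keep objects for a long time; adjust to taste
    set obj.ttl = 365d;
}
```

Requests not matched in vcl_recv fall through to the default handling, so anything that is not an image GET is simply not served from cache.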
Re: the most basic config
On Tuesday 01 April 2008 17:42:17, DHF wrote: Sascha Ottolski wrote: Hi, I'm a bit puzzled by the examples and the explanation of the default VCL config presented in the man page. If I want to make my first steps with a reverse proxy for static images only, one that basically caches everything indefinitely (as long as cache space is available), what would be the minimum config I need? Of course I need to define a backend, and maybe increase the TTL for objects. A pointer to some kind of beginners guide would be nice, if such a thing exists. I don't think there really is a step-by-step beginners guide to varnish, though one nice thing is that out of the box it works with very few changes. If you used the rpm that is available on sourceforge, all you need is the following in /etc/sysconfig/varnish:

    DAEMON_OPTS="-a :80 \
                 -T localhost:82 \
                 -b localhost:81 \
                 -u varnish -g varnish \
                 -s file,/var/lib/varnish/varnish_storage.bin,1G"

I have this on a test machine in my lab currently, and it is happily serving out cached bits at an unbelievable rate. Apache is running on localhost:81 and sets the desired age for objects; the storage file size above is the default. This should get you up and running, and you can start tuning from there. One thing that seems daunting at first is that varnish is extremely flexible; because of that, the sparse documentation is sprinkled with examples of neat configuration tricks and snippets of wizardry, which makes it seem as if these things are necessary to make varnish work. This is not the case: the default settings work quite well. --Dave

Thanks very much, this was very helpful. Now, could anyone give me a hint how to interpret the output of varnishstat?
I'm seeing

    client_req      15054   123.39  Client requests received
    cache_hit        1632    13.38  Cache hits
    cache_hitpass       0     0.00  Cache hits for pass
    cache_miss       1024     8.39  Cache misses
    backend_req     12347   101.20  Backend requests made

I'm wondering why there is such a big difference between cache_miss and client_req; I would expect both to be similar, shouldn't they? As I said, my setup is very simple, only static images to be proxied. I started varnish (1.1.2) with

    # cat run_varnish.sh
    ulimit -n 131072
    varnishd \
        -u sfapp \
        -g sfapp \
        -a :80 \
        -T :81 \
        -b 192.168.1.11 \
        -p thread_pool_max=2000 \
        -p default_ttl=31104000 \
        -h classic,259 \
        -s file,/var/cache/varnish/store.bin

BTW, in the wiki, under Performance, there is a hint to increase lru_interval, but apparently this parameter isn't known to varnishd any more. Thanks, Sascha
Re: Miscellaneous questions
On Tuesday 18 March 2008 00:07:59, Michael S. Fischer wrote: On Mon, Mar 17, 2008 at 3:32 PM, DHF [EMAIL PROTECTED] wrote: This is called CARP, the Cache Array Routing Protocol, in squid land. Here's a link to some info on it: http://docs.huihoo.com/gnu_linux/squid/html/x2398.html It works quite well for reducing the number of globally duplicated objects in a multilayer accelerator setup, as you can add machines in the interstitial space between the frontline caches and the origin as a cheap and easy way to increase the overall RAM available to hot objects, without having to use a front-end load balancer like perlbal, BIG-IP, or whatever to direct individual clients to specific frontlines (though you usually still have a load balancer for fault tolerance). In squid there are some bugs with their implementation, though... Thanks for the reminder. I'll file RFEs for both the static and CARP implementations. I presume the static configuration will be done first (if at all), as it's probably significantly easier to implement.

Probably not exactly the same, but maybe someone finds it useful: I've just started to dive a bit into HAProxy (http://haproxy.1wt.eu/). The development version has the ability to balance on a hash of the URI to decide which backend should receive a request. I guess this could be a nice companion to put in front of several reverse proxies to increase the hit rate of each one. Cheers, Sascha
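The HAProxy feature mentioned is its "balance uri" algorithm, which hashes the request URI so that each URL consistently maps to the same backend cache. A minimal sketch of such a backend section (server names and addresses are placeholders):

```
backend image_caches
    balance uri          # hash the URI: one URL -> one cache
    server varnish1 192.168.1.21:80 check
    server varnish2 192.168.1.22:80 check
```

Because every cache then only ever sees its own slice of the URL space, the working set per cache shrinks and the hit rate of each one should rise.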
production ready devel snapshot?
Hi, probably a stupid question: I'd like to use more recent features like the load balancer, but the latest official release is a bit dated. Is there anything like a snapshot release that is worth giving a try, especially since my configuration will hopefully stay simple for a while? Thanks, Sascha
Re: how to...accelerate random access to millions of images?
On Sunday 16 March 2008 15:54:42, you wrote: Sascha Ottolski [EMAIL PROTECTED] writes: Now my question is: what kind of hardware would I need? Lots of RAM seems to be obvious, whatever a lot may be... What about the disk subsystem? Should I look into something like RAID-0 with many disks to push the IO performance? First things first: instead of a few large disks, you want lots of small fast ones - 36 GB or 72 GB 10,000 RPM disks - to maximize bandwidth. There are two ways you might configure your storage: one is to place all the disks in a RAID-0 array and use a single file system and storage file on top of that. The alternative is to have a separate file system and storage file on each disk. I honestly don't know which will be faster; if you have a chance to run some benchmarks, I'd love to see your numbers and your conclusion.

Dag, thanks for your hints. Just curious: how would I tell the varnish process that I have several filesystems to put the cache on? I had the feeling that you give exactly one directory as an option. Cheers, Sascha

Note that even if a single RAID-0 array turns out to be the fastest option, you may have to compromise and split your disks into two arrays, unless you find a RAID controller that can handle the number of disks you need. DES
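On the several-filesystems question: later varnish releases accept more than one -s argument, one per storage file, so a per-disk layout could be expressed roughly like this (a sketch only; paths and sizes are placeholders, and whether the 1.x release discussed here already supports repeating -s is worth verifying):

```
varnishd -a :80 -T :81 -b 192.168.1.11 \
    -s file,/disk1/varnish/store1.bin,30G \
    -s file,/disk2/varnish/store2.bin,30G
```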
Re: how to...accelerate random access to millions of images?
Michael, thanks a lot for taking the time to give me such a detailed answer; please see my replies below. On Sunday 16 March 2008 18:00:42, Michael S. Fischer wrote: On Fri, Mar 14, 2008 at 1:37 PM, Sascha Ottolski [EMAIL PROTECTED] wrote: The challenge is to serve 20+ million image files, I guess with up to 1500 req/sec at peak. A modern disk drive can service 100 random IOPS (@ 10 ms/seek, that's reasonable). Without any caching, you'd need 15 disks to service your peak load, with a bit over 10 ms I/O latency (seek + read). The files tend to be small, most of them in a range of 5-50 KB. Currently the image store is about 400 GB in size (and growing every day). The access pattern is very random, so it will be very unlikely that any size of RAM will be big enough... Are you saying that the hit ratio is likely to be zero? If so, consider whether you want caching turned on in the first place. There's little sense buying extra RAM if it's useless to you.

Well, so far I have analyzed the webserver logs of one week, and they indicate that there would indeed be at least some cache hits. We have about 20 million images on our storage, and in one week about 3.5 million images were repeatedly requested. To be more precise:

    272,517,167 requests were made to a total of 7,489,059 different URLs;
    3,226,150 URLs were requested at least 10 times, accounting for 257,306,351 repeated requests.

So, if my analysis isn't too lousy, I guess there is quite an opportunity for a cache to help. Roughly, the current 20 million images use 400 GB of storage, so 3.5 million images may account for 17.5% of 400 GB, about 70 GB. 70 GB of RAM is still a lot, but a mix of enough RAM and fast disks may be the way to go; maybe in addition to content-based load balancing to several caches (say, one for thumbnails, one for larger images). Currently, at peak times we only serve about 350 images/sec, due to the bottleneck of the storage backend.
So the target of 1500 req/sec may be a bit of wishful thinking, as I don't know what the real peak would look like without the bottleneck; it may very well be more like 500-1000 req/sec, but of course I'd like to leave room for growth :-) Thanks a lot, Sascha

Now my question is: what kind of hardware would I need? Lots of RAM seems to be obvious, whatever a lot may be... What about the disk subsystem? Should I look into something like RAID-0 with many disks to push the IO performance? You didn't say what your failure-tolerance requirements were. Do you care if you lose data? Do you care if you're unable to serve some requests while a machine is down?

Well, it's a cache, after all. The real image store is in place, highly available, backed up and all the like. But the webservers can't get the images off the storage fast enough. We just enabled apache's mod_cache, which seems to help a bit, but I suspect a dedicated tool like varnish could perform better (plus, with apache you don't get any runtime information to learn how efficient the cache is).

Consider dividing up your image store onto multiple machines. Not only would you get better performance, but you would be able to survive hardware failures with fewer catastrophic effects (i.e., you'd lose only 1/n of service). If I were designing such a service, my choices would be:

    (1) 4 machines, each with 4-disk RAID 0 (fast, but dangerous)
    (2) 4 machines, each with 5-disk RAID 5 (safe, fast reads, but slow writes for your file size; RAID 5 should also be battery-backed, which adds cost)
    (3) 4 machines, each with 4-disk RAID 10 (will meet workload requirement, but won't handle peak load in degraded mode)
    (4) 5 machines, each with 4-disk RAID 10
    (5) 9 machines, each with 2-disk RAID 0

Multiply each of these machine counts by 2 if you want to be resilient to failures other than disk failures.
You can then put a Varnish proxy layer in front of your image storage servers, and direct incoming requests to the appropriate backend server. --Michael
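Pulling the numbers from this thread together, the sizing arithmetic looks like this (back-of-the-envelope only; the 100 IOPS per disk figure is Michael's estimate above, and the hot-set calculation assumes hot images have average size):

```shell
#!/bin/sh
# Disks needed to serve the peak load from disk alone,
# at ~100 random IOPS per 10k RPM drive.
peak_req=1500
iops_per_disk=100
echo "disks for uncached peak: $((peak_req / iops_per_disk))"

# Hot-set size: ~3.5M of 20M images, store is ~400 GB.
awk 'BEGIN { printf "approx. hot set: %.0f GB\n", 3.5 / 20 * 400 }'
```

Which reproduces the 15 disks and ~70 GB figures discussed above.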
how to...accelerate random access to millions of images?
Hi, I'm relatively new to varnish (I've had an eye on it since it appeared in public, but so far never really used it). Now the time may have come to give it a whirl, and I'm wondering if someone could give me a little advice to get me going. The challenge is to serve 20+ million image files, I guess with up to 1500 req/sec at peak. The files tend to be small, most of them in a range of 5-50 KB. Currently the image store is about 400 GB in size (and growing every day). The access pattern is very random, so it will be very unlikely that any size of RAM will be big enough... Now my question is: what kind of hardware would I need? Lots of RAM seems to be obvious, whatever a lot may be... What about the disk subsystem? Should I look into something like RAID-0 with many disks to push the IO performance? Thanks in advance, Sascha