Range Requests
Does Varnish support range requests? I want each range request against a large file to be treated as a unique web object in Varnish, and I'm wondering if that's possible. Are the headers used in the hashing? Or, a better question: can the Range request header be included in the hash? Would that work, or am I missing some pieces that would wreck the plan? Thanks, --DHF ___ varnish-misc mailing list varnish-misc@projects.linpro.no http://projects.linpro.no/mailman/listinfo/varnish-misc
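[A rough sketch of the idea, assuming a Varnish version that exposes vcl_hash; the syntax below follows the VCL of that era and has not been verified against any particular release. The backend must also honour Range and answer with 206 for this to make sense.]

```vcl
sub vcl_hash {
    # The default hash inputs: URL and Host.
    set req.hash += req.url;
    set req.hash += req.http.host;
    # Fold the Range header into the hash so each requested
    # byte range is stored as a distinct cache object.
    if (req.http.Range) {
        set req.hash += req.http.Range;
    }
    hash;
}
```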
Re: Security doubt about Varnish and firewall.
andan andan wrote: We have a security doubt: should we install Varnish inside or outside the firewall?

I run varnish on many Linux boxes with Netfilter default log-and-drop rules and have not seen a performance problem.

For better performance, we consider that the best choice is outside, but for obvious security reasons, the better choice is putting it into a DMZ.

This depends on your particular environment. What kind of hardware are you using? What kind of firewall is it? How much traffic can the firewall handle? How much traffic do you usually see to the backend server? Where is the backend server located? What is your reason for using a reverse proxy? What is the expected hit ratio on the cache? What kind of content are you delivering? Do you have any network operations tasks that require you to collect data from the server in a fashion that requires it to be behind the firewall?

If the backend server is behind the firewall, it could be beneficial to have your varnish box outside the firewall; you could then restrict access to the backend server to only the varnish server's IP, or to an internal IP on a separate network, and run iptables or ipfw on the varnish server itself.

Any suggestions? Does anybody run Varnish outside the firewall?

I have found no reason not to use ipfw or iptables on deployed servers; in my opinion the benefit outweighs the performance loss. With a minimal ruleset the performance impact is so small it's hard to measure until you reach huge packets-per-second or connections-per-second rates (assuming your hardware isn't a few years away from collecting a pension). I have never seen a production box reach the limits of iptables packets per second, because whatever process is on the box (apache, varnish, squid, mysql, etc.) will long since have melted down into a pile of smoldering ruin under the load, at which point iptables performance becomes irrelevant.
--Dave
Re: Varnish crashing when system starts to swap
Calle Korjus wrote: This is our startup command:

/opt/varnish/sbin/varnishd -a :80 -p lru_interval 3600 -f /opt/varnish/conf/default.vcl -T 127.0.0.1:6082 -t 3600 -w 128,1000,60 -u varnish -g varnish -s file,/srv/varnish/varnish_storage.bin,30G -P /var/run/varnish.pid

Varnish looks fine until it has handled about 1.5 million requests; then we can see kswapd0 and kswapd1 start working, the load average rises to about 200, and the machine becomes totally unresponsive. Top shows a lot of CPU being spent on I/O waits, and the varnish child process sometimes restarts. In the best case the process restarts and the server starts behaving within 5 minutes, but sometimes varnish dies completely. One thing we have noticed is that the reserved memory for varnish keeps rising, and when it crashes it is usually around 14G.

I would try lowering the storage file size to fit within your total system RAM, subtracting some memory for buffers, cache, and apache, and see if it still spirals into swap hell. You could also try setting rlimits for the varnish user, though I don't know whether settings in /etc/security/limits.conf apply to privilege-dropped processes.

The varnish storage file is on the same physical disk as the system and the swap; could that be the problem? Should varnish really allocate so much memory that the system starts to swap to disk?

I think what is happening is that your hit ratio is low and your storage size is quite large, so varnish has enough objects marked as hot that it's trying to hold them all in memory. I don't know for sure; I could be way off. I think that if you restrict the storage size there will be increased disk activity as you churn the cache, but you won't be churning swap space as well, and you shouldn't exhaust the virtual memory of the system. I'd have to test that, though. --Dave
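[As a rough illustration of the rlimits idea: an /etc/security/limits.conf entry capping the varnish user's address space might look like the fragment below. The 20 GB figure is an arbitrary example, and, as noted above, it is an open question whether pam_limits is consulted at all for a privilege-dropped daemon.]

```
# /etc/security/limits.conf
# <domain>  <type>  <item>  <value>
# "as" limits the address space; the value is in kilobytes
# (20971520 KB = 20 GB).
varnish     hard    as      20971520
```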
Re: regsub, string concatenation?
Jon Drukman wrote: i'm trying to rewrite all incoming URLs to include the HTTP Host header as part of the destination URL. example:

incoming: http://site1.com/someurl
rewritten: http://originserver.com/site/site1.com/someurl

incoming: http://site2.com/otherurl
rewritten: http://originserver.com/site/site2.com/otherurl

the originserver parses the original hostname out of the requested URL. it works great with one hardcoded host:

set req.url = regsub(req.url, "^", "/site/site1.com");

i can't get it to use the submitted HTTP host though...

set req.url = regsub(req.url, "^", "/site/" + req.http.host);

varnish complains about the plus sign. is there some way to do this kind of string concatenation in the replacement?

Try this:

set req.url = "/site/" req.http.host req.url;

Set will allow you to concatenate strings, but I'm not sure the regsub will. I think this will provide what you are looking for; let me know. --Dave
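[Put in context, the suggestion amounts to something like the vcl_recv sketch below. This assumes VCL of that era, where quoted strings and expressions concatenate by juxtaposition in a set statement; the "/site/" prefix comes from the example above, and since req.url already begins with "/", no extra separator is needed.]

```vcl
sub vcl_recv {
    # Prefix the URL with the original Host header so the
    # origin server can recover which site was requested,
    # e.g. /someurl -> /site/site1.com/someurl
    set req.url = "/site/" req.http.host req.url;
}
```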
Re: cache empties itself?
Ricardo Newbery wrote: On Apr 7, 2008, at 10:30 PM, DHF wrote:

Ricardo Newbery wrote: On Apr 7, 2008, at 5:22 PM, Michael S. Fischer wrote: Sure, but this is also the sort of content that can be cached back upstream using ordinary HTTP headers.

No, it cannot. Again, the use case is dynamically generated content that is subject to change at unpredictable intervals but which is otherwise fairly static for some length of time, and where serving stale content after a change is unacceptable. Ordinary HTTP headers just don't solve that use case without unnecessary loading of the backend.

Isn't this what If-Modified-Since requests are for? A 304 Not Modified is a pretty small request/response, though I can understand the tendency to want to push it out to the frontend caches. I would think the management overhead of maintaining two separate expirations wouldn't be worth the extra hassle just to save yourself some IMS requests to a backend. Unless, of course, varnish doesn't support IMS requests in a usable way; I haven't actually tested it myself.

Unless things have changed recently, Varnish support for IMS is mixed. Varnish supports IMS for cache hits but not for cache misses, unless you tweak the VCL to pass them in vcl_miss. Varnish will not generate an IMS request to revalidate its own cache.

Good to know.

Also, it is not necessarily true that generating a 304 response is always light impact. I'm not sure about the Drupal case, but at least for Plone there can be a significant performance hit even when just calculating the Last-Modified date. The hit is usually lighter than that required for generating the full response, but for high-traffic sites it's still a significant consideration. The most significant issue, though, is that IMS doesn't help in the slightest to lighten the load of *new* requests to your backend.
IMS requests are only helpful if you already have the content in your own browser cache, or in an intermediate proxy cache server (for proxies that support IMS to revalidate their own cache).

The intermediate proxy was the case I was thinking about, but you are correct: if there is no intermediate proxy, and the varnish frontends don't revalidate with IMS requests, then the whole plan is screwed.

Regarding the potential management overhead: this is not relevant to the question of whether this strategy would increase your site's performance. Management overhead is a separate question, and not an easy one to answer in the general case. The overhead might be a problem for some, but I know that in my own case the overhead required to manage this sort of thing is actually pretty trivial.

How do you manage the split TTLs? Do you send a purge after a page has changed, or have you crafted another way to force a revalidation of cached objects? --Dave

Ric
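[The vcl_miss tweak mentioned in this thread, passing conditional requests through to the backend, could look roughly like this. This is an unverified sketch in the VCL style of that era, not a confirmed recipe for any specific Varnish release.]

```vcl
sub vcl_miss {
    # On a cache miss, let conditional requests reach the
    # backend unmodified so the origin can answer 304 itself.
    if (req.http.If-Modified-Since) {
        pass;
    }
    fetch;
}
```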
Re: cache empties itself?
Ricardo Newbery wrote: On Apr 7, 2008, at 5:22 PM, Michael S. Fischer wrote: Sure, but this is also the sort of content that can be cached back upstream using ordinary HTTP headers.

No, it cannot. Again, the use case is dynamically generated content that is subject to change at unpredictable intervals but which is otherwise fairly static for some length of time, and where serving stale content after a change is unacceptable. Ordinary HTTP headers just don't solve that use case without unnecessary loading of the backend.

Isn't this what If-Modified-Since requests are for? A 304 Not Modified is a pretty small request/response, though I can understand the tendency to want to push it out to the frontend caches. I would think the management overhead of maintaining two separate expirations wouldn't be worth the extra hassle just to save yourself some IMS requests to a backend. Unless, of course, varnish doesn't support IMS requests in a usable way; I haven't actually tested it myself. --Dave
Re: cache empties itself?
Sascha Ottolski wrote: On Friday, 04 April 2008, 18:11:23, Michael S. Fischer wrote:

Ah, I see. The problem is that you're basically trying to compensate for a congenital defect in your design: the network storage (I assume NFS) backend. NFS read requests are not cacheable by the kernel, because another client may have altered the file since the last read took place. If your working set is as large as you say it is, you will eventually end up with a low cache hit ratio on your Varnish server(s) and you'll be back to square one again. The way to fix this problem in the long term is to split your file library into shards and put them on local storage. Didn't we discuss this a couple of weeks ago?

exactly :-) what can I say: I did analyze the logfiles and learned that despite the fact that a lot of the accesses are truly random, a good amount of the requests is still concentrated on a smaller set of the images. of course, the set changes over time, but that's exactly what a cache can handle perfectly. and my experience seems to prove my theory: if varnish keeps running like it is now for about 18 hours *knock on wood*, the cache hit rate is close to 80%! and that takes so much pressure off the backend that the overall performance is just awesome. putting the files on local storage just doesn't scale well. I'm more inclined to split the proxies as discussed on the list before: a load balancer could distribute the URLs in a way that each cache holds its own share of the objects.

By putting intermediate caches between the file storage and the client, you are essentially just spreading the storage locally between cache boxes, so if this method doesn't scale then you are still in need of a design change, and frankly so am I :) What you need to model is the popularity curve for your content. If your images do not fit an 80/20 rule of popularity, i.e. if 20% of your images soak up less than 80% of requests, then you will spend more time thrashing the caches than serving the content, and Michael is right: you would be better served to dedicate web servers with local storage and shard your images across them. If 80% of your content is rarely viewed, then, using the same amount of hardware as caching accelerators, you will see an increase in throughput due to more hardware serving a smaller number of images. It all depends on your content and your users' viewing habits. --Dave
Re: the most basic config
Sascha Ottolski wrote: now, could someone help me interpret the hitrate ratio and avg?

Hitrate ratio: 10 100 360
Hitrate avg: 0.3366 0.3837 0.4636

The hit rate is the number of hits divided by the number of requests. Hits are requests for objects that are in the cache; misses are requests that have to go to the backend, and the more misses, the lower your hitrate average. The "Hitrate ratio" line, I believe, shows the time windows, in seconds, over which the three averages below it are computed; so here the hit rate was about 34% over the last 10 seconds and about 46% over the last 360. The lower your hitrate average, the lower your performance. --Dave
Re: cache empties itself?
Sascha Ottolski wrote: how can this be? My varnish has been running for about 36 hours now. yesterday evening the resident memory size was around 10 GB, which is still way below the available 32. later that evening I stopped letting requests through to the proxy over night. now I have come back, let the requests back in, and am wondering why I see such a low cache hit rate. looking a bit closer, it appears as if the cache got smaller over night; now the process consumes less than 1 GB of resident memory, which fits the reported bytes allocated in the stats. can I somehow find out why my cached objects were expired? I have a varnishlog -w running all the time, so the information might be there. but what should I look for, and, even more importantly, how can I prevent that expiration? I started the daemon with -p default_ttl=31104000 to make it cache very aggressively...

There could be a lot of factors. Is apache setting a max-age on the items? As it says in the man page:

default_ttl
    The default time-to-live assigned to objects if neither the backend nor the configuration assign one. Note that changes to this parameter are not applied retroactively.

Is this running on a test machine in a lab where you can control the requests this box gets? If so, you should run some tests to make sure that you really are caching objects. Run wireshark on the apache server listening on port 80, send two requests for the same object using curl, and make sure that only one request hits the apache box. If that's working like you expect, and the Age header is incrementing, then you should run some tests using a typical workload that your apache server expects to see. Are you setting cookies on this site? I think what is happening is that you are setting a max-age on objects from apache (which you can verify using curl, netcat, telnet, whatever you like), and varnish is honoring that setting and expiring items as instructed.
I'm not awesome with varnishtop and varnishlog yet, so I'm probably not the one to ask about getting those to show you an object's attributes; anyone care to assist on that front? --Dave
Re: cache empties itself?
Michael S. Fischer wrote: On Thu, Apr 3, 2008 at 10:26 AM, Sascha Ottolski [EMAIL PROTECTED] wrote: All this with 1.1.2. It's vital to my setup to cache as many objects as possible, for a long time, and that they really stay in the cache. Is there anything I could do to prevent the cache being emptied? Maybe I've been bitten by a bug and should give the trunk a shot?

Just set the Expires: headers on the origin (backend) server responses to now + 10 years or something.

If you're not using PHP or some other CGI app, you can set headers using mod_headers in apache; if you are running a web app, just set the headers within the app itself. You can also explicitly set the TTL on objects in the cache using VCL code, but moving the work off the cache to the backend makes more sense, since you'll be cutting traffic down to apache anyway, which frees up cycles to modify headers. If you have your heart set on making varnish do the work, you could add something like this:

sub vcl_fetch {
    if (!obj.valid) {
        error;
    }
    if (!obj.cacheable) {
        pass;
    }
    if (obj.http.Set-Cookie) {
        pass;
    }
    /* Cache images for about a year (31449600 seconds). */
    if (req.url ~ "\.(jpg|jpeg|gif|png)$") {
        set obj.ttl = 31449600s;
    }
    insert;
}

But I would first look at getting apache to set the age correctly, and leave varnish to do what it's good at. --Dave
Re: cache empties itself?
Sascha Ottolski wrote: however, my main problem currently is that the varnish children keep restarting, and that this empties the cache, which effectively renders the whole setup useless for me :-( if the cache has filled up, it works great; if it restarts empty, it obviously doesn't. is there anything I can do to prevent such restarts?

Varnish doesn't just restart on its own. Check to make sure you aren't sending a kill signal if you are running logrotate through a cronjob. I'm not sure whether a HUP will empty the cache or not. --Dave