Re: Varnish use for purely binary files
2010/1/19 pub crawler:
> Wanted to inject another heady discussion item into this thread and
> see if the idea is confirmed in other folks' current architectures.
> Sorry in advance for being verbose.
>
> Often web servers (in my experience) are smaller servers, with less RAM and
> fewer CPUs than the app servers and databases. A typical webserver
> might be a 2GB or 4GB machine with a dual CPU. But the disk storage
> on any given webserver will far exceed the RAM in the machine. This
> means disk IO even when attempting to cache as much as possible on a
> webserver, due to the limited RAM.
>
> In this "normal" web server size model, simply plugging a bigger-RAM
> Varnish in upstream means less disk IO, faster web servers, less
> memory consumption managing threads, etc. This is the well-proven basic
> Varnish adopter model.
>
> Here's a concept that is not specific to the type of data being stored
> in Varnish:
>
> With some additional hashing in the mix, you could limit your large
> Varnish cache server to the very highly, repetitively accessed items and
> use the hash to go to the backend webservers, where ideally you hit a
> smaller Varnish instance with the item cached on the 2-4GB webserver
> downriver, and have it talk to the webserver directly on localhost if
> it didn't have the data.

Given that you've already taken care of the common requests upstream, you
are unlikely to see much benefit from any form of caching - performance
will be determined by disk seek time.

I suspect you would see much more of a benefit in moving to SSDs for
storage. Even cheap MLC SSDs like Intel's X25-M will give great read
performance.

Laurence
Re: Varnish use for purely binary files
On Jan 19, 2010, at 12:46 AM, Poul-Henning Kamp wrote:

> In message , "Michael S. Fischer" writes:
>
>> Does Varnish already try to utilize CPU caches efficiently by employing
>> some sort of LIFO thread reuse policy or by pinning thread pools to
>> specific CPUs? If not, there might be some opportunity for optimization
>> there.
>
> You should really read the varnish_perf.pdf slides I linked to yesterday...

They appear to only briefly mention the LIFO issue (in one bullet point
toward the end), and do not discuss the CPU affinity issue.

--Michael
Re: Varnish use for purely binary files
In message , "Michael S. Fis cher" writes: >Does Varnish already try to utilize CPU caches efficiently by employing = >some sort of LIFO thread reuse policy or by pinning thread pools to = >specific CPUs? If not, there might be some opportunity for optimization = >there. You should really read the varnish_perf.pdf slides I linked to yesteday... -- Poul-Henning Kamp | UNIX since Zilog Zeus 3.20 p...@freebsd.org | TCP/IP since RFC 956 FreeBSD committer | BSD since 4.3-tahoe Never attribute to malice what can adequately be explained by incompetence. ___ varnish-misc mailing list varnish-misc@projects.linpro.no http://projects.linpro.no/mailman/listinfo/varnish-misc
Re: Varnish use for purely binary files
Wanted to inject another heady discussion item into this thread and see if
the idea is confirmed in other folks' current architectures. Sorry in
advance for being verbose.

Often web servers (in my experience) are smaller servers, with less RAM and
fewer CPUs than the app servers and databases. A typical webserver might be
a 2GB or 4GB machine with a dual CPU. But the disk storage on any given
webserver will far exceed the RAM in the machine. This means disk IO even
when attempting to cache as much as possible on a webserver, due to the
limited RAM.

In this "normal" web server size model, simply plugging a bigger-RAM
Varnish in upstream means less disk IO, faster web servers, less memory
consumption managing threads, etc. This is the well-proven basic Varnish
adopter model.

Here's a concept that is not specific to the type of data being stored in
Varnish:

With some additional hashing in the mix, you could limit your large Varnish
cache server to the very highly, repetitively accessed items and use the
hash to go to the backend webservers, where ideally you hit a smaller
Varnish instance with the item cached on the 2-4GB webserver downriver, and
have it talk to the webserver directly on localhost if it didn't have the
data.

Anyone doing anything remotely like this? Lots of big-RAM installations for
Varnish. I like the Google or mini-Google model of many smaller machines
distributing the load. Seem feasible? 2-4GB machines are very affordable
compared to the 16GB-and-above machines. Certainly more collective
horsepower with the individual smaller servers - perhaps better
performance-per-watt also (another one of my interests).

Thanks again everyone. I enjoy hearing about all the creative ways folks
are using Varnish in their very different environments. The more scenarios
for Varnish, the more adoption and ideally the more resources and expertise
that become available for future development.

There is some sort of pruning of the cache (beyond me at the moment) needed
to keep Varnish from being overpopulated with unused items, and similarly
to keep from wasting RAM on the webservers for such.

Simple concept and probably very typical. Oh yeah, plus it scales
horizontally on lower-cost dummy server nodes.
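To illustrate the hashing idea above, here is a minimal C sketch of routing
a URL to one of several small downstream caches, so the same object always
lands on the same 2-4GB box. The backend names and the FNV-1a hash are only
placeholders for illustration; nothing here is taken from Varnish itself,
and in practice the same routing could live in the front-end cache's
configuration or in a load balancer.

    #include <stdint.h>
    #include <stdio.h>

    /* Hypothetical pool of small downstream Varnish instances. */
    static const char *backends[] = {
        "varnish-a.example.internal",
        "varnish-b.example.internal",
        "varnish-c.example.internal",
        "varnish-d.example.internal",
    };
    #define NBACKENDS (sizeof(backends) / sizeof(backends[0]))

    /* FNV-1a string hash; any stable hash of the URL will do. */
    static uint32_t
    fnv1a(const char *s)
    {
        uint32_t h = 2166136261u;

        while (*s != '\0') {
            h ^= (unsigned char)*s++;
            h *= 16777619u;
        }
        return (h);
    }

    /* Pick the downstream cache that "owns" this URL. */
    static const char *
    pick_backend(const char *url)
    {
        return (backends[fnv1a(url) % NBACKENDS]);
    }

    int
    main(void)
    {
        const char *url = "/images/logo.png";

        printf("%s -> %s\n", url, pick_backend(url));
        return (0);
    }

Plain modulo hashing is the simplest possible version; with it, adding or
removing a downstream box reshuffles most URLs, which is why larger setups
usually reach for consistent hashing instead.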
Re: Varnish use for purely binary files
On Jan 18, 2010, at 4:35 PM, Poul-Henning Kamp wrote:

> In message <97f066dd-4044-46a7-b3e1-34ce928e8...@slide.com>, Ken Brownfield writes:
>
>> Ironically and IMHO, one of the barriers to Varnish scalability
>> is its thread model, though this problem strikes in the thousands
>> of connections.
>
> It's only a matter of work to pool slow clients in Varnish into
> eventdriven writer clusters, but so far I have not seen a
> credible argument for doing it.
>
> A thread is pretty cheap to have around if it doesn't do anything,
> and the varnish threads typically do not do anything during the
> delivery-phase: They are stuck in the kernel in a writev(2)
> or sendfile(2) system call.

Does Varnish already try to utilize CPU caches efficiently by employing
some sort of LIFO thread reuse policy or by pinning thread pools to
specific CPUs? If not, there might be some opportunity for optimization
there.

--Michael
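For what it's worth, pinning a worker thread to a CPU is a small amount of
code at the pthread level on Linux. The sketch below only shows the general
technique being asked about; it is not taken from Varnish, which may or may
not do this.

    #define _GNU_SOURCE
    #include <pthread.h>
    #include <sched.h>
    #include <stdio.h>
    #include <string.h>

    static void *
    worker(void *arg)
    {
        int cpu = *(int *)arg;
        cpu_set_t set;
        int rc;

        /* Restrict this thread to one CPU so its caches stay warm. */
        CPU_ZERO(&set);
        CPU_SET(cpu, &set);
        rc = pthread_setaffinity_np(pthread_self(), sizeof(set), &set);
        if (rc != 0)
            fprintf(stderr, "setaffinity: %s\n", strerror(rc));

        /* ... pick up and serve requests here ... */
        return (NULL);
    }

    int
    main(void)
    {
        pthread_t tid;
        int cpu = 0;

        pthread_create(&tid, NULL, worker, &cpu);
        pthread_join(tid, NULL);
        return (0);
    }

A LIFO idle-thread policy is the complementary trick: handing new work to
the most recently parked thread, whose stack and data are the most likely
to still be in cache.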
Re: Varnish use for purely binary files
On Jan 18, 2010, at 4:15 PM, Ken Brownfield wrote:

> Ironically and IMHO, one of the barriers to Varnish scalability is its
> thread model, though this problem strikes in the thousands of connections.

Agreed. In an early thread on varnish-misc in February 2008 I concluded
that reducing thread_pool_max to well below the default value (to 16
threads/CPU) was instrumental in attaining maximum performance on
high-hit-ratio workloads. (This was with Varnish 1.1; things may have
changed since then, but the theory remains.)

Funny how there's always a tradeoff:

Overcommit -> page-thrashing death
Undercommit -> context-switch death

:)

--Michael
Re: Varnish use for purely binary files
In message <97f066dd-4044-46a7-b3e1-34ce928e8...@slide.com>, Ken Brownfield
writes:

> Ironically and IMHO, one of the barriers to Varnish scalability
> is its thread model, though this problem strikes in the thousands
> of connections.

It's only a matter of work to pool slow clients in Varnish into
event-driven writer clusters, but so far I have not seen a credible
argument for doing it.

A thread is pretty cheap to have around if it doesn't do anything, and the
varnish threads typically do not do anything during the delivery phase:
they are stuck in the kernel in a writev(2) or sendfile(2) system call. In
terms of machine resources, there is no cheaper way to do it.

An important but not often spotted advantage is that the object overhead
does not depend on the size of the object: a 1 megabyte object takes
exactly as few resources as a 1 byte object.

If you change to an event-driven model, you will have many more system
calls, scaling O(n) with object sizes, and you will get a lot more locking
in the kernel, resulting in contention on fd's and pcbs. At the higher
level, you will have threads getting overwhelmed if/when we misestimate the
amount of bandwidth they have to deal with, and you will need complicated
code to mitigate this.

For 32-bit machines, having thousands of threads is an issue, because you
run out of address space, but on a 64-bit system, having 1000 threads or
even 10k threads is not really an issue.

Again: don't let the fact that people have done this simple data-moving job
wrong in the past mislead you into thinking it cannot be done right.

The trick to getting high performance is not doing work you don't need to
do; no architecture or performance trick can ever beat that.

Poul-Henning

-- 
Poul-Henning Kamp       | UNIX since Zilog Zeus 3.20
p...@freebsd.org        | TCP/IP since RFC 956
FreeBSD committer       | BSD since 4.3-tahoe
Never attribute to malice what can adequately be explained by incompetence.
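To make the system-call argument concrete, here is a rough, illustrative
sketch (not Varnish source) of what the event-driven alternative ends up
doing per object on a non-blocking socket: every time the socket buffer
fills, the writer takes another partial write and another trip through
epoll_wait(), so the syscall count grows with object size instead of
staying at a single blocking writev().

    #include <sys/epoll.h>
    #include <errno.h>
    #include <unistd.h>

    /*
     * Event-driven delivery sketch: the socket is non-blocking and is
     * assumed to be registered with epfd for EPOLLOUT.  A real event
     * loop would return to its dispatcher instead of waiting inline;
     * this is collapsed into one function only for illustration.
     */
    static int
    deliver_nonblocking(int epfd, int fd, const char *buf, size_t len)
    {
        size_t off = 0;
        struct epoll_event ev;
        ssize_t n;

        while (off < len) {
            n = write(fd, buf + off, len - off);
            if (n > 0) {
                off += (size_t)n;
                continue;
            }
            if (n < 0 && errno != EAGAIN && errno != EWOULDBLOCK)
                return (-1);
            /* Socket buffer full: wait until it drains. */
            if (epoll_wait(epfd, &ev, 1, -1) < 0 && errno != EINTR)
                return (-1);
        }
        return (0);
    }

Compared with that, the thread-per-connection delivery path really is one
writev(2) or sendfile(2), with the thread parked in the kernel until it
completes.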
Re: Varnish use for purely binary files
In message <87f6439f-76fe-416c-b750-5a53a9712...@dynamine.net>, "Michael S.
Fischer" writes:

> I'm merely contending that the small amount of added
> latency for a cache hit, where neither server is operating at full
> capacity, is not enough to significantly affect the user experience.

Which, translated to plain English, becomes:

    If you don't need varnish, you don't need varnish.

I'm not sure how much useful information that statement contains :-)

-- 
Poul-Henning Kamp       | UNIX since Zilog Zeus 3.20
p...@freebsd.org        | TCP/IP since RFC 956
FreeBSD committer       | BSD since 4.3-tahoe
Never attribute to malice what can adequately be explained by incompetence.
Re: Varnish use for purely binary files
On Jan 18, 2010, at 4:06 PM, Poul-Henning Kamp wrote:

> In message <02d0ec1a-d0b0-40ee-b278-b57714e54...@dynamine.net>, "Michael
> S. Fischer" writes:
>
>> But we are not discussing serving dynamic content in this thread
>> anyway. We are talking about binary files, aren't we? Yes? Blobs
>> on disk? Unless everyone is living on a different plane than me,
>> then I think that's what we're talking about.
>>
>> For those you should be using a general-purpose webserver. There's
>> no reason you can't run both side by side. And I stand by my
>> original statement about their performance relative to Varnish.
>
> Why would you use a general purpose webserver, if Varnish can
> deliver 80 or 90% of your content much faster and much cheaper?

There's no question that Varnish is faster and that it can handle more peak
requests per second than a general-purpose webserver at a near-100% cache
hit rate. I'm merely contending that the small amount of added latency for
a cache hit, where neither server is operating at full capacity, is not
enough to significantly affect the user experience.

There are many competing factors that need to go into the planning process
other than pure peak capacity, among them the cache hit ratio, the cost of
a cache miss, and where your money is better spent: installing RAM in cache
servers or in origin servers.

--Michael
Re: Varnish use for purely binary files
On Jan 18, 2010, at 4:03 PM, Michael S. Fischer wrote:

>> Does [Apache] perform "well" for static files in the absence of any other
>> function? Yes. Would I choose it for anything other than an application
>> server? No. There are much better solutions out there, and the proof is
>> in the numbers.
>
> Not sure what you mean here... at my company it's used for everything but
> proxying (because Apache's process model is contraindicated at high
> concurrencies if you want to support Keep-Alive connections). And we
> serve a lot of traffic at very low latencies.

The concurrency issue is really Apache's Achilles tendon. Real-world
example: being limited to 70 concurrent application workers on a 16GB
machine is a bad joke. Like you said, simultaneous (especially slow)
connections will kill Apache dead very quickly. mpm_event could be a huge
boon (if Apache insists on continuing with a pure process/thread model),
but I'm not sure it's ever going to "arrive".

Ironically and IMHO, one of the barriers to Varnish scalability is its
thread model, though this problem strikes in the thousands of connections.

Apache is fast at pure static, but it isn't the fastest.
nginx/lighttpd/thttpd can be simpler and faster, and they don't rely on the
process/thread model. But whether or not you need that 100th percentile of
speed depends on a huge number of variables. Apache's ubiquity is a strong
argument, and it would take heavy loads to differentiate Apache from the
others above.
-- 
Ken
Re: Varnish use for purely binary files
In message <364f5e3e-0d1e-4c95-b101-b7a00c276...@slide.com>, Ken Brownfield
writes:

> A cache hit under Varnish will be comparable in latency to a
> dedicated static server hit, regardless of the backend.

Only provided the "dedicated static server" is written to work in a modern
SMP/VM system, which few, if any, of them are.

Poul-Henning

-- 
Poul-Henning Kamp       | UNIX since Zilog Zeus 3.20
p...@freebsd.org        | TCP/IP since RFC 956
FreeBSD committer       | BSD since 4.3-tahoe
Never attribute to malice what can adequately be explained by incompetence.
Re: Varnish use for purely binary files
In message <02d0ec1a-d0b0-40ee-b278-b57714e54...@dynamine.net>, "Michael S.
Fischer" writes:

> But we are not discussing serving dynamic content in this thread
> anyway. We are talking about binary files, aren't we? Yes? Blobs
> on disk? Unless everyone is living on a different plane than me,
> then I think that's what we're talking about.
>
> For those you should be using a general purpose webserver. There's
> no reason you can't run both side by side. And I stand by my
> original statement about their performance relative to Varnish.

Why would you use a general purpose webserver, if Varnish can deliver 80 or
90% of your content much faster and much cheaper?

It sounds to me like you have not done your homework with respect to
Varnish. For your information, here is the approximate sequence of system
calls Varnish performs for a cache hit:

    read        (get the HTTP request)
    timestamp
    timestamp
    timestamp
    timestamp
    writev      (write the response)

With some frequency, depending on your system and OS, you will also see a
few mutex operations.

The difference between the first and the last timestamp is typically on the
order of 10-20 microseconds. The middle two timestamps are mostly for my
pleasure and could be optimized out, if they made any difference.

This is why people who run synthetic benchmarks do insane amounts of req/s
on varnish boxes, for values of insane >> 100,000.

I suggest you look at how many system calls, and how much time, your
"general purpose webserver" spends doing the same job.

Once you have done that, I can recommend you read the various architect
notes I've written, and maybe browse through

    http://phk.freebsd.dk/pubs/varnish_perf.pdf

Where you decide to deposit your "conventional wisdom" afterwards is for
you to decide, but it is unlikely to be applicable.

Poul-Henning

-- 
Poul-Henning Kamp       | UNIX since Zilog Zeus 3.20
p...@freebsd.org        | TCP/IP since RFC 956
FreeBSD committer       | BSD since 4.3-tahoe
Never attribute to malice what can adequately be explained by incompetence.
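A bare-bones sketch of that hit path in C, just to show how little work sits
on the critical path. This is an illustration of the sequence listed above,
not an excerpt from Varnish; buffer handling and error checking are
stripped down.

    #include <sys/uio.h>
    #include <string.h>
    #include <time.h>
    #include <unistd.h>

    /* Handle one cache hit: read the request, stamp it, write the reply. */
    static void
    cache_hit(int fd, const char *hdr, const void *obj, size_t objlen)
    {
        char req[8192];
        struct timespec t[4];
        struct iovec iov[2];

        (void)read(fd, req, sizeof req);        /* get the HTTP request */
        clock_gettime(CLOCK_MONOTONIC, &t[0]);
        /* ...in-memory hash lookup, header assembly: no syscalls... */
        clock_gettime(CLOCK_MONOTONIC, &t[1]);
        clock_gettime(CLOCK_MONOTONIC, &t[2]);
        clock_gettime(CLOCK_MONOTONIC, &t[3]);

        iov[0].iov_base = (void *)hdr;
        iov[0].iov_len = strlen(hdr);
        iov[1].iov_base = (void *)obj;
        iov[1].iov_len = objlen;
        (void)writev(fd, iov, 2);               /* write the response */
    }

varnishhist, mentioned elsewhere in this thread, gives a histogram of
per-request service times and is an easy way to see figures in this range
on your own box.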
Re: Varnish use for purely binary files
On Jan 18, 2010, at 3:54 PM, Ken Brownfield wrote:

> Adding unnecessary software overhead will add latency to requests to the
> filesystem, and obviously should be avoided. However, a cache in front of
> a general web server will 1) cause an object miss to have additional
> latency (though small) and 2) guarantee object hit latency will be as low
> as possible. A cache in front of a dedicated static file server is
> unnecessary, but worst-case would introduce additional latency only for
> cache misses.

Agreed. This is what I was trying to communicate all along. It was my
understanding that this was what the thread was about.

> Does [Apache] perform "well" for static files in the absence of any other
> function? Yes. Would I choose it for anything other than an application
> server? No. There are much better solutions out there, and the proof is
> in the numbers.

Not sure what you mean here... at my company it's used for everything but
proxying (because Apache's process model is contraindicated at high
concurrencies if you want to support Keep-Alive connections). And we serve
a lot of traffic at very low latencies.

--Michael
Re: Varnish use for purely binary files
> Let me be clear, in case I have not been clear enough already:
>
> I am not talking about the edge cases of those low-concurrency,
> high-latency, scripted-language webservers that are becoming tied to web
> application frameworks like Rails and Django and that are the best fit
> for front-end caching because they are slow at serving dynamic content.
>
> But we are not discussing serving dynamic content in this thread anyway.
> We are talking about binary files, aren't we? Yes? Blobs on disk? Unless
> everyone is living on a different plane than me, then I think that's what
> we're talking about.
>
> For those you should be using a general-purpose webserver. There's no
> reason you can't run both side by side. And I stand by my original
> statement about their performance relative to Varnish.

Definitely wasn't clear until now. But now I'm not sure what we're
discussing, since comparing the performance of a reverse-proxy cache to an
origin server is rather pointless.

A cache hit under Varnish will be comparable in latency to a dedicated
static server hit, regardless of the backend. The rate of misses will
determine whether a dedicated static server would be required, and this is
a growth path that many companies follow.
-- 
Ken
Re: Varnish use for purely binary files
On Jan 18, 2010, at 3:16 PM, Michael S. Fischer wrote:

> On Jan 18, 2010, at 3:08 PM, Ken Brownfield wrote:
>
>> In the real world, sites run their applications through web servers, and
>> this fact does (and should) guide the decision on the base web server to
>> use, not static file serving.
>
> I meant webservers that more than 50% of the world uses, which do not
> include those.

Depends on whether you mean 50% of companies, or 50% of web property
traffic. The latter? Definitely.

> I was assuming, perhaps incorrectly, that the implementor would have at
> least the wisdom/laziness to use a popular general-purpose webserver such
> as Apache for the purpose of serving static objects from the filesystem.
> And that's not even really a stretch as it's the default for most servers.

This is true, though default Apache configurations run the gamut from clean
to bloated (>1ms variation, I would say).

>> (Though nginx may have an on-disk cache? And don't get me started on
>> Apache caching. :-)
>
> Doctor, heal thyself before you call me inexperienced. Using
> application-level caching for serving objects from the filesystem rarely
> works, which is the main point of Varnish. Just because *you* can't get
> good performance out of Apache doesn't mean it's not worth using.

I'm not sure what your definition of application-level is, here. Much of
Apache's functionality could be considered an application. But if you mean
an embedded app running "inside" Apache, then that distinction has almost
no bearing on whether file serving "works" or not -- an app can serve files
just as fast as Apache, assuming C/C++.

Adding unnecessary software overhead will add latency to requests to the
filesystem, and obviously should be avoided. However, a cache in front of a
general web server will 1) cause an object miss to have additional latency
(though small) and 2) guarantee object hit latency will be as low as
possible. A cache in front of a dedicated static file server is
unnecessary, but worst-case would introduce additional latency only for
cache misses.

I'm not sure what your comment on Apache is about, since I never said
Apache isn't worth using. I've been using it in production for 11+ years
now. Does it perform "well" for static files in the absence of any other
function? Yes. Would I choose it for anything other than an application
server? No. There are much better solutions out there, and the proof is in
the numbers.
-- 
Ken
Re: Varnish use for purely binary files
> The average workload of a cache hit, last I looked, was 7 system
> calls, with typical service times, from request received from kernel
> until response ready to be written to kernel, of 10-20 microseconds.

Well, that explains some of the performance difference in Varnish (in our
experience) versus web servers. 7 calls isn't much, and you said
MICROSECONDS. :)

> I don't know if that is THE best performance, but I know of a lot
> of software doing a lot worse.

I haven't done the reproducible testing to share with everyone yet. But
using 3rd-party remotely hosted analysis services, we know for certain our
page elements are starting faster and the average object load time has gone
down significantly.

We are using one or more of the fast webservers and still are - just behind
Varnish now :)
Re: Varnish use for purely binary files
On Jan 18, 2010, at 3:47 PM, Poul-Henning Kamp wrote:

> In message , "Michael S. Fischer" writes:
>
>> That's why you don't use those webservers as origin servers for
>> that purpose. But you don't use Varnish for it either. It's not
>> an origin server anyway.
>
> Actually, for protocol purposes, Varnish is an origin server.
>
> If you read RFC 2616 very carefully, you can find the one place where
> they failed to evict server-side caches from the text, when they
> realized that a cache under the control of the webmaster is
> indistinguishable from a webserver, for protocol purposes.

I meant it for practical purposes, Poul-Henning. But I'm sure you knew
that. :)

--Michael
Re: Varnish use for purely binary files
On Jan 18, 2010, at 3:37 PM, pub crawler wrote:

>> Differences in latency of serving static content can vary widely based
>> on the web server in use, easily tens of milliseconds or more. There are
>> dozens of web servers out there, some written in interpreted languages,
>> many custom-written for a specific application, many with add-ons and
>> modules and
>
> Most webservers as shipped are simply not very speedy. Nginx,
> Cherokee, Lighty are three exceptions :)
> Latency is all over the place in web server software. Caching is a
> black art still, no matter where you are talking about having one or
> lacking one :) Ten milliseconds is easily wasted in a web server:
> connection pooling, negotiating the transfer, etc. Most sites have so
> many latency issues and such a lack of performance.

Let me be clear, in case I have not been clear enough already:

I am not talking about the edge cases of those low-concurrency,
high-latency, scripted-language webservers that are becoming tied to web
application frameworks like Rails and Django and that are the best fit for
front-end caching because they are slow at serving dynamic content.

But we are not discussing serving dynamic content in this thread anyway. We
are talking about binary files, aren't we? Yes? Blobs on disk? Unless
everyone is living on a different plane than me, then I think that's what
we're talking about.

For those you should be using a general-purpose webserver. There's no
reason you can't run both side by side. And I stand by my original
statement about their performance relative to Varnish.

--Michael
Re: Varnish use for purely binary files
In message , "Michael S. Fis cher" writes: >That's why you don't use those webservers as origin servers for >that purpose. But you don't use Varnish for it either. It's not >an origin server anyway. Actually, for protocol purposes, Varnish is an origin server. If you read RFC2616 very carefully, you can find the one place where they failed to evict server-side caches from the text, when they realized that a cache under the control of the webmaster, is indistinguisable from a webserver, for protocol purposes. Poul-Henning -- Poul-Henning Kamp | UNIX since Zilog Zeus 3.20 p...@freebsd.org | TCP/IP since RFC 956 FreeBSD committer | BSD since 4.3-tahoe Never attribute to malice what can adequately be explained by incompetence. ___ varnish-misc mailing list varnish-misc@projects.linpro.no http://projects.linpro.no/mailman/listinfo/varnish-misc
Re: Varnish use for purely binary files
In message <4c3149fb1001181416r7cd1c1c2n923a438d6a0df...@mail.gmail.com>,
pub crawler writes:

> So far Varnish is performing very well for us as a web server of these
> cached objects. The connection time for an item out of Varnish is
> noticeably faster than with web servers we have used - even where the
> items have been cached. We are mostly using 3rd party tools like
> webpagetest.org to look at the item times.

The average workload of a cache hit, last I looked, was 7 system calls,
with typical service times, from request received from kernel until
response ready to be written to kernel, of 10-20 microseconds.

Compared to the amount of work real webservers do for the same task, that
is essentially nothing.

I don't know if that is THE best performance, but I know of a lot of
software doing a lot worse.

Try running varnishhist if you have not already :-)

Poul-Henning

-- 
Poul-Henning Kamp       | UNIX since Zilog Zeus 3.20
p...@freebsd.org        | TCP/IP since RFC 956
FreeBSD committer       | BSD since 4.3-tahoe
Never attribute to malice what can adequately be explained by incompetence.
Re: Varnish use for purely binary files
> Differences in latency of serving static content can vary widely based on
> the web server in use, easily tens of milliseconds or more. There are
> dozens of web servers out there, some written in interpreted languages,
> many custom-written for a specific application, many with add-ons and
> modules and

Most webservers as shipped are simply not very speedy. Nginx, Cherokee,
Lighty are three exceptions :)

Latency is all over the place in web server software. Caching is a black
art still, no matter where you are talking about having one or lacking one
:) Ten milliseconds is easily wasted in a web server: connection pooling,
negotiating the transfer, etc. Most sites have so many latency issues and
such a lack of performance. Most folks seem to just ignore it, though, and
think all is well with low performance. That's why Varnish and the folks
here are so awesome. A band of data crushers, bandwidth abusers and RAM
junkies with lower latency in mind.

Latency is an ugly multiplier - it gets multiplied by every request,
multiple requests per user, multiplied by all the use in a period of time.
If your page has 60 elements to be served and you add a mere 5ms to each
element, that's 300ms of latency just on serving static items. There are
other scenarios too, like dealing with people on slow connections (if your
audience has lots of these).

> If you're serving pure static content with no need for application logic,
> then yes, there is little benefit to choosing a two-tier infrastructure
> when a one-tier out-of-the-box nginx/lighttpd/thttpd will do just fine.
> But, if your content does not fit in memory, you're back to reverse-Squid
> or Varnish. (Though nginx may have an on-disk cache? And don't get me
> started on Apache caching. :-)

Static sites will still be aided in scaling by fronting them with Varnish
or a similar cache front end if they are big enough. A small for-instance
might be offloading images, or items that require longer connection
timeouts, to Varnish - perhaps reducing disk IO and letting you cut the
open connections on your web server. You could obviously do the same by
dissecting your site into multiple servers and dividing the load - but you
lose some of the functionality that is appealing in Varnish, like the
ability to dynamically adjust traffic, load, direction, etc. within
Varnish. Unsure if anything similar exists in Nginx - but then you are
turning a web server into something else, likely with some performance
reduction.

Mind you, most people here *I think* are dealing with big scaling - busy
sites, respectable and sometimes awe-inspiring amounts of data. Then there
are those slow-as-can-be app servers they might have to work around too. So
the scale of latency issues is a huge cost center for most folks.

Plenty of papers have been written about latency and the user experience.
The slower the load, the less people interact with - and, in commerce
terms, spend on - the site.
Re: Varnish use for purely binary files
On Jan 18, 2010, at 3:08 PM, Ken Brownfield wrote:

>> I have a hard time believing that any difference in the total response
>> time of a cached static object between Varnish and a general-purpose
>> webserver will be statistically significant, especially considering
>> typical Internet network latency. If there's any difference it should be
>> well under a millisecond.
>
> I would suggest that you get some real-world experience, or at least do
> some research in this area. Like your earlier assertion, this is patently
> untrue as a general conclusion.
> Differences in latency of serving static content can vary widely based on
> the web server in use, easily tens of milliseconds or more. There are
> dozens of web servers out there, some written in interpreted languages,
> many custom-written for a specific application, many with add-ons and
> modules and other hijinx that can affect the latency of serving static
> content.

That's why you don't use those webservers as origin servers for that
purpose. But you don't use Varnish for it either. It's not an origin server
anyway.

> In the real world, sites run their applications through web servers, and
> this fact does (and should) guide the decision on the base web server to
> use, not static file serving.

I meant webservers that more than 50% of the world uses, which do not
include those. I was assuming, perhaps incorrectly, that the implementor
would have at least the wisdom/laziness to use a popular general-purpose
webserver such as Apache for the purpose of serving static objects from the
filesystem. And that's not even really a stretch as it's the default for
most servers.

> (Though nginx may have an on-disk cache? And don't get me started on
> Apache caching. :-)

Doctor, heal thyself before you call me inexperienced. Using
application-level caching for serving objects from the filesystem rarely
works, which is the main point of Varnish. Just because *you* can't get
good performance out of Apache doesn't mean it's not worth using.

--Michael
Re: Varnish use for purely binary files
> I have a hard time believing that any difference in the total response
> time of a cached static object between Varnish and a general-purpose
> webserver will be statistically significant, especially considering
> typical Internet network latency. If there's any difference it should be
> well under a millisecond.

I would suggest that you get some real-world experience, or at least do
some research in this area. Like your earlier assertion, this is patently
untrue as a general conclusion.

Differences in latency of serving static content can vary widely based on
the web server in use, easily tens of milliseconds or more. There are
dozens of web servers out there, some written in interpreted languages,
many custom-written for a specific application, many with add-ons and
modules and other hijinx that can affect the latency of serving static
content. Additionally, very few of these implement their own managed cache;
the rest accidentally rely on the filesystem cache, which may or may not
perform with low or predictable latency, and may not be large enough for a
working set.

In the real world, sites run their applications through web servers, and
this fact does (and should) guide the decision on the base web server to
use, not static file serving. Thus the primary importance IMHO of software
like reverse-Squid and Varnish.

If you're serving pure static content with no need for application logic,
then yes, there is little benefit to choosing a two-tier infrastructure
when a one-tier out-of-the-box nginx/lighttpd/thttpd will do just fine.
But, if your content does not fit in memory, you're back to reverse-Squid
or Varnish. (Though nginx may have an on-disk cache? And don't get me
started on Apache caching. :-)
-- 
Ken
Re: Varnish use for purely binary files
On Jan 18, 2010, at 2:16 PM, pub crawler wrote:

>> Most kernels cache recently-accessed files in RAM, and so common web
>> servers such as Apache can already serve up static objects very quickly
>> if they are located in the buffer cache. (Varnish's apparent speed is
>> largely based on the same phenomenon.) If the data is already cached in
>> the origin server's buffer caches, then interposing an additional
>> caching layer may actually be somewhat harmful because it will add some
>> additional latency.
>
> So far Varnish is performing very well for us as a web server of these
> cached objects. The connection time for an item out of Varnish is
> noticeably faster than with web servers we have used - even where the
> items have been cached. We are mostly using 3rd party tools like
> webpagetest.org to look at the item times.
>
> Varnish is good as a slice in a few different places in a cluster, and a
> few more when running distributed geographic clusters. Aside from Nginx
> or something highly optimized, I am fairly certain Varnish provides
> faster serving of cached objects as an out-of-the-box default experience.
> I'll eventually find some time to test it in our environment against web
> servers we use.

I have a hard time believing that any difference in the total response time
of a cached static object between Varnish and a general-purpose webserver
will be statistically significant, especially considering typical Internet
network latency. If there's any difference it should be well under a
millisecond.

--Michael
Re: Varnish use for purely binary files
> Most kernels cache recently-accessed files in RAM, and so common web
> servers such as Apache can already serve up static objects very quickly
> if they are located in the buffer cache. (Varnish's apparent speed is
> largely based on the same phenomenon.) If the data is already cached in
> the origin server's buffer caches, then interposing an additional caching
> layer may actually be somewhat harmful because it will add some
> additional latency.

So far Varnish is performing very well for us as a web server of these
cached objects. The connection time for an item out of Varnish is
noticeably faster than with web servers we have used - even where the items
have been cached. We are mostly using 3rd party tools like webpagetest.org
to look at the item times.

> If you've evenly distributed your objects among a number of origin
> servers, assuming they do nothing but serve up these static objects, and
> the origin servers have a sum total of RAM larger than your caching
> servers, then you might be better off just serving directly from the
> origin servers.

Varnish is good as a slice in a few different places in a cluster, and a
few more when running distributed geographic clusters. Aside from Nginx or
something highly optimized, I am fairly certain Varnish provides faster
serving of cached objects as an out-of-the-box default experience. I'll
eventually find some time to test it in our environment against web servers
we use.

> On the other hand, there are some use cases, such as edge-caching, where
> interposing a caching layer can be quite helpful even if the origin
> servers are fast, because making the object available closer to the

Edge caching and distributed cache front ends are exactly what's needed.
It's a poor man's CDN, but it can be very effective if done well.

The question I posed is to see if this type of use (almost purely binary)
is being done and scaling well at large scale (50GB and beyond). Binary
data usually poses more overhead as the data is larger - fewer stored
elements in RAM, often it can't be compressed further, more FIFO-type
purging due to this, etc.

-Paul
Re: Varnish use for purely binary files
On Jan 18, 2010, at 12:58 PM, pub crawler wrote:

> This is an inquiry for the Varnish community.
>
> Wondering how many folks are using Varnish purely for binary storage
> and caching (graphic files, archives, audio files, video files, etc.)?
>
> Interested specifically in large Varnish installations with either
> a high number of files or where files are large in size.
>
> Can anyone out there using Varnish for such care to say they are?

I guess it depends on your precise configuration.

Most kernels cache recently-accessed files in RAM, and so common web
servers such as Apache can already serve up static objects very quickly if
they are located in the buffer cache. (Varnish's apparent speed is largely
based on the same phenomenon.) If the data is already cached in the origin
server's buffer caches, then interposing an additional caching layer may
actually be somewhat harmful because it will add some additional latency.

If you've evenly distributed your objects among a number of origin servers,
assuming they do nothing but serve up these static objects, and the origin
servers have a sum total of RAM larger than your caching servers, then you
might be better off just serving directly from the origin servers.

On the other hand, there are some use cases, such as edge-caching, where
interposing a caching layer can be quite helpful even if the origin servers
are fast, because making the object available closer to the requestor can
conserve network latency. (In fact, overcommit may be OK in this situation
if the I/O queue depth is reasonably shallow and you can guarantee that any
additional I/O overhead is less than the network latency incurred by having
to go to the origin server.)

--Michael
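The fast path described above (a general-purpose server handing a static
file that is already in the buffer cache straight to the socket) is
essentially one sendfile(2) call per chunk on Linux. A minimal, illustrative
sketch; connection setup and HTTP response headers are assumed to be handled
elsewhere.

    #include <sys/sendfile.h>
    #include <sys/stat.h>
    #include <fcntl.h>
    #include <unistd.h>

    /*
     * Serve a static file over an already-connected socket.  If the
     * file's pages are in the kernel buffer cache, no disk I/O happens;
     * the data goes from page cache to socket without touching userland.
     */
    static int
    serve_static(int sockfd, const char *path)
    {
        struct stat st;
        off_t off = 0;
        ssize_t n;
        int fd;

        fd = open(path, O_RDONLY);
        if (fd < 0 || fstat(fd, &st) < 0) {
            if (fd >= 0)
                close(fd);
            return (-1);
        }
        /* (HTTP response headers would be written here first.) */
        while (off < st.st_size) {
            n = sendfile(sockfd, fd, &off, (size_t)(st.st_size - off));
            if (n <= 0)
                break;
        }
        close(fd);
        return (off == st.st_size ? 0 : -1);
    }

This zero-copy path is the same one referred to elsewhere in the thread when
it is noted that delivery threads simply sit in writev(2) or sendfile(2).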