>> Peter J. Cranstone wrote...
>>
>> Since when does web server throughput drop by x% factor using
>> mod_deflate?
>
> Jeff Trawick wrote...
>
> I don't think you need me to explain the "why" or the "when" to you.
Think again. Exactly what scenario are you assuming is supposed to be so
'obvious' that it doesn't need an explanation/discussion?

There has never been a good discussion and/or presentation of real data on
this topic... just a bunch of 'assumptions'... and now that compression
modules have caching ability, whatever testing HAS been done needs to be
done again, because perhaps any/all of the sore spots in anyone's testing
can now be completely eliminated by real-time caching of compressed objects.

All of my experience with compressing Internet Content in real time on
Servers, with or without the caching of the compressed objects, indicates
that it USUALLY, if done correctly, does nothing but INCREASE the
'throughput' of the Server. The same experience has also shown that if
something ends up being much SLOWER then something is badly WRONG with the
code that's doing it, and it is FIXABLE.

The assumption that YOU seem to be clinging to is that once the Server has
bounced through enough APR calls to handle the transaction, with as few
things showing up in STRACE as possible, the Server has done its job and
the transaction is OVER ( and the CPU somehow magically free again ). This
is never the case. Pie is rarely free at a truck stop.

If you dump 100,000 bytes into the I/O subsystem without taking the (few)
milliseconds needed to compress it down to 70-80 percent LESS, then
SOMETHING in the CPU is still working MUCH harder than it has to. The
'data' is not GONE from the box just because the Server has made some
socket calls and gone about its business. It still has to be SENT, one byte
at a time, by the same CPU in the same machine.

NIC cards are interrupt driven. Asking the I/O subsystem to constantly send
70-80 percent more data than it has to via an interrupt-driven mechanism is
basically the most expensive thing you could ask the CPU to do. In-memory
compression is NOT interrupt driven. Compared to interrupt-driven I/O it is
one of the LEAST expensive things to ask the CPU to do, on average.

Do not confuse the performance of any given standard distribution of some
legacy compression library called ZLIB with whether or not, in THEORY, the
real-time compression of content is able to INCREASE the throughput of the
Server. ZLIB was never designed to be used as a 'real-time' compression
engine. The code is VERY OLD and is still based on a streaming I/O model
with heavy overhead versus direct in-memory compression. It is a FILE-based
implementation of LZ77, and while it performs very well in a batch job
against disk files it still lacks some things which would qualify it as a
high-performance real-time compression engine. mod_gzip does NOT use
'standard ZLIB' for this very reason. The performance was not good enough
to produce consistently good throughput. ( See the P.S. at the bottom of
this message for a rough sketch of what 'one-shot, in-memory' compression
means versus the stream-oriented approach. )

>> We went through this debate with mod_gzip and it doesn't hold much
>> water. Server boxes are cheap and adding some more ram or even a faster
>> processor is a cheap price to pay when compared to customer satisfaction
>> when their pages load faster.
>
> Your "Server boxes are cheap" comment is very telling; if I add more
> ram or a faster processor we aren't talking about the same web server.

Exactly.
Regardless of the fact that content compression at the source origin CAN
actually 'improve' the throughput of one single server ( if done correctly ),
let me chime in on this point and say that if adding a little hardware, or
perhaps even another ( dirt cheap these days ) Server box, is what it takes
to provide a DRAMATIC improvement in the user experience then what's the
gripe? If that's what it takes to provide a better experience for the USER
then I agree 100% with Peter. That is what SHOULD be the focus.

Your point of view seems to indicate that you believe it's better to let
your USERS have a 'worse experience' than they need to just to avoid having
to beef up the Server side. I have always believed that the END USER
experience should be more important than how some single piece of software
'looks' on a benchmark test. Those benchmarks that produce these holy TPS
ratings are usually flawed when it comes to imitating a REAL user
experience.

It's a classic argument and there have always been 2 camps... which is more
important...

1. Having a minimal amount of Server to deal with/maintain and letting the
   users suffer more than they need to.

2. Doing whatever it takes to make sure all the technology that is
   currently available is being put into play to provide the best USER
   experience possible.

I have always pitched my tent in camp # 2 and I think most people that are
serious about hosting Web sites circle their wagons around the same camp.

> But overall I agree completely that compressing content and adding
> more ram and/or a faster processor as appropriate is the right thing
> to do in many situations.

Fantastic! This was Peter's sole point and is now mine also. It's the RIGHT
thing to do.

None of the fine public domain and/or commercial products that provide
real-time content compression services are so bad as to render them
unusable, so there really isn't much excuse to NOT use them. I recommend
any/all of them. Sure... they can all get better... but so can HTTP itself.
Whatever is wrong with mod_deflate can be fixed, filter I/O and/or
compression engine performance included.

Yours...
Kevin Kiley
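
P.S. To make the 'streaming versus in-memory' point above a little more
concrete, here is a rough sketch of a one-shot, in-memory squeeze done with
nothing more than stock ZLIB's convenience call. To be clear: this is NOT
the mod_gzip engine and it is NOT mod_deflate's filter code... the buffer
names, the 100,000-byte figure and the compression level are made up purely
for illustration. The point it shows is that the whole body gets reduced in
memory BEFORE the first socket call, so the kernel and the NIC only ever
have to move the (much smaller) compressed byte count.

/*
 * Illustration only: one-shot, in-memory compression of a response
 * body using stock zlib's convenience API.  No temp files and no
 * streaming loop... everything is squeezed in a single call before
 * anything is handed to the I/O subsystem.
 */
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <zlib.h>

int main(void)
{
    /* Pretend this is the 100,000-byte response body from the example. */
    uLong  src_len = 100000;
    Bytef *src     = malloc(src_len);
    memset(src, 'x', src_len);       /* highly repetitive, like markup */

    /* Worst-case output size per the zlib docs: input + 0.1% + 12 bytes. */
    uLongf dst_len = src_len + src_len / 1000 + 12;
    Bytef *dst     = malloc(dst_len);

    /* Level 6 is a middle-of-the-road speed/ratio trade-off. */
    int rc = compress2(dst, &dst_len, src, src_len, 6);
    if (rc != Z_OK) {
        fprintf(stderr, "compress2 failed: %d\n", rc);
        return 1;
    }

    /* dst_len now holds the compressed size... that is all the CPU ever
     * has to push through the interrupt-driven I/O path. */
    printf("%lu bytes in, %lu bytes out (%.0f%% smaller)\n",
           (unsigned long)src_len, (unsigned long)dst_len,
           100.0 * (1.0 - (double)dst_len / (double)src_len));

    free(src);
    free(dst);
    return 0;
}

On a real body of markup you would expect something in the 60-80 percent
reduction range rather than the silly number this dummy buffer produces,
but the shape of the call is the same either way.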
