Re: the edge of chaos
"J" == Justin [EMAIL PROTECTED] writes: J I received a helpful recommendation to look into "lingerd" ... J that would seem one approach to solve this issue.. but a J lingerd setup is quite different from popular recommendations. I think that's mostly because lingerd is so new. I'm sure as people experiment with it we will see it incorporated into the docs and recommended setups if it holds up.
Re: the edge of chaos (URL correction)
My bad. It is www.dslreports.com/front/example.gif. Sorry for those curious enough to check the URL out.

On Thu, Jan 04, 2001 at 06:10:09PM -0500, Rick Myers wrote:
> On Jan 04, 2001 at 17:55:54 -0500, Justin twiddled the keys to say:
> > If you want to see what happens to actual output when this happens,
> > check this gif: http://www.dslreports.com/front/eth0-day.gif
>
> You sure about this URL? I get a 404...
>
> Rick Myers    [EMAIL PROTECTED]
>
> The Feynman Problem-Solving Algorithm:
>   1) Write down the problem.
>   2) Think real hard.
>   3) Write down the answer.
Re: the edge of chaos
Hi there,

On Thu, 4 Jan 2001, Justin wrote:
> So dropping maxclients on the front end means you get clogged up
> with slow readers instead, so that isn't an option..

Try looking for Randall's posts in the last couple of weeks. He has some nice stuff you might want to have a play with. Sorry, I can't remember the thread, but if you look in Geoff's DIGEST you'll find it.

Thanks again Geoff!

73,
Ged.
RE: the edge of chaos
> -----Original Message-----
> From: G.W. Haywood [mailto:[EMAIL PROTECTED]]
> Sent: Thursday, January 04, 2001 10:35 AM
> To: Justin
> Cc: [EMAIL PROTECTED]
> Subject: Re: the edge of chaos
>
> Hi there,
>
> On Thu, 4 Jan 2001, Justin wrote:
> > So dropping maxclients on the front end means you get clogged up
> > with slow readers instead, so that isn't an option..
>
> Try looking for Randall's posts in the last couple of weeks. He has
> some nice stuff you might want to have a play with. Sorry, I can't
> remember the thread, but if you look in Geoff's DIGEST you'll find it.

I think you mean this:
http://forum.swarthmore.edu/epigone/modperl/phoorimpjun

and this thread:
http://forum.swarthmore.edu/epigone/modperl/zhayflimthu
(which is actually a response to Justin :)

> Thanks again Geoff!

glad to be of service :)

--Geoff

> 73,
> Ged.
Re: the edge of chaos
"J" == Justin [EMAIL PROTECTED] writes: J When things get slow on the back end, the front end can fill with J 120 *requests* .. all queued for the 20 available modperl slots.. J hence long queues for service, results in nobody getting anything, You simply don't have enough horsepower to serve your load, then. Your options are: get more RAM, get faster CPU, make your application smaller by sharing more code (pretty much whatever else is in the tuning docs), or split your load across multiple machines. If your front ends are doing nothing but buffering the pages for the mod_perl backends, then you probably need to lower the ratio of frontends to back ends from your 6 to 1 to something like 3 to 1. -- =-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-= Vivek Khera, Ph.D.Khera Communications, Inc. Internet: [EMAIL PROTECTED] Rockville, MD +1-240-453-8497 AIM: vivekkhera Y!: vivek_khera http://www.khera.org/~vivek/
Re: the edge of chaos
Hi,

Thanks for the links! But I wasn't sure what in the first link was useful for this problem, and the vacuum bots discussion is really a different topic. I'm not talking about vacuum bot load. This is real-world load.

Practical experiments (ok - the live site :) convinced me that the well-recommended modperl setup of fe/be suffers from failure and much wasted page production when load rises just a little above *maximum sustainable throughput*.

If you want to see what happens to actual output when this happens, check this gif: http://www.dslreports.com/front/eth0-day.gif

From 11am to 4pm (in the jaggie middle section delineated by the red bars) I was madly doing sql server optimizations to get my head above water.. just before 11am, response time was sub-second. (That whole day represents about a million pages.) Minutes after 11am, response rose fast to 10-20 seconds, and few people would wait that long; they just hit stop.. (which doesn't provide my server any relief from their request). By 4pm I'd got the SQL server able to cope with the current load, and everything was fine after that..

This is all moot if you never plan to get anywhere near max throughput.. nevertheless.. as a business, if incoming load does rise (hopefully because of press) I'd rather lose 20% of visitors to a "sluggish" site than lose 100% of visitors because the site is all but dead..

I received a helpful recommendation to look into "lingerd" ... that would seem one approach to solve this issue.. but a lingerd setup is quite different from popular recommendations.

-Justin

On Thu, Jan 04, 2001 at 11:06:35AM -0500, Geoffrey Young wrote:
> > -----Original Message-----
> > From: G.W. Haywood [mailto:[EMAIL PROTECTED]]
> > Sent: Thursday, January 04, 2001 10:35 AM
> > To: Justin
> > Cc: [EMAIL PROTECTED]
> > Subject: Re: the edge of chaos
> >
> > Hi there,
> >
> > On Thu, 4 Jan 2001, Justin wrote:
> > > So dropping maxclients on the front end means you get clogged up
> > > with slow readers instead, so that isn't an option..
> >
> > Try looking for Randall's posts in the last couple of weeks. He has
> > some nice stuff you might want to have a play with. Sorry, I can't
> > remember the thread, but if you look in Geoff's DIGEST you'll find it.
>
> I think you mean this:
> http://forum.swarthmore.edu/epigone/modperl/phoorimpjun
>
> and this thread:
> http://forum.swarthmore.edu/epigone/modperl/zhayflimthu
> (which is actually a response to Justin :)
>
> > Thanks again Geoff!
>
> glad to be of service :)
>
> --Geoff

-- 
Justin Beech                  http://www.dslreports.com
Phone: 212-269-7052 x252      FAX inbox: 212-937-3800
mailto:[EMAIL PROTECTED] --- http://dslreports.com/contacts
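The collapse Justin describes falls out of simple queueing arithmetic: if offered load exceeds capacity even slightly, and abandoned requests still occupy the queue (STOP doesn't cancel anything), wait times grow without bound. A toy discrete-time sketch, with hypothetical round-number rates rather than measurements from his site:

```python
# Toy discrete-time queue: requests arrive at a fixed rate, the server
# drains at a fixed rate, and abandoned requests stay in the queue.
# The 20 pages/sec figure echoes Justin's posts; the rest is invented.

def wait_after(seconds, arrival_rate, service_rate):
    """Backlog size and wait time after `seconds` of sustained load."""
    backlog = 0.0
    for _ in range(seconds):
        backlog += arrival_rate                # new requests join the queue
        backlog -= min(backlog, service_rate)  # server drains what it can
    # A new arrival waits behind the whole backlog.
    return backlog, backlog / service_rate

# At exactly 100% load the queue stays empty:
print(wait_after(300, arrival_rate=20, service_rate=20))  # (0.0, 0.0)
# At 105% load (21 in, 20 out), five minutes of sustained overload
# leaves every visitor facing a 15-second wait; most will hit STOP:
print(wait_after(300, arrival_rate=21, service_rate=20))  # (300.0, 15.0)
```

The point of the sketch is that the cliff is not gradual: 5% over capacity is enough to drive waits past anyone's patience, which matches the "edge of chaos" behavior in the graph.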
Re: the edge of chaos
I need more horsepower. Yes, I'd agree with that! However... which web solution would you prefer:

A. (ideal)
   load equals horsepower: all requests serviced in <= 250ms
   load slightly more than horsepower: linear falloff in response
   time, as a function of % overload

..or..

B. (modperl + front end)
   load equals horsepower: all requests serviced in <= 250ms
   sustained load *slightly* more than horsepower: site too slow to
   be usable by anyone, few seeing pages

Don't all benchmarks (of disk, webservers, and so on) always continue increasing load well past optimal levels, to check there are no nasty surprises out there..?

regards
-justin

On Thu, Jan 04, 2001 at 11:10:25AM -0500, Vivek Khera wrote:
> You simply don't have enough horsepower to serve your load, then.
> Your options are: get more RAM, get a faster CPU, make your
> application smaller by sharing more code (pretty much whatever else
> is in the tuning docs), or split your load across multiple machines.
>
> If your front ends are doing nothing but buffering the pages for the
> mod_perl backends, then you probably need to lower the ratio of
> front ends to back ends from your 6 to 1 to something like 3 to 1.

-- 
Justin Beech                  http://www.dslreports.com
Phone: 212-269-7052 x252      FAX inbox: 212-937-3800
mailto:[EMAIL PROTECTED] --- http://dslreports.com/contacts
Re: the edge of chaos
I see 2 things here: a classic queueing problem, and the fact that swapping to disk is 1000s of times slower than serving from RAM.

If you receive 100 requests per second but only have the RAM to serve 99, then swapping to disk occurs, which slows down the entire system. The next second comes and 100 new requests arrive, plus the 1 you had in the queue that did not get serviced in the previous second. After a little while, your memory requirements start to soar, lots of swapping is occurring, and requests are coming in at a higher rate than can be serviced by an ever-slowing machine. This leads to a rapid downward spiral.

You must have enough RAM to service all the apache processes that are allowed to run at one time. It's been my experience that once swapping starts to occur, the whole thing is going to spiral downward very quickly. You either need to add more RAM, to service the number of apache processes that need to be running simultaneously, or you need to reduce MaxClients and let apache turn away requests.

-- 
___cliff [EMAIL PROTECTED]    http://www.genwax.com/

P.S. used your service several times with good results! (and no waiting) thanks!

Justin wrote:
> I need more horsepower. Yes, I'd agree with that! However... which
> web solution would you prefer:
>
> A. (ideal)
>    load equals horsepower: all requests serviced in <= 250ms
>    load slightly more than horsepower: linear falloff in response
>    time, as a function of % overload
>
> ..or..
>
> B. (modperl + front end)
>    load equals horsepower: all requests serviced in <= 250ms
>    sustained load *slightly* more than horsepower: site too slow to
>    be usable by anyone, few seeing pages
>
> Don't all benchmarks (of disk, webservers, and so on) always continue
> increasing load well past optimal levels, to check there are no nasty
> surprises out there..?
>
> regards
> -justin
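What makes cliff's swap spiral worse than plain queue growth is the feedback loop: once the working set exceeds RAM, the service rate itself drops, so the backlog compounds instead of growing linearly. A sketch with made-up numbers (the halving-under-swap rule is an illustration, not a model of any real VM system):

```python
# Sketch of the swap death-spiral: above a RAM threshold the machine
# swaps and its service rate degrades, so the backlog accelerates.
# All rates and the halving rule are invented for illustration.

def backlog_over_time(seconds, arrivals=100, capacity=100, ram_limit=110):
    backlog = 0
    served_rate = capacity
    history = []
    for _ in range(seconds):
        backlog += arrivals
        if backlog > ram_limit:
            # processes beyond RAM force swapping; serving slows down
            served_rate = max(1, served_rate // 2)
        else:
            served_rate = capacity           # fits in RAM: full speed
        backlog -= min(backlog, served_rate)
        history.append(backlog)
    return history

# 100 req/sec against 100 req/sec capacity: the backlog stays at zero.
print(backlog_over_time(15, arrivals=100))
# 101 req/sec: linear growth for ten seconds, then the RAM limit is
# crossed and the backlog explodes as the service rate halves away.
print(backlog_over_time(15, arrivals=101))
```

One request per second of excess load is invisible until the memory threshold is crossed; after that the machine loses ground faster every second, which is exactly the "rapid downward spiral" described above.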
Re: the edge of chaos
Justin wrote:
> Thanks for the links! But I wasn't sure what in the first link was
> useful for this problem, and the vacuum bots discussion is really a
> different topic. I'm not talking about vacuum bot load. This is
> real-world load. Practical experiments (ok - the live site :)
> convinced me that the well-recommended modperl setup of fe/be
> suffers from failure and much wasted page production when load rises
> just a little above *maximum sustainable throughput* ..

The fact that mod_proxy doesn't disconnect from the backend server when the client goes away is definitely a problem. I remember some discussion about this before, but I don't think there was a solution for it.

I think Vivek was correct in pointing out that your ultimate problem is that your system is not big enough for the load you're getting. If you can't upgrade your system to safely handle the load, one approach is to send some people away when the server gets too busy and provide decent service to the ones you do allow through.

You can try lowering MaxClients on the proxy to help with this. Then any requests going over that limit will get queued by the OS, and you'll never see them if the person on the other end gets tired of waiting and cancels. It's tricky, though, because you don't want a bunch of slow clients to tie up all of your proxy processes.

It's easy to adapt the existing mod_perl throttling handlers to send a short static "too busy" page when there are more than a certain number of concurrent requests on the site. Better to do this on the proxy side, though, so maybe mod_throttle could do it for you.

- Perrin
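The throttling policy Perrin describes is worth spelling out: count in-flight requests, and above a threshold answer immediately with a tiny static page rather than letting the client queue. A language-neutral sketch in Python (the real thing would live in a mod_perl handler or in mod_throttle on the proxy; the class, names, and limit here are invented for illustration):

```python
# Sketch of an admission-control throttle: admit up to a fixed number
# of concurrent requests, refuse the rest cheaply and immediately.

MAX_CONCURRENT = 100   # hypothetical limit, set below the proxy's MaxClients

class Throttle:
    def __init__(self, limit=MAX_CONCURRENT):
        self.limit = limit
        self.in_flight = 0

    def try_begin(self):
        """Admit the request, or refuse it with a cheap static answer."""
        if self.in_flight >= self.limit:
            # Turned away fast: no backend slot is tied up, and the
            # visitor sees *something* instead of a hung connection.
            return (503, "<html>Too busy - please retry shortly</html>")
        self.in_flight += 1
        return (200, None)   # caller now runs the expensive handler

    def end(self):
        """Release the slot when the request finishes (or is aborted)."""
        self.in_flight -= 1

t = Throttle(limit=2)
print(t.try_begin()[0])  # 200 - admitted
print(t.try_begin()[0])  # 200 - admitted
print(t.try_begin()[0])  # 503 - refused, queue never builds
```

The design choice matters for the thread's problem: refusing the 101st request costs almost nothing, while queueing it silently is what produces the site-wide collapse, since the refused visitor gets an honest answer instead of a timeout.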
Re: the edge of chaos
----- Original Message -----
From: "Justin" [EMAIL PROTECTED]
To: "Geoffrey Young" [EMAIL PROTECTED]
Cc: [EMAIL PROTECTED]
Sent: Thursday, January 04, 2001 4:55 PM
Subject: Re: the edge of chaos

> Practical experiments (ok - the live site :) convinced me that the
> well-recommended modperl setup of fe/be suffers from failure and much
> wasted page production when load rises just a little above *maximum
> sustainable throughput* ..

It doesn't take much math to realize that if you continue to try to accept connections faster than you can service them, the machine is going to die, and as soon as you load the machine to the point that you are swapping/paging memory to disk, the time to service a request will skyrocket.

Tune down MaxClients on both the front- and back-end httpds to what the machine can actually handle, and bump up the listen queue if you want to try to let the requests connect and wait for a process to handle them. If you aren't happy with the speed the machine can realistically produce, get another one (or more) and let the front end proxy to the other(s) running the backends.

Les Mikesell
[EMAIL PROTECTED]
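Les's two knobs map onto Apache 1.3 directives. A hypothetical httpd.conf fragment, with placeholder numbers that would need sizing against the actual RAM and process footprint of each machine:

```apache
# Front-end proxy httpd.conf (sketch; numbers are placeholders):
# only run as many processes as actually fit in RAM...
MaxClients      100
# ...and let the kernel's listen queue hold excess connections
# instead of spawning processes for them.
ListenBacklog   511

# Back-end mod_perl httpd.conf (sketch): sized to what the machine
# can really run concurrently without swapping.
# MaxClients    20
```

The trade-off is the one discussed throughout the thread: connections beyond MaxClients wait in the kernel queue at no memory cost, and if the visitor gives up before a process frees up, the server never spends any work on them.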
the edge of chaos
Hi, and happy new year!

My modperl/mysql setup does not degrade gracefully when reaching and pushing maximum pages per second :-)

If you could plot throughput, it rises to a ceiling, then collapses to half or less, then slowly recovers.. rinse and repeat.. During the collapses, nobody but real patient people are getting anything.. most page production is wasted: it goes from modperl -> modproxy -> /dev/null

I know exactly why.. it is because of a long virtual "request queue" enabled by the front end.. people "leave the queue" but their requests do not.. Pressing STOP on the browser does not seem to signal mod_proxy to cancel its pending request, or modperl to cancel its work, if it has started.. (In fact, if things get real bad, you can even find much of your backend stuck in an "R" state waiting for the Apache timeout variable to tick down to zero..)

Any thoughts on solving this? Am I wrong in wishing that STOP would function through all the layers?

thanks
-Justin
Re: the edge of chaos
This is not the solution... but it could be a bandaid until you find one. Set the MaxClients # lower.

# Limit on total number of servers running, i.e., limit on the number
# of clients who can simultaneously connect --- if this limit is ever
# reached, clients will be LOCKED OUT, so it should NOT BE SET TOO LOW.
# It is intended mainly as a brake to keep a runaway server from taking
# the system with it as it spirals down...
#
MaxClients 150

On Wed, Jan 03, 2001 at 10:25:04PM -0500, Justin wrote:
> Hi, and happy new year!
>
> My modperl/mysql setup does not degrade gracefully when reaching and
> pushing maximum pages per second :-) if you could plot throughput,
> it rises to a ceiling, then collapses to half or less, then slowly
> recovers.. rinse and repeat.. during the collapses, nobody but real
> patient people are getting anything.. most page production is
> wasted: it goes from modperl -> modproxy -> /dev/null
>
> I know exactly why.. it is because of a long virtual "request queue"
> enabled by the front end.. people "leave the queue" but their
> requests do not.. pressing STOP on the browser does not seem to
> signal mod_proxy to cancel its pending request, or modperl to cancel
> its work, if it has started.. (in fact if things get real bad, you
> can even find much of your backend stuck in an "R" state waiting for
> the Apache timeout variable to tick down to zero..)
>
> Any thoughts on solving this? Am I wrong in wishing that STOP would
> function through all the layers?
>
> thanks
> -Justin

Thanks,
Jeff
---------------------------------------------------
| "0201: Keyboard Error. Press F1 to continue."   |
|             -- IBM PC-XT Rom, 1982              |
---------------------------------------------------
| Jeff Sheffield                                  |
| [EMAIL PROTECTED]                               |
| AIM=JeffShef                                    |
---------------------------------------------------
Re: the edge of chaos
Yep, I am familiar with MaxClients..

There are two backend servers of 10 modperl processes each (MaxClients = start = 10). That's sized about right. They can all pump away at the same time doing about 20 pages per second. The problem comes when they are asked to do 21 pages per second :-)

There is one frontend mod_proxy.. it currently has MaxClients set to 120 processes (it doesn't serve images).. the actual number in use near peak output varies from 60 to 100, depending on the mix of clients using the system. Keepalive is *off* on that (again, since it doesn't serve images).

When things get slow on the back end, the front end can fill with 120 *requests* .. all queued for the 20 available modperl slots.. hence long queues for service, results in nobody getting anything, results in a dead site.

I don't mind performance limits, I just don't like the idea that pushing beyond 100% (which can even happen with one of the evil site hoovers hitting you) results in site death. So dropping maxclients on the front end means you get clogged up with slow readers instead, so that isn't an option..

-Justin

On Wed, Jan 03, 2001 at 11:57:17PM -0600, Jeff Sheffield wrote:
> This is not the solution... but it could be a bandaid until you find
> one. Set the MaxClients # lower.
>
> # Limit on total number of servers running, i.e., limit on the number
> # of clients who can simultaneously connect --- if this limit is ever
> # reached, clients will be LOCKED OUT, so it should NOT BE SET TOO LOW.
> # It is intended mainly as a brake to keep a runaway server from taking
> # the system with it as it spirals down...
> #
> MaxClients 150
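Justin's own figures make the failure timeline concrete. Only the 20-slot / 20-pages-per-second / 120-process numbers come from his posts; the rest below is arithmetic on a hypothetical 5% overload:

```python
# Back-of-envelope on the numbers in this thread: 20 modperl slots
# sustaining ~20 pages/sec behind a 120-process front end.

backend_rate = 20      # pages per second the backends can sustain
frontend_slots = 120   # proxy processes that can each hold one request
offered_load = 21      # hypothetical: 21 requests/sec, 5% over capacity

# The queue grows by (21 - 20) = 1 request per second, so the front
# end saturates in:
seconds_to_fill = frontend_slots / (offered_load - backend_rate)
print(seconds_to_fill)   # 120.0 seconds - two minutes to a full proxy

# Once full, a newly admitted request waits behind the whole queue:
floor_wait = frontend_slots / backend_rate
print(floor_wait)        # 6.0 seconds minimum, before any page work
```

Six seconds is a floor, not the observed wait: real pages take variable time and stopped browsers keep consuming slots, which is consistent with the 10-20 second responses reported once the site tipped over.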