Re: [PERFORM] Restricting Postgres
Matt - Very interesting information about squid effectiveness, thanks.

Martin, you mean your site had no images? No CSS files? No JavaScript files? Nearly everything is dynamic? I've found that our CMS spends more time sending a 23KB image to a dial-up user than it does generating and serving dynamic content. This means that if you have a lightweight Squid process that caches and serves your images and static content from its cache, then your Apache processes can truly focus on only the dynamic data.

Case in point: A first-time visitor hits your home page. A dynamic page is generated (in about 1 second) and served (taking 2 more seconds), which contains links to 20 additional files (images, styles, etc.). Then expensive Apache processes are used to serve each of those 20 files, which takes an additional 14 seconds. Your precious application server processes have now spent 14 seconds serving stuff that could have been served by an upstream cache.

I am all for using upstream caches and SSL accelerators to take the load off of application servers. My Apache children often take 16 or 20MB of RAM each. Why spend all of that on a 1.3KB image? Just food for thought.

There are people who use proxying in Apache to redirect expensive tasks to other servers that are dedicated to just one heavy challenge. In that case you likely do have 99% dynamic content.

Matthew Nuzum | Makers of Elite Content Management System
www.followers.net | View samples of Elite CMS in action
[EMAIL PROTECTED] | http://www.followers.net/portfolio/

-----Original Message-----
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Martin Foster

Matt Clark wrote:

> In addition we (as _every_ high-load site should) run Squid as an accelerator, which dramatically increases the number of client connections that can be handled. Across 2 webservers at peak times we've had 50,000 concurrently open http/https client connections to Squid, with 150 Apache children doing the work that Squid can't (i.e. all the dynamic stuff), and PG (on a separate box of course) whipping through nearly 800 mixed selects, inserts and updates per second - and then had to restart Apache on one of the servers for a config change... Not a problem :-)
>
> One little tip - if you run Squid on the same machine as Apache, and use a dual-proc box, then because Squid is single-threaded it will _never_ take more than half the CPU - nicely self-balancing in a way.
>
> M

I've heard of the merits of Squid in use as a reverse proxy. However, well over 99% of my traffic is dynamic, hence why I may be experiencing behavior that people normally do not expect. As I have said before in previous threads, the scripts are completely database-driven, and at the time the database averaged 65 queries per second under MySQL before a migration, while the webserver was averaging 2 to 4.

Martin Foster
Creator/Designer Ethereal Realms
[EMAIL PROTECTED]

---(end of broadcast)---
TIP 2: you can get off all lists at once with the unregister command
(send unregister YourEmailAddressHere to [EMAIL PROTECTED])
Re: [PERFORM] Restricting Postgres
To what extent would your problems be solved by having a 2nd server, a replication system (such as Slony-I, but there are others), and some sort of load-balancer in front of it? The load-balancing could be as simple as a round-robin DNS server, perhaps...

Then when you need to do maintenance such as a VACUUM FULL, you can temporarily take 1 server out of the load-balancer (I hope), do the maintenance, and then do the other. I don't know what that does to replication, but I would venture that replication systems should be designed to handle a node going offline.

Load balancing could also help to protect against server overload and 1 server toppling over. Of course, I don't know to what extent having another piece of hardware is an option for you.

cheers,
--Tim

-----Original Message-----
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Martin Foster
Sent: Friday, November 05, 2004 3:50 AM
To: [EMAIL PROTECTED]
Subject: Re: [PERFORM] Restricting Postgres

[...]

> Now is there an administrative command in PostgreSQL that will cause it to move into some sort of maintenance mode? For me that could be exceedingly useful, as it would still allow for an admin connection to be made to run a VACUUM FULL and such.
>
> Martin Foster
> Creator/Designer Ethereal Realms
> [EMAIL PROTECTED]
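The take-one-node-out-for-maintenance idea sketches out quite simply. This is an illustration only (the class and host names are hypothetical; a real setup would use a DNS server or hardware balancer, not application code): a round-robin picker that skips any backend flagged as under maintenance, so a VACUUM FULL on one box just shifts traffic to the other.

```python
from itertools import cycle

class RoundRobinPool:
    """Round-robin over backend hosts, skipping any flagged as
    'in maintenance' (e.g. while a VACUUM FULL runs there)."""

    def __init__(self, hosts):
        self.hosts = list(hosts)
        self.down = set()
        self._it = cycle(self.hosts)

    def set_maintenance(self, host, flag=True):
        # Flag or unflag a host; flagged hosts receive no traffic.
        (self.down.add if flag else self.down.discard)(host)

    def pick(self):
        # At most one full rotation; None if every host is down.
        for _ in range(len(self.hosts)):
            h = next(self._it)
            if h not in self.down:
                return h
        return None

pool = RoundRobinPool(["db1", "db2"])
pool.set_maintenance("db1")            # take db1 out for maintenance
assert pool.pick() == "db2"            # all traffic lands on db2
pool.set_maintenance("db1", False)     # and db1 rejoins afterwards
```

As the message notes, the replication layer still has to tolerate the node going offline; this only covers the traffic-steering half.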
Re: [PERFORM] Restricting Postgres
> I have a dual processor system that can support over 150 concurrent connections handling normal traffic and load. Now suppose I setup Apache to spawn all of its children instantly, what will ... This will spawn 150 children in a short order of time and as this takes

"Doctor, it hurts when I do this!" "Well, don't do that then..." Sorry, couldn't resist ;-)

Our Apache/PG driven website also needs to be able to deal with occasional large peaks, so what we do is:

    StartServers       15    # Don't create too many children initially
    MinSpareServers    10    # Always have at least 10 spares lying around
    MaxSpareServers    20    # But no more than 20
    MaxClients        150    # Up to 150 - the default 256 is too much for our RAM

So on server restart 15 Apache children are created, then one new child every second up to a maximum of 150. Apache's 'ListenBackLog' is around 500 by default, so there's plenty of scope for queuing inbound requests while we wait for sufficient children to be spawned.

In addition we (as _every_ high-load site should) run Squid as an accelerator, which dramatically increases the number of client connections that can be handled. Across 2 webservers at peak times we've had 50,000 concurrently open http/https client connections to Squid, with 150 Apache children doing the work that Squid can't (i.e. all the dynamic stuff), and PG (on a separate box of course) whipping through nearly 800 mixed selects, inserts and updates per second - and then had to restart Apache on one of the servers for a config change... Not a problem :-)

One little tip - if you run Squid on the same machine as Apache, and use a dual-proc box, then because Squid is single-threaded it will _never_ take more than half the CPU - nicely self-balancing in a way.

M
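The arithmetic behind that ramp-up is worth making explicit (a sketch, assuming the one-new-child-per-second rate described above; the function name is mine, not an Apache API):

```python
def seconds_to_full_capacity(start_servers, max_clients, spawn_per_sec=1):
    """Time for Apache to ramp from its initial child count to
    MaxClients, spawning at most spawn_per_sec children per second."""
    needed = max(0, max_clients - start_servers)
    return needed / spawn_per_sec

# With the config above: 150 - 15 = 135 children still to spawn at one
# per second, so a bit over two minutes to reach full capacity, during
# which ListenBackLog (~500) queues the excess inbound requests.
print(seconds_to_full_capacity(15, 150))  # 135.0
```

That two-minute window is exactly why the gentle ramp plus a deep listen backlog beats spawning all 150 children (and hence 150 PG backends) at once.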
Re: [PERFORM] Restricting Postgres
On Wed, 2004-11-03 at 21:25, Martin Foster wrote:

> Simon Riggs wrote:
> > On Tue, 2004-11-02 at 23:52, Martin Foster wrote:
> > > Is there a way to restrict how much load a PostgreSQL server can take before dropping queries in order to safeguard the server? I was looking at the login.conf (5) man page and while it allows me to limit by processor time this seems to not fit my specific needs. Essentially, I am looking for a sort of functionality similar to what Sendmail and Apache have. Once the load of the system reaches a certain defined limit the daemon drops tasks until such a time that it can resume normal operation.
> >
> > Sounds great... could you give more shape to the idea, so people can comment on it? What limit? Measured how? Normal operation is what? Drop what? How to tell?
>
> Let's use the example in Apache: there is the Apache::LoadAvgLimit mod_perl module which allows one to limit based on the system load averages. Here is an example of the configuration one would find:
>
>     <Location /perl>
>       PerlInitHandler Apache::LoadAvgLimit
>       PerlSetVar LoadAvgLimit_1 3.00
>       PerlSetVar LoadAvgLimit_5 2.00
>       PerlSetVar LoadAvgLimit_15 1.50
>       PerlSetVar LoadAvgRetryAfter 120
>     </Location>
>
> The end state is simple: once the load average moves above 3.00 for the 1-minute average, the web server will not process the CGI scripts or mod_perl applications under that directory. Instead it will return a 503 error and save the system from being crushed by ever-increasing load averages. Only once the load average is below the defined limits will the server process requests as normal.
>
> This is not necessarily the nicest or cleanest way of doing things, but it does allow the Apache web server to prevent a collapse. There are ways of restricting the size of files, the number of concurrent processes and even the memory being used by a daemon. This can be done through ulimit or the login.conf file if your system supports it. However, there is no way to restrict based on load averages, only processor time, which is ineffective for a perpetually running daemon like PostgreSQL.

All workloads are not created equally, so mixing them can be tricky. This will be better in 8.0 because seq scans don't spoil the cache. Apache is effectively able to segregate the workloads because each workload is in a directory. SQL isn't stored anywhere for PostgreSQL to say "just those ones, please", so defining which statements are in which workload is the tricky part. PostgreSQL workload management could look at userid, tables, processor load (?) and estimated cost to decide what to do.

There is a TODO item on limiting the number of connections per userid/group, in addition to the max number of sessions per server. Perhaps the easiest way would be to have the Apache workloads segregated by PostgreSQL userid, then limit connections to each.

> For example, using Apache::DBI or pgpool, the DBMS may be required to spawn a great number of child processes in a short order of time. This in turn can cause a major spike in processor load, and if unchecked by high-demand queries the system can literally increase in load until the server buckles.

That's been nicely covered off by John and Matt on the other threads, so you're sorted out for now, and it doesn't look like a bug in PostgreSQL.

> Of course, I am not blaming PostgreSQL; there are probably some instabilities in the AMD64 port of FreeBSD 5.2.1 for dual processor systems that lead to an increased chance of failure instead of recovery.

Good!

-- Best Regards, Simon Riggs
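For anyone without mod_perl, the same load-average gate is a few lines of plain code. This is a sketch only (the function name and the front-end wiring are mine): the thresholds mirror the LoadAvgLimit example in the message, and `os.getloadavg()` is a real stdlib call but only available on Unix-like systems.

```python
import os

# 1-, 5- and 15-minute thresholds, as in the LoadAvgLimit example
LIMITS = {0: 3.00, 1: 2.00, 2: 1.50}

def over_limit(loadavg=None):
    """Return True if any load average exceeds its threshold, i.e.
    the request should get a 503 plus a Retry-After header instead
    of being handed to the expensive script."""
    if loadavg is None:
        loadavg = os.getloadavg()        # (1min, 5min, 15min) tuple
    return any(loadavg[i] > limit for i, limit in LIMITS.items())

# A CGI/WSGI wrapper would then do roughly:
#   if over_limit():
#       respond "503 Service Unavailable", "Retry-After: 120"
assert over_limit((3.5, 0.1, 0.1)) is True
assert over_limit((2.9, 1.9, 1.4)) is False
```

Run as a cron or wrapper script in front of the database-hitting code, this gives the "shed load above a threshold" behavior being asked for, without any PostgreSQL support.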
Re: [PERFORM] Restricting Postgres
Kevin Barnard wrote:

> I am generally interested in a good solution for this. So far our solution has been to increase the hardware to the point of allowing 800 connections to the DB. I don't have the mod loaded for Apache, but we haven't had too many problems there. The site is split pretty well between dynamic and non-dynamic; it's largely Flash with several plugins to the DB. However we still can be, and have been, slammed right up to the point of the 800 connections.
>
> What I don't get is: why not use pgpool? This should eliminate the rapid-fire forking of postgres instances on the DB server. I'm assuming your app can safely handle a failure to connect to the DB (i.e. exceeding the number of DB connections). If not, it should be fairly simple to send a 503 header when it's unable to get the connection.

Note that I am not necessarily looking for a PostgreSQL solution to the matter. Just a way to prevent the database from killing off the server it sits on, by looking at the load averages.

I have attempted to make use of pgpool and have had some very poor performance. There were constant error messages being sounded, load averages on that machine seemed to skyrocket, and it just seemed not suited to my needs. Apache::DBI overall works better for what I require, even if it is not a pool per se.

Now if pgpool supported variable-rate pooling like Apache does with its children, it might help to even things out. That, and you'd still get the spike if you have to start the webserver and database server at or around the same time.

Martin Foster
Creator/Designer Ethereal Realms
[EMAIL PROTECTED]
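Kevin's "send a 503 when you can't get a connection" suggestion can be sketched with a tiny bounded pool. Everything here is illustrative (class and parameter names are mine; a real pool would hold actual DB handles from Apache::DBI, pgpool, or a driver, not stand-in objects), but the shape of the failure path is the point:

```python
import queue

class BoundedPool:
    """Hand out at most max_size connections. Callers that cannot get
    one within `timeout` seconds answer the client with HTTP 503
    instead of piling more backends onto the database box."""

    def __init__(self, make_conn, max_size):
        self._q = queue.Queue()
        for _ in range(max_size):
            self._q.put(make_conn())

    def acquire(self, timeout=0.1):
        try:
            return self._q.get(timeout=timeout)
        except queue.Empty:
            return None               # caller maps this to a 503 response

    def release(self, conn):
        self._q.put(conn)

pool = BoundedPool(make_conn=object, max_size=2)   # stand-in "connections"
a = pool.acquire()
b = pool.acquire()
assert pool.acquire() is None         # pool exhausted -> would send 503
pool.release(a)
assert pool.acquire() is not None     # a freed slot is reusable
```

The key property is that overload turns into fast, cheap 503s at the web tier rather than a forking storm of postmaster children.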
Re: [PERFORM] Restricting Postgres
> Apache::DBI overall works better for what I require, even if it is not a pool per se. Now if pgpool supported variable-rate pooling like Apache does with its children, it might help to even things out. That, and you'd still get the spike if you have to start the webserver and database server at or around the same time.

I still don't quite get it though - you shouldn't be getting more than one child per second being launched by Apache, so that's only one PG postmaster per second, which is really a trivial load. That is, unless you have 'StartServers' set high, in which case the 'obvious' answer is to lower it. Are you launching multiple DB connections per Apache process as well?
Re: [PERFORM] Restricting Postgres
> Case in point: A first time visitor hits your home page. A dynamic page is generated (in about 1 second) and served (taking 2 more seconds) which contains links to 20 additional

The gain from an accelerator is actually even more than that, as it takes essentially zero seconds for Apache to return the generated content (which in the case of a message board could be quite large) to Squid, which can then feed it slowly to the user, leaving Apache free again to generate another page. When serving dialup users large dynamic pages this can be a _huge_ gain. I think Martin's pages (dimly recalling another thread) take a pretty long time to generate though, so he may not see quite such a significant gain.
Re: [PERFORM] Restricting Postgres
Myself, I like a small Apache with few modules serving static files (no dynamic content, no DB connections), and with a mod_proxy on a special path directed to another Apache which generates the dynamic pages (few processes, persistent connections...). You get the best of both: static files do not hog DB connections, and the second Apache sends generated pages very fast to the first, which then trickles them down to the clients.

> > Case in point: A first time visitor hits your home page. A dynamic page is generated (in about 1 second) and served (taking 2 more seconds) which contains links to 20 additional
>
> The gain from an accelerator is actually even more than that, as it takes essentially zero seconds for Apache to return the generated content (which in the case of a message board could be quite large) to Squid, which can then feed it slowly to the user, leaving Apache free again to generate another page. When serving dialup users large dynamic pages this can be a _huge_ gain. I think Martin's pages (dimly recalling another thread) take a pretty long time to generate though, so he may not see quite such a significant gain.
Re: [PERFORM] Restricting Postgres
Matt Clark wrote:

> > Case in point: A first time visitor hits your home page. A dynamic page is generated (in about 1 second) and served (taking 2 more seconds) which contains links to 20 additional
>
> The gain from an accelerator is actually even more than that, as it takes essentially zero seconds for Apache to return the generated content (which in the case of a message board could be quite large) to Squid, which can then feed it slowly to the user, leaving Apache free again to generate another page. When serving dialup users large dynamic pages this can be a _huge_ gain. I think Martin's pages (dimly recalling another thread) take a pretty long time to generate though, so he may not see quite such a significant gain.

Correct: 75% of all hits are on a script that can take anywhere from a few seconds to half an hour to complete. The script essentially auto-flushes to the browser so they get new information as it arrives, creating the illusion of on-demand generation. A Squid proxy would probably cause severe problems when dealing with a script that does not complete its output for a variable amount of time.

As for images, CSS, JavaScript and such, the site makes use of them, but in the grand scheme of things the amount of traffic they tie up is literally inconsequential. Though I will probably move all of that onto another server, just to give the main server the capability of dealing with almost exclusively dynamic content.

Martin Foster
Creator/Designer Ethereal Realms
[EMAIL PROTECTED]
Re: [PERFORM] Restricting Postgres
Matt Clark wrote:

> > Apache::DBI overall works better for what I require, even if it is not a pool per se. Now if pgpool supported variable-rate pooling like Apache does with its children, it might help to even things out. That, and you'd still get the spike if you have to start the webserver and database server at or around the same time.
>
> I still don't quite get it though - you shouldn't be getting more than one child per second being launched by Apache, so that's only one PG postmaster per second, which is really a trivial load. That is, unless you have 'StartServers' set high, in which case the 'obvious' answer is to lower it. Are you launching multiple DB connections per Apache process as well?

I have 'StartServers' set to a fairly high limit. However, this would make little difference overall if I restarted the webservers to load in new modules during a high-load time. When I am averaging 145 concurrent connections before a restart, I can expect that many requests to hit the server once Apache begins to respond. As a result, it will literally cause a spike on both machines as new connections are initiated at a high rate. In my case I don't always have the luxury of waiting till 0300 just to test a change.

Again, I am not necessarily looking for a PostgreSQL solution. I am looking for a method that would allow the database or the OS itself to protect the system it's hosted on. If both the database and the Apache server were on the same machine, this type of scenario would be unstable to say the least.

Martin Foster
Creator/Designer Ethereal Realms
[EMAIL PROTECTED]
Re: [PERFORM] Restricting Postgres
> Correct: 75% of all hits are on a script that can take anywhere from a few seconds to half an hour to complete. The script essentially auto-flushes to the browser so they get new information as it arrives, creating the illusion of on-demand generation.

This is more like a streaming data server, which is a very different beast from a webserver, and probably better suited to the job. Usually either multithreaded or single-process using select() (just like Squid). You could probably build one pretty easily. Using a 30MB Apache process to serve one client for half an hour seems like a hell of a waste of RAM.

> A Squid proxy would probably cause severe problems when dealing with a script that does not complete its output for a variable amount of time.

No, it's fine: Squid gives it to the client as it gets it, but can receive from the server faster.
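The single-process select() pattern Matt describes fits in a few lines. The sketch below is a toy (it uses an in-process socketpair to stand in for a real client connection, and a production server would also select() for readability, accept new clients, and handle disconnects), but it shows the core idea: never block on one slow client while others are writable.

```python
import select
import socket

def drip_feed(clients, chunks):
    """Write each pending chunk to whichever clients are currently
    writable - the select() loop at the heart of a streaming server."""
    pending = {c: list(chunks) for c in clients}
    while any(pending.values()):
        candidates = [c for c in pending if pending[c]]
        _, ready, _ = select.select([], candidates, [], 1.0)
        for c in ready:
            c.sendall(pending[c].pop(0))

# Demo: one "client" on the other end of a socketpair.
server_end, client_end = socket.socketpair()
drip_feed([server_end], [b"tick ", b"tock"])
print(client_end.recv(64))
```

This is why a select()-based process can hold thousands of half-hour connections in a few MB, where each Apache child costs tens of MB to do the same job.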
Re: [PERFORM] Restricting Postgres
On Thu, 4 Nov 2004 18:20:18 -, Matt Clark [EMAIL PROTECTED] wrote:

> Correct: 75% of all hits are on a script that can take anywhere from a few seconds to half an hour to complete. The script essentially auto-flushes to the browser so they get new information as it arrives, creating the illusion of on-demand generation.

Er, do you mean that:
1- You have a query that runs for half an hour and you spoon-feed the results to the client? (argh)
2- Your script looks for new data every few seconds, sends a packet, then sleeps, and loops?

If it's 2, I have a ready-made solution for you, just ask.
Re: [PERFORM] Restricting Postgres
Matt Clark wrote:

> > Correct: 75% of all hits are on a script that can take anywhere from a few seconds to half an hour to complete. The script essentially auto-flushes to the browser so they get new information as it arrives, creating the illusion of on-demand generation.
>
> This is more like a streaming data server, which is a very different beast from a webserver, and probably better suited to the job. Usually either multithreaded or single-process using select() (just like Squid). You could probably build one pretty easily. Using a 30MB Apache process to serve one client for half an hour seems like a hell of a waste of RAM.

These are CGI scripts at the lowest level, nothing more and nothing less. While I could probably embed a small webserver directly into the Perl scripts and run that as a daemon, it would take away the portability that the scripts currently offer.

This should be my last question on the matter: does Squid report the proper IP address of the clients themselves? That's a critical requirement for the scripts.

Martin Foster
Creator/Designer Ethereal Realms
[EMAIL PROTECTED]
Re: [PERFORM] Restricting Postgres
On Thu, Nov 04, 2004 at 03:30:19PM -0500, Martin Foster wrote:

> This should be my last question on the matter: does Squid report the proper IP address of the clients themselves? That's a critical requirement for the scripts.

AFAIK it's in a header; I believe it's called X-Forwarded-For. If you're using caching, your script will obviously be called fewer times than usual, though, so be careful about relying too much on side effects. :-)

(This is, of course, exactly the same as if the client side uses a caching proxy. Saying anything more is impossible without knowing exactly what you are doing, though :-) )

/* Steinar */ -- Homepage: http://www.sesse.net/
Re: [PERFORM] Restricting Postgres
> 1- You have a query that runs for half an hour and you spoon-feed the results to the client? (argh)
> 2- Your script looks for new data every few seconds, sends a packet, then sleeps, and loops?
>
> If it's 2, I have a ready-made solution for you, just ask.

I'm guessing (2) - PG doesn't give the results of a query as a stream.
Re: [PERFORM] Restricting Postgres
> These are CGI scripts at the lowest level, nothing more and nothing less. While I could probably embed a small webserver directly into the Perl scripts and run that as a daemon, it would take away the portability that the scripts currently offer.

If they're CGI *scripts* then they just use the CGI environment, not Apache, so a daemon that accepts the inbound connections, then compiles the scripts a-la Apache::Registry, but puts each in a separate thread, would be, er, relatively easy for someone better at multithreaded stuff than me.

> This should be my last question on the matter: does Squid report the proper IP address of the clients themselves? That's a critical requirement for the scripts.

It's in the X-Forwarded-For header. Not that you can be sure you're seeing the true client IP anyway, if they've gone through an ISP proxy beforehand.
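Pulling the client address out of that header is a one-liner in any CGI environment, though as Matt says the result is only as trustworthy as the proxies that appended to it. A sketch (the function name is mine; the de-facto header syntax is a comma-separated `client, proxy1, proxy2` list, with the leftmost entry being the original client):

```python
def client_ip(environ):
    """Best-effort client address for a CGI script running behind Squid.

    X-Forwarded-For holds 'client, proxy1, proxy2, ...'; take the
    leftmost entry. Fall back to REMOTE_ADDR, which behind an
    accelerator is the proxy's own address.
    """
    xff = environ.get("HTTP_X_FORWARDED_FOR", "")
    if xff:
        return xff.split(",")[0].strip()
    return environ.get("REMOTE_ADDR", "")

assert client_ip({"HTTP_X_FORWARDED_FOR": "10.0.0.9, 192.168.1.1"}) == "10.0.0.9"
assert client_ip({"REMOTE_ADDR": "127.0.0.1"}) == "127.0.0.1"
```

Since the header is client-forgeable, anything security-sensitive should only trust entries appended by your own Squid, not the leftmost value blindly.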
Re: [PERFORM] Restricting Postgres
> I'm guessing (2) - PG doesn't give the results of a query as a stream.

In 1- I was thinking about a cursor... but I think his problem is more like 2-. In that case one can either code a special-purpose server or use the following hack:

In your webpage, include an iframe with JavaScript to refresh it every five seconds. The iframe fetches a page from the server which brings in the new data in the form of generated JavaScript which writes to the parent window. Thus, you get a very short request every 5 seconds to fetch new data, and it is displayed in the client's window very naturally.

I've used this technique for another application and find it very cool. It's for selection lists: often you'll see a list of things to be checked or not, which makes a big form that people forget to submit. So I've replaced the checkboxes with clickable zones which trigger the loading of a page in a hidden iframe, which makes the appropriate modifications in the database and updates the HTML in the parent page, changing texts here and there... It feels less like a webpage and more like a standard GUI. Very neat. Changes are recorded without needing a submit button... I should write a framework to make that easy to do. I did not use a frame because frames suck, but iframes are convenient. Yeah, it does not work with Lynx... it needs JavaScript... but it works well.
Re: [PERFORM] Restricting Postgres
> In your webpage, include an iframe with JavaScript to refresh it every five seconds. The iframe fetches a page from the server which brings in the new data in the form of generated JavaScript which writes to the parent window. Thus, you get a very short request every 5 seconds to fetch new data, and it is displayed in the client's window very naturally. ...

Yup. If you go the JS route then you can do even better by using JS to load data into JS objects in the background and manipulate the page content directly - no need for even an iframe. Ignore the dullards who have JS turned off - it's essential for modern web apps, and refusing JS in no way conflicts with proper semantic markup. http://developer.apple.com/internet/webcontent/xmlhttpreq.html is a good starting point.

It's clear that this discussion has moved way away from PG! Although in the context of DB-backed web apps I guess it remains a bit on-topic...

M
Re: [PERFORM] Restricting Postgres
On Thu, Nov 04, 2004 at 22:37:06 +, Matt Clark [EMAIL PROTECTED] wrote:

> ... Yup. If you go the JS route then you can do even better by using JS to load data into JS objects in the background and manipulate the page content directly - no need for even an iframe. Ignore the dullards who have JS turned off - it's essential for modern web apps.

JavaScript is too powerful to turn on for any random web page. It is only "essential" for web pages because people write their web pages to only work with JavaScript.
Re: [PERFORM] Restricting Postgres
> JavaScript is too powerful to turn on for any random web page. It is only "essential" for web pages because people write their web pages to only work with JavaScript.

Hmm... I respectfully disagree. It is so powerful that it is impossible to ignore when implementing a sophisticated app, and it is not dangerous to the user so long as they have a popup blocker. Commercially, I can ignore the people who turn it off, and I can gain a huge benefit from knowing that 95% of people have it turned on, because it gives my users a hugely better experience than the equivalent XHTML-only page (which I deliver, and which works, but which is a fairly depressing experience compared to the JS-enabled version).

It is _amazing_ how much crud you can take out of a page if you let JS do the dynamic stuff (with CSS still in full control of the styling). Nice, clean, semantically sensible XHTML that can be transformed for multiple devices - it's great.

An example: <a class="preview_link" href="/previews/foo.wmv">...</a>. But we want it to appear in a popup when viewed on certain devices. Easy - attach an 'onclick' event handler (or just set the target attribute) when the device has a suitable media player, but leave the markup clean for the rest of the world.
Re: [PERFORM] Restricting Postgres
Check this marvelous piece of 5 minutes of work: http://boutiquenumerique.com/test/iframe_feed.html

> Yup. If you go the JS route then you can do even better by using JS to load data into JS objects in the background and manipulate the page content directly - no need for even an iframe. http://developer.apple.com/internet/webcontent/xmlhttpreq.html is a good starting point.

Didn't know this existed! Very, very cool. I have to check this out more in depth. A note though: you'll have to turn off HTTP persistent connections in your server (not in your proxy), or you're back to square one.

> It's clear that this discussion has moved way away from PG! Although in the context of DB-backed web apps I guess it remains a bit on-topic...

I find it very on-topic, as:
- it's a way to help this guy solve his PG problem, which was in fact a design problem
- it's the future of database-driven web apps (no more reloading the whole page!)

I think in the future there will be a good bit of presentation logic in the client...
Re: [PERFORM] Restricting Postgres
> A note though: you'll have to turn off HTTP persistent connections in your server (not in your proxy), or you're back to square one.

I hadn't considered that. On the client side it would seem to be up to the client whether to use a persistent connection or not. If it does, then yeah, a request every 5 seconds would still just hold open a server. One more reason to use a proxy, I s'pose.

> > It's clear that this discussion has moved way away from PG! Although in the context of DB-backed web apps I guess it remains a bit on-topic...
>
> I find it very on-topic... I think in the future there will be a good bit of presentation logic in the client...

Not if Bruno has his way ;-)
Re: [PERFORM] Restricting Postgres
Pierre-Frédéric Caillaud wrote:

> Check this marvelous piece of 5 minutes of work: http://boutiquenumerique.com/test/iframe_feed.html

Cela m'a fait le sourire :-) (apologies for bad French)

M
Re: [PERFORM] Restricting Postgres
Matt Clark wrote:

> Pierre-Frédéric Caillaud wrote:
> > Check this marvelous piece of 5 minutes of work: http://boutiquenumerique.com/test/iframe_feed.html
>
> Cela m'a fait le sourire :-) (apologies for bad French)
>
> M

JavaScript is not an option for the scripts; one of the mandates of the project is to support as many different client setups as possible, and we have encountered everything from WebTV to the latest Firefox release. It's a chat/roleplay community, and not everyone will invest in new equipment.

Now, it would seem to me that there is a trade-off between a JS push system and a constant, ever-present process. With the traditional method as I use it, a client incurs the initial penalty of going through authentication and pulling the look and feel of the realms and sites, and simply polls one table from that point on. On the other hand, with polling you have one user making a call for new posts every X seconds. This means every X seconds the penalty for authentication and design would kick in, increasing the overall load. The current scripts can also be dynamically adapted to slow things down based on heavy load, or on quiet realms that bring few posts in. It's much harder to expect JavaScript solutions to work perfectly every time and not be modified by some proxy.

Unfortunately, we are getting way off track. I'm looking for a way to protect the PostgreSQL server, either from PostgreSQL itself or via some sort of external script which polls the load average once in a while to make that determination. Now, is there an administrative command in PostgreSQL that will cause it to move into some sort of maintenance mode? For me that could be exceedingly useful, as it would still allow for an admin connection to be made to run a VACUUM FULL and such.

Martin Foster
Creator/Designer Ethereal Realms
[EMAIL PROTECTED]
Re: [PERFORM] Restricting Postgres
On Thu, Nov 04, 2004 at 23:32:57 +, Matt Clark [EMAIL PROTECTED] wrote:
>> I think in the future there will be a good bit of presentation logic in the client...
> Not if Bruno has his way ;-)

Sure there will, but it will be controlled by the client, perhaps taking suggestions from the style sheet pointed to by the document. Running foreign code from random or even semi-random places is a recipe for becoming a spam server; see examples from Microsoft such as their spreadsheet and office software. Documents really need to be passive data, not active code.

If the client and the server have a special trust relationship, then running code supplied by the server makes sense. So you might use JavaScript within a business where the IT department runs the server and the employees run the clients. However, encouraging people to browse the Internet with JavaScript enabled is a bad idea.
Re: [PERFORM] Restricting Postgres
On Tue, Nov 02, 2004 at 11:52:12PM +, Martin Foster wrote:
> Is there a way to restrict how much load a PostgreSQL server can take before dropping queries in order to safeguard the server? I was

Well, you could limit the number of concurrent connections, and set the query timeout to a relatively low level. What that ought to mean is that, under heavy load, some queries will abort.

A
--
Andrew Sullivan | [EMAIL PROTECTED]
When my information changes, I alter my conclusions. What do you do, sir?
--attr. John Maynard Keynes
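Both of Andrew's knobs are standard PostgreSQL settings. A hedged sketch of what that might look like in postgresql.conf follows; the values are purely illustrative, and should be tuned to the workload:

```
# postgresql.conf (values illustrative)
max_connections = 40         # cap the number of concurrent backends
statement_timeout = 5000     # abort any query running longer than 5s (milliseconds; 0 disables)
```

statement_timeout can also be set per session (`SET statement_timeout = 5000;`), which lets you apply a strict limit to web-facing connections while leaving administrative sessions unrestricted.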
Re: [PERFORM] Restricting Postgres
On Tue, 2004-11-02 at 23:52, Martin Foster wrote:
> Is there a way to restrict how much load a PostgreSQL server can take before dropping queries in order to safeguard the server? I was looking at the login.conf(5) man page, and while it allows me to limit by processor time, this seems not to fit my specific needs.
>
> Essentially, I am looking for a sort of functionality similar to what Sendmail and Apache have. Once the load of the system reaches a certain defined limit, the daemon drops tasks until such time as it can resume normal operation.

Sounds great... could you give more shape to the idea, so people can comment on it? What limit? Measured how? Normal operation is what? Drop what? How to tell?

> While not necessarily common on my servers, I have witnessed some fairly high load averages which may have led to the machine dropping outright. Any help on this matter would be appreciated.

You can limit the number of connections overall?

--
Best Regards, Simon Riggs
Re: [PERFORM] Restricting Postgres
Simon Riggs wrote:
> On Tue, 2004-11-02 at 23:52, Martin Foster wrote:
>> Is there a way to restrict how much load a PostgreSQL server can take before dropping queries in order to safeguard the server?
>
> Sounds great... could you give more shape to the idea, so people can comment on it? What limit? Measured how? Normal operation is what? Drop what? How to tell?

Let's use the example in Apache: there is the Apache::LoadAvgLimit mod_perl module, which allows one to limit based on the system load averages. Here is an example of the configuration one would find:

  <Location /perl>
    PerlInitHandler Apache::LoadAvgLimit
    PerlSetVar LoadAvgLimit_1 3.00
    PerlSetVar LoadAvgLimit_5 2.00
    PerlSetVar LoadAvgLimit_15 1.50
    PerlSetVar LoadAvgRetryAfter 120
  </Location>

The end state is simple: once the load average moves above 3.00 for the one-minute average, the web server will not process the CGI scripts or mod_perl applications under that directory. Instead it will return a 503 error and save the system from being crushed by ever-increasing load averages. Only once the load average is below the defined limits will the server process requests as normal.

This is not necessarily the nicest or cleanest way of doing things, but it does allow the Apache web server to prevent a collapse.

There are ways of restricting the size of files, the number of concurrent processes, and even the memory being used by a daemon. This can be done through ulimit or the login.conf file if your system supports it. However, there is no way to restrict based on load averages, only processor time, which is ineffective for a perpetually running daemon like PostgreSQL.
> While not necessarily common on my servers, I have witnessed some fairly high load averages which may have led to the machine dropping outright. Any help on this matter would be appreciated.
>
> You can limit the number of connections overall?

Limiting concurrent connections is not always the solution to the problem. Problems can occur when there is a major spike in activity that would be considered abnormal, due to outside conditions. For example, using Apache::DBI or pgpool, the DBMS may be required to spawn a great many child processes in short order. This in turn can cause a major spike in processor load, and if compounded by high-demand queries the system can literally increase in load until the server buckles.

I've seen this behavior before when restarting the web server during heavy loads. Apache goes from zero connections to a solid 120, causing PostgreSQL to spawn that many children in short order just to keep up with the demand. PostgreSQL pays a penalty when forking a new backend and accepting a connection, and that slowdown consumes resources at every level. Meanwhile, clients on the web server are hitting it at an accelerated rate because of the slowed response, leading to even more demand being placed on both machines.

In most cases the processor will be taxed and the load average high enough to cause a noticeable delay even on a console, but the system will generally recover... slowly, or in rare cases crash outright. In such a circumstance, having the database server refuse queries when the sanity of the system is at stake might come in handy.

Of course, I am not blaming PostgreSQL; there are probably some instabilities in the AMD64 port of FreeBSD 5.2.1 for dual-processor systems that lead to an increased chance of failure instead of recovery. However, if there were a way to prevent the process from reaching those limits, it might avoid the problem altogether.
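The external watchdog Martin asks for can be approximated outside PostgreSQL. Below is a minimal sketch; the thresholds mirror the Apache::LoadAvgLimit numbers quoted earlier, and the idea of "refusing work" is an assumption -- in practice the action might be pausing the web tier or rewriting pg_hba.conf, since PostgreSQL itself has no load-based maintenance mode:

```python
import os

# Illustrative thresholds, mirroring the Apache::LoadAvgLimit example.
LIMITS = (3.00, 2.00, 1.50)   # 1-, 5-, and 15-minute load average caps

def over_limit(loadavg, limits=LIMITS):
    """True if any load average exceeds its corresponding threshold."""
    return any(avg > cap for avg, cap in zip(loadavg, limits))

def should_accept_work():
    """Poll the system load and decide whether to take on new queries."""
    return not over_limit(os.getloadavg())
```

A cron job or daemon loop could call should_accept_work() every few seconds and flip whatever throttle the site uses, much as Apache::LoadAvgLimit returns 503s until the averages fall back under the caps.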
Martin Foster
Creator/Designer Ethereal Realms
[EMAIL PROTECTED]
Re: [PERFORM] Restricting Postgres
Martin Foster wrote:
> Simon Riggs wrote:
>> On Tue, 2004-11-02 at 23:52, Martin Foster wrote:
> [...]
> I've seen this behavior before when restarting the web server during heavy loads. Apache goes from zero connections to a solid 120, causing PostgreSQL to spawn that many children in short order just to keep up with the demand.

But wouldn't limiting the number of concurrent connections do this at the source? If you tell it that you can have at most 20 connections, you would never have postgres spawn 120 children. I'm not sure what apache does if it can't get a DB connection, but it seems exactly like what you want.

Now, if you expected to have 50 clients that all like to just sit on open connections, you could leave the number of concurrent connections high. But if your only connections are from the webserver, where all of them are designed to be short-lived, then leave the max low.

The other possibility is having the webserver use connection pooling, so it uses a few long-lived connections. But even then, you could limit it to something like 10-20, not 120.

John =:-
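John's point -- cap connections at the pool rather than letting Apache fan out to the database -- can be sketched with a semaphore-bounded pool. This is a toy illustration, not how Apache::DBI or pgpool are implemented; the connect argument is a placeholder factory standing in for a real database connection:

```python
import threading
from contextlib import contextmanager

class BoundedPool:
    """Hand out at most max_conn concurrent connections; callers beyond
    the cap block here instead of forcing postgres to fork new backends."""
    def __init__(self, connect, max_conn=20):
        self._connect = connect            # placeholder connection factory
        self._sem = threading.Semaphore(max_conn)

    @contextmanager
    def connection(self):
        self._sem.acquire()                # wait if the pool is exhausted
        try:
            yield self._connect()
        finally:
            self._sem.release()
```

The design choice is that a restart surge of 120 web children turns into at most max_conn simultaneous backends plus some queued waiters, which is exactly the spike-flattening behavior discussed in this thread.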
Re: [PERFORM] Restricting Postgres
John A Meinel wrote:
> But wouldn't limiting the number of concurrent connections do this at the source? If you tell it that you can have at most 20 connections, you would never have postgres spawn 120 children.
> [...]

I have a dual-processor system that can support over 150 concurrent connections handling normal traffic and load. Now suppose I set up Apache to spawn all of its children instantly: the PostgreSQL server will also receive 150 connection attempts. This will spawn 150 children in short order, and while this takes place clients can connect and start requesting information, not allowing the machine to settle down to normal traffic. That spike, when initiated, can cripple the machine or even the webserver if a deadlocked transaction is introduced.

That's because on the webserver side a slowdown in the database means it will get that many more connection attempts pooled from the clients. As they keep clicking and hitting reload over and over to get a page to load, that server starts to buckle, hitting unbelievably high load averages.
When the above happened once, I lost the ability to type on a console because of a 60+ (OpenBSD) load average on a single-processor system. That is the reason why Apache now drops a 503 Service Unavailable when loads get too high.

It's that spike I worry about, and it can happen for whatever reason. It could just as easily be triggered by a massive concurrent request for processing of an expensive query done in DDoS fashion. This may not affect the webserver at all, at least immediately, but the same effect can come into play. Limiting connections helps, but it's not a silver bullet, and it limits your ability to support more connections because of that initial spike. The penalty for forking a new child is hardly unexpected; even Apache will show the same effect when restarted at a high-traffic time.

Martin Foster
Creator/Designer Ethereal Realms
[EMAIL PROTECTED]