On Tue, 2004-11-02 at 23:52, Martin Foster wrote:
Is there a way to restrict how much load a PostgreSQL server can take before dropping queries in order to safeguard the server? I was looking at the login.conf (5) man page and while it allows me to limit by processor time this seems to not fit my specific needs.
Essentially, I am looking for a sort of functionality similar to what Sendmail and Apache have. Once the load of the system reaches a certain defined limit the daemon drops tasks until such a time that it can resume normal operation.
Sounds great... could you give more shape to the idea, so people can comment on it?
What limit? Measured how? Normal operation is what?
Drop what? How to tell?
Let's use the example in Apache, there is the Apache::LoadAvgLimit mod_perl module which allows one to limit based on the system load averages. Here is an example of the configuration one would find:
<Location /perl> PerlInitHandler Apache::LoadAvgLimit PerlSetVar LoadAvgLimit_1 3.00 PerlSetVar LoadAvgLimit_5 2.00 PerlSetVar LoadAvgLimit_15 1.50 PerlSetVar LoadAvgRetryAfter 120 </Location>
The end state is simple, once the load average moves above 3.00 for the 1 minute average the web server will not process the CGI scripts or mod_perl applications under that directory. Instead it will return a 503 error and save the system from being crushed by ever increasing load averages.
Only once the load average is below the defined limits will the server process requests as normal. This is not necessarily the nicest or cleanest way or doing things, but it does allow the Apache web server to prevent a collapse.
There are ways of restricting the size of files, number of concurrent processes and even memory being used by a daemon. This can be done through ulimit or the login.conf file if your system supports it. However, there is no way to restrict based on load averages, only processor time which is ineffective for a perpetually running daemon like PostgreSQL has.
While not necessarily common on my servers I have witnessed some fairly high load averages which may have led to the machine dropping outright. Any help on this matter would be appreciated.
You can limit the number of connections overall?
Limiting concurrent connections is not always the solution to the problem. Problems can occur when there is a major spike in activity that would be considered abnormal, due to outside conditions.
For example using Apache::DBI or pgpool the DBMS may be required to spawn a great deal of child processed in a short order of time. This in turn can cause a major spike in processor load and if unchecked by running as high demand queries the system can literally increase in load until the server buckles.
I've seen this behavior before when restarting the web server during heavy loads. Apache goes from zero connections to a solid 120, causing PostgreSQL to spawn that many children in a short order of time just to keep up with the demand.
PostgreSQL undertakes a penalty when spawning a new client and accepting a connection, this slows takes resources at every level to accomplish. However clients on the web server are hitting the server at an accelerated rate because of the slowed response, leading to even more demand being placed on both machines.
In most cases the processor will be taxed and the load average high enough to cause even a noticeable delay when using a console, however it will generally recover... slowly or in rare cases crash outright. In such a circumstance, having the database server refuse queries when the sanity of the system is concerned might come in handy for such a circumstance.
Of course, I am not blaming PostgreSQL, there are probably some instabilities in the AMD64 port of FreeBSD 5.2.1 for dual processor systems that lead to an increased chance of failure instead of recovery. However, if there was a way to prevent the process from reaching those limits, it may avoid the problem altogether.
Martin Foster Creator/Designer Ethereal Realms [EMAIL PROTECTED]
---------------------------(end of broadcast)--------------------------- TIP 5: Have you checked our extensive FAQ?