[PERFORM] 100 simultaneous connections, critical limit?
I am writing a website that will probably get some traffic. Right now I wrap every .php page in pg_connect() and pg_close(). Then I read somewhere that Postgres only supports 100 simultaneous connections (the default). Is that a hard limitation? Should I use some other method when writing code for a high-traffic website?

J.

---(end of broadcast)---
TIP 9: the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match
Re: [PERFORM] 100 simultaneous connections, critical limit?
On Wednesday 14 January 2004 18:18, Jón Ragnarsson wrote:
> I am writing a website that will probably get some traffic. Right now I
> wrap every .php page in pg_connect() and pg_close(). Then I read
> somewhere that Postgres only supports 100 simultaneous connections
> (default). Is that a limitation? Should I use some other method when
> writing code for a high-traffic website?

Yes, you should investigate connection pooling. I am no PHP expert, but this could probably help you:
http://www.php.net/manual/en/function.pg-pconnect.php

Shridhar
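A minimal sketch of the persistent-connection approach the manual page describes. The connection parameters and query are placeholders, not anything from the original application:

```php
<?php
// pg_pconnect() reuses an already-open connection with the same
// connection string if this Apache/PHP process has one; otherwise
// it opens a new one. Parameters below are illustrative only.
$conn = pg_pconnect("host=localhost dbname=mydb user=web password=secret");
if ($conn === false) {
    die("Could not connect to Postgres");
}

$result = pg_query($conn, "SELECT now()");
$row = pg_fetch_row($result);
echo $row[0];

// Note: no pg_close() here -- the connection stays open so the next
// request served by this Apache child can reuse it.
?>
```

Because reuse is keyed on the connection string, using the exact same string everywhere in the application matters; two different strings in the same process mean two persistent connections.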
Re: [PERFORM] 100 simultaneous connections, critical limit?
Clinging to sanity, [EMAIL PROTECTED] (Jón Ragnarsson) mumbled into her beard:
> I am writing a website that will probably have some traffic. Right now
> I wrap every .php page in pg_connect() and pg_close(). Then I read
> somewhere that Postgres only supports 100 simultaneous connections
> (default). Is that a limitation? Should I use some other method when
> writing code for a high-traffic website?

I thought the out-of-the-box default was 32.

If you honestly need a LOT of connections, you can configure the database to support more. I upped the limit on one system to 512 the other week; certainly supportable, if you have the RAM for it.

It is, however, quite likely that the connect()/close() cuts down on the efficiency of your application. If PHP supports some form of connection pooling, you should consider using that, as it will cut down _dramatically_ on the amount of work done establishing/closing connections, and should let your apps use somewhat fewer connections more effectively.

--
(format nil [EMAIL PROTECTED] cbbrowne acm.org)
http://cbbrowne.com/info/linux.html
It has been said that man is a rational animal. All my life I have been searching for evidence which could support this. -- Bertrand Russell
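Raising the limit as described above is a one-line change in postgresql.conf. The values here are illustrative for a PostgreSQL of that era (defaults varied between versions, hence the 32-vs-100 question), and each allowed connection costs some RAM:

```
# postgresql.conf -- values illustrative, check your version's defaults
max_connections = 512    # up from the out-of-the-box default
shared_buffers = 1024    # older versions required at least
                         # 2 * max_connections buffers
```

A restart of the postmaster is needed for max_connections to take effect.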
Re: [PERFORM] 100 simultaneous connections, critical limit?
Ok, connection pooling was the thing that I thought of first, but I haven't found any docs regarding pooling with PHP+Postgres. OTOH, I designed the application to be as independent from the DB as possible. (No stored procedures or other Postgres-specific stuff.)

Thanks,
J.

Christopher Browne wrote:
> It is, however, quite likely that the connect()/close() cuts down on
> the efficiency of your application. If PHP supports some form of
> connection pooling, you should consider using that, as it will cut
> down _dramatically_ on the amount of work done establishing/closing
> connections, and should let your apps use somewhat fewer connections
> more effectively.
Re: [PERFORM] 100 simultaneous connections, critical limit?
-Original Message-
From: [EMAIL PROTECTED] [mailto:pgsql-performance- [EMAIL PROTECTED] On Behalf Of Jón Ragnarsson
Sent: 14 January 2004 13:44
Cc: [EMAIL PROTECTED]
Subject: Re: [PERFORM] 100 simultaneous connections, critical limit?

> Ok, connection pooling was the thing that I thought of first, but I
> haven't found any docs regarding pooling with PHP+Postgres. OTOH, I
> designed the application to be as independent from the DB as possible.
> (No stored procedures or other Postgres-specific stuff.)

As far as I know PHP supports persistent connections to a PG database; see pg_pconnect instead of pg_connect. Each db connection is tied to a particular Apache process and will stay open for the life of that process. So basically make sure your Apache config file (httpd.conf) and PG config file (postgresql.conf) agree on the maximum number of connections, otherwise some pages will not be able to connect to your database. This may not be a problem for small sites, but on large sites it is, with heavy loads and large numbers of concurrent users. For example, consider a site that must support 500 concurrent connections. If persistent connections are used, at least 500 concurrent connections to PG would be required, which I guess is probably not recommended.

The way I would like Apache/PHP to work is to have a global pool of connections to a postgres server, which can be shared among all Apache processes. This pool could be limited to, say, 50 or 100 connections. Problems occur under peak load where all 500 concurrent connections are in use, but all that should happen is a bit of a delay.

Hope that (almost) makes sense,
Kind Regards,

Nick Barr
WebBased Ltd.
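Making httpd.conf and postgresql.conf "agree", as suggested above, comes down to two settings. A sketch for Apache 1.3 with persistent connections (the exact numbers are placeholders):

```
# httpd.conf (Apache 1.3): cap the number of children, and thereby
# the number of persistent DB connections one web server can hold
MaxClients 150

# postgresql.conf: allow at least one connection per Apache child,
# plus a little headroom for superuser/maintenance sessions
max_connections = 160
```

If MaxClients exceeds max_connections, some Apache children will fail to get a database connection under full load, which is exactly the failure mode described above.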
Re: [PERFORM] 100 simultaneous connections, critical limit?
On Wed, 14 Jan 2004, Jón Ragnarsson wrote:
> I am writing a website that will probably have some traffic. Right now
> I wrap every .php page in pg_connect() and pg_close(). Then I read
> somewhere that Postgres only supports 100 simultaneous connections
> (default). Is that a limitation? Should I use some other method when
> writing code for a high-traffic website?

A few tips from an old PHP/Apache/Postgresql developer.

1: Avoid pg_pconnect unless you are certain you have load tested the system and it will behave properly. pg_pconnect often creates as many issues as it solves.

2: While PHP has pretty mediocre run-time performance, its startup / shutdown / cleanup is quite fast, and it caches previously executed pages. Thus, if your pages are relatively small, code-wise, the time it takes to execute them will be quite small compared with the time the user spends reading the output. So you can likely handle many hundreds of users before hitting any limit on the database end.

3: Apache can only run so many children, too. The default for the 1.3 branch is 150. If you decrease that to 50 or so, you are quite unlikely to ever run out of connections to the database.

4: Postgresql can handle thousands of connections if the server and postgresql itself are properly configured, so don't worry so much about that. You can always increase the max should you need to later.

5: Database connection time in a PHP script is generally a non-issue. pg_connect on a fast machine, hitting a local pgsql database, generally runs in about 1/10,000th of a second; persistent connects get this down to about 1/1,000,000th of a second. Either way, a typical script takes milliseconds to run, i.e. 1/100th of a second or longer, so the actual difference between pg_pconnect and pg_connect just isn't worth worrying about in 99% of all circumstances.

6: Profile your users' actions and the time the server takes versus how long it takes them to make the next click. Even the fastest user is usually much slower than your server, so it takes a whole bunch of them to start bogging the system down.

7: Profile your machine under parallel load. Note that machine simos (i.e. the kind you get from the ab utility) generally represent about 10 to 20 real people. I.e. if your machine runs well with 20 machine simos, you can bet on it handling 100 or more real people with ease.
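Parallel load of the kind tip 7 describes can be generated with ab (ApacheBench), which ships with Apache. The URL and numbers below are placeholders:

```
# 1000 requests total, 20 concurrent "machine simos"
ab -n 1000 -c 20 http://localhost/index.php
```

The requests-per-second and latency percentiles ab reports give a rough ceiling; per the 10-to-20x rule of thumb above, surviving 20 concurrent simos suggests headroom for on the order of a couple hundred real users.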
Re: [PERFORM] COUNT Pagination
scott.marlowe wrote:
> On Tue, 13 Jan 2004, David Shadovitz wrote:
>> We avert the subsequent execution of count(*) by passing the value of
>> count(*) as a query parameter through the link in page numbers.
>> Mallah, and others who mentioned caching the record count: Yes, I will
>> certainly do this. I can detect whether the query's filter has been
>> changed, or whether the user is merely paging through the results or
>> sorting* the results. I'd love to completely eliminate the cost of the
>> COUNT(*) query, but I guess that I cannot have everything.
>> * My HTML table column headers are hyperlinks which re-execute the
>> query, sorting the results by the selected column. The first click
>> does an ASC sort; a second click does a DESC sort.
> Another useful trick is to have your script save out the count(*)
> result in a single-row table with a timestamp, and every time you grab
> it, check whether x minutes have passed; if so, update that row with a
> fresh count(*). You can even have a cron job do it so your own scripts
> don't incur the cost of the count(*) and delay output to the user.

Greetings! The count(*) can get evaluated with any arbitrary combination in the where clause; how do you plan to store that information? In a typical application, pagination could be required in any number of contexts. I would be interested to know more about this trick and its applicability in such situations.

Offtopic: Does PostgreSQL optimise repeated execution of similar queries, i.e. queries on the same table or set of tables (in a join) with the same where clause and differing only in LIMIT and OFFSET? I don't know much about MySQL; is their "Query Cache" achieving better results in such cases, and do we have anything similar in PostgreSQL? I think the most recently accessed tables get loaded into shared buffers in PostgreSQL anyway, so they are not read from disk. But is the "Query Cache" really different from this? Can anyone who knows a little more about the workings of MySQL's query cache throw some light?

Regds
Mallah.
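One way the single-row-cache trick could look in PHP. The table, column, and function names here are made up for illustration, and the cached table (`items`) is hypothetical:

```php
<?php
// Assumes a cache table created once, e.g.:
//   CREATE TABLE row_count_cache (n bigint, updated timestamptz);
//   INSERT INTO row_count_cache VALUES (0, '1970-01-01');
function cached_count($conn, $max_age_minutes = 5) {
    $res = pg_query($conn,
        "SELECT n FROM row_count_cache
         WHERE updated > now() - interval '$max_age_minutes minutes'");
    if (pg_num_rows($res) == 1) {
        $row = pg_fetch_row($res);
        return (int)$row[0];   // cache is still fresh
    }
    // Stale: recount and store. A cron job could do this step instead,
    // as suggested above, so no user request ever pays for the COUNT.
    pg_query($conn,
        "UPDATE row_count_cache
         SET n = (SELECT count(*) FROM items), updated = now()");
    $row = pg_fetch_row(pg_query($conn, "SELECT n FROM row_count_cache"));
    return (int)$row[0];
}
?>
```

As Mallah points out, this only helps when the WHERE clause is fixed; supporting arbitrary filters would need one cache row per distinct filter combination, which quickly stops paying off.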
Re: [PERFORM] 100 simultaneous connections, critical limit?
Hi!

scott.marlowe wrote:
> A few tips from an old PHP/Apache/Postgresql developer.
> 1: Avoid pg_pconnect unless you are certain you have load tested the
> system and it will behave properly. pg_pconnect often creates as many
> issues as it solves.

My experience with persistent connections in PHP is quite similar to Scott Marlowe's. There are some nasty effects if something is not working. The most harmless results probably come from unclosed transactions, which will result in a warning, as PHP seems to always send a BEGIN; ROLLBACK; before reusing a connection.

AA> I share the above view. I've had little success with persistent
AA> connections. The cost of pg_connect is minimal; pg_pconnect is not a
AA> viable solution IMHO. Connections are rarely actually reused.

Still I think it's a good way to speed things up. The gain is probably not so much the connection time saved in PHP as the general saving of processor time: spawning a new backend process can be a very expensive operation, and if it happens often, it adds up. Perhaps it's only a memory-for-CPU-time trade. My persistent connections get very evenly used, no matter whether there are 2 or 10; the CPU usage is distributed very evenly across them.

Christoph Nelles

--
Kind regards,
Evil Azrael    mailto:[EMAIL PROTECTED]
Re: [PERFORM] 100 simultaneous connections, critical limit?
On Wed, 14 Jan 2004, Adam Alkins wrote:
> scott.marlowe wrote:
>> A few tips from an old PHP/Apache/Postgresql developer.
>> 1: Avoid pg_pconnect unless you are certain you have load tested the
>> system and it will behave properly. pg_pconnect often creates as many
>> issues as it solves.
> I share the above view. I've had little success with persistent
> connections. The cost of pg_connect is minimal; pg_pconnect is not a
> viable solution IMHO. Connections are rarely actually reused.

I've found that for best performance with pg_pconnect, you need to restrict the Apache server to a small number of backends, say 40 or 50, extend keep-alive to 60 or so seconds, and use the same exact connection string all over the place. Also, set the maximum persistent connections setting in php.ini to 1 or 2. Note that this limit is PER BACKEND, not total, so 1 or 2 should be enough for most types of apps; 3 tops. Then set up postgresql for 200 connections, so you'll never run out. 'Tis better to waste a little shared memory and be safe than to get the dreaded "out of connections" error from postgresql.

If you do all of the above, pg_pconnect can work pretty well on things like dedicated app servers where only one thing is being done and it's being done a lot. On general-purpose servers with 60 databases and 120 applications it adds little, although extending the keep-alive timeout helps. But if you just start using pg_pconnect without reconfiguring and then testing, it's quite likely your site will topple over under load with out-of-connection errors.
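The recipe above, expressed as config fragments. The numbers follow the text; the php.ini directive names are the standard pgsql extension settings of that era:

```
; php.ini -- persistent-connection limits are per Apache child,
; not totals across the server
pgsql.allow_persistent = On
pgsql.max_persistent   = 2

# httpd.conf (Apache 1.3) -- few children, long keep-alive
MaxClients 50
KeepAlive On
KeepAliveTimeout 60

# postgresql.conf -- generous headroom so the "out of connections"
# error can never be hit (50 children x 2 connections = 100 max used)
max_connections = 200
```

With these settings the worst case is MaxClients times pgsql.max_persistent connections, which stays comfortably below max_connections.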
Re: [PERFORM] COUNT Pagination
On Wed, 14 Jan 2004, Rajesh Kumar Mallah wrote:
> Offtopic: Does PostgreSQL optimise repeated execution of similar
> queries, i.e. queries on the same table or set of tables (in a join)
> with the same where clause and differing only in LIMIT and OFFSET?

Yes, and no. Yes, a previously run query should be faster, if its data fits in the kernel cache. No, Postgresql doesn't cache any previous results or plans (unless you use prepare/execute, in which case it caches only the plan, not the query results). Plus, the design of Postgresql is such that it would have to do a LOT of cache checking to see whether there were any updates to the underlying data between selects. Since such queries are unlikely to be repeated inside a transaction, the only place where you wouldn't have to check for new tuples, it's not really worth trying to implement.

Keep in mind most databases can use an index for max() because each aggregate is programmed by hand to do one thing. In Postgresql you can create your own aggregates, and since there's no simple way to make aggregates use indexes in the general sense, it's not likely to get optimized. I.e. any optimization for JUST max()/min() is unlikely unless it can be used for the other aggregates.
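The prepare/execute plan caching mentioned above looks like this at the SQL level (PREPARE exists from PostgreSQL 7.3 on; the table and query are illustrative):

```sql
-- Plan the pagination query once per session...
PREPARE page_query(int, int) AS
    SELECT * FROM items ORDER BY id LIMIT $1 OFFSET $2;

-- ...then each page fetch reuses the stored plan; only the result
-- rows are recomputed, never the planning work.
EXECUTE page_query(20, 0);    -- page 1
EXECUTE page_query(20, 20);   -- page 2
```

Prepared statements live only for the lifetime of the session, which is one reason they combine naturally with persistent or pooled connections.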
Re: [PERFORM] 100 simultaneous connections, critical limit?
> 7: Profile your machine under parallel load. Note that machine simos
> (i.e. the kind you get from the ab utility) generally represent about
> 10 to 20 real people. I.e. if your machine runs well with 20 machine
> simos, you can bet on it handling 100 or more real people with ease.

8: Use Turck MMCache - it rocks. It works absolutely perfectly and caches compiled versions of all your PHP scripts; it cut the load on our server by a factor of 5.
Re: [PERFORM] 100 simultaneous connections, critical limit?
On Thu, 2004-01-15 at 01:48, Jón Ragnarsson wrote:
> I am writing a website that will probably have some traffic. Right now
> I wrap every .php page in pg_connect() and pg_close(). Then I read
> somewhere that Postgres only supports 100 simultaneous connections
> (default). Is that a limitation? Should I use some other method when
> writing code for a high-traffic website?

Whether the overhead of pg_connect()/pg_close() has a noticeable effect on your application depends on what you do in between them. TBH I never do that second one myself - PHP will close the connection when the page is finished.

I have developed some applications which try to be as fast as possible, and for these I either use pg_pconnect, so that there is one DB connection per Apache process, or I use DBBalancer, where you have a pool of connections: pg_connect _actually_ connects to DBBalancer in a very low-overhead manner, and DBBalancer maintains the pool of real connections out the back. I am the Debian package maintainer for DBBalancer.

You may also want to consider differentiating based on whether the application is writing to the database or not. Pooling and persistent connections can give weird side-effects if transaction scoping is bollixed in the application: a second page view re-using an earlier connection which was serving a different page could find itself in the middle of an unexpected transaction. Temp tables are one thing that can bite you here.

There are a few database pooling solutions out there. Using pg_pconnect is the simplest of these; DBBalancer fixes some of its issues, and others go further still. Another point to consider is that database pooling will give you the biggest performance increase if your queries all return small datasets. If you return large datasets, it can potentially make things worse (depending on implementation) through double-handling of the data.

As others have said too: 100 is just a configuration setting in postgresql.conf, not an implemented limit.

Cheers,
Andrew McMillan.

-
Andrew @ Catalyst .Net .NZ Ltd, PO Box 11-053, Manners St, Wellington
WEB: http://catalyst.net.nz/    PHYS: Level 2, 150-154 Willis St
DDI: +64(4)916-7201  MOB: +64(21)635-694  OFFICE: +64(4)499-2267
How many things I can do without! - Socrates