[PERFORM] 100 simultaneous connections, critical limit?

2004-01-14 Thread Jón Ragnarsson
I am writing a website that will probably have some traffic.
Right now I wrap every .php page in pg_connect() and pg_close().
Then I read somewhere that Postgres only supports 100 simultaneous
connections (default). Is that a limitation? Should I use some other
method when writing code for a high-traffic website?
J.




Re: [PERFORM] 100 simultaneous connections, critical limit?

2004-01-14 Thread Shridhar Daithankar
On Wednesday 14 January 2004 18:18, Jón Ragnarsson wrote:
 I am writing a website that will probably have some traffic.
 Right now I wrap every .php page in pg_connect() and pg_close().
 Then I read somewhere that Postgres only supports 100 simultaneous
 connections (default). Is that a limitation? Should I use some other
 method when writing code for high-traffic website?

Yes. You should rather investigate connection pooling.

I am no php expert but probably this could help you..

http://www.php.net/manual/en/function.pg-pconnect.php
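
For what it's worth, a minimal sketch of the difference between the two
calls (the connection parameters here are placeholders, not from the thread):

  <?php
  // Non-persistent: a new backend is started for every page view.
  $c1 = pg_connect("host=localhost dbname=mydb user=www");
  // ... run queries ...
  pg_close($c1);

  // Persistent: the connection is kept open by the Apache/PHP process
  // and handed back on the next request that uses the same parameters.
  $c2 = pg_pconnect("host=localhost dbname=mydb user=www");
  // ... run queries ...
  // no pg_close() needed; the connection outlives the script
  ?>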

 Shridhar





Re: [PERFORM] 100 simultaneous connections, critical limit?

2004-01-14 Thread Christopher Browne
Clinging to sanity, [EMAIL PROTECTED] (Jón Ragnarsson) mumbled into her beard:
 I am writing a website that will probably have some traffic.
 Right now I wrap every .php page in pg_connect() and pg_close().
 Then I read somewhere that Postgres only supports 100 simultaneous
 connections (default). Is that a limitation? Should I use some other
 method when writing code for high-traffic website?

I thought the out-of-the-box default was 32.

If you honestly need a LOT of connections, you can configure the
database to support more.  I upped the limit on one system to 512 the
other week; certainly supportable, if you have the RAM for it.

It is, however, quite likely that the connect()/close() cuts down on
the efficiency of your application.  If PHP supports some form of
connection pooling, you should consider using that, as it will cut
down _dramatically_ on the amount of work done establishing/closing
connections, and should let your apps use somewhat fewer connections
more effectively.
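
As a quick check of what the server is actually configured for, something
like this works from PHP (connection parameters are placeholders;
max_connections is the postgresql.conf setting behind the limit being
discussed):

  <?php
  // Ask the server for its configured connection limit.
  $conn = pg_connect("host=localhost dbname=mydb user=www"); // placeholder params
  $res  = pg_query($conn, "SHOW max_connections");
  echo "max_connections = " . pg_fetch_result($res, 0, 0) . "\n";
  pg_close($conn);
  ?>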
-- 
(format nil [EMAIL PROTECTED] cbbrowne acm.org)
http://cbbrowne.com/info/linux.html
It has been said  that man is a rational animal.  All  my life I have
been searching for evidence which could support this.
-- Bertrand Russell



Re: [PERFORM] 100 simultaneous connections, critical limit?

2004-01-14 Thread Jón Ragnarsson
Ok, connection pooling was the thing that I thought of first, but I 
haven't found any docs regarding pooling with PHP+Postgres.
OTOH, I designed the application to be as independent from the DB as 
possible. (No stored procedures or other Postgres specific stuff)
Thanks,
J.

Christopher Browne wrote:

Clinging to sanity, [EMAIL PROTECTED] (Jón Ragnarsson) mumbled into her beard:

I am writing a website that will probably have some traffic.
Right now I wrap every .php page in pg_connect() and pg_close().
Then I read somewhere that Postgres only supports 100 simultaneous
connections (default). Is that a limitation? Should I use some other
method when writing code for high-traffic website?


I thought the out-of-the-box default was 32.

If you honestly need a LOT of connections, you can configure the
database to support more.  I upped the limit on one system to have
512 the other week; certainly supportable, if you have the RAM for it.
It is, however, quite likely that the connect()/close() cuts down on
the efficiency of your application.  If PHP supports some form of
connection pooling, you should consider using that, as it will cut
down _dramatically_ on the amount of work done establishing/closing
connections, and should let your apps use somewhat fewer connections
more effectively.




Re: [PERFORM] 100 simultaneous connections, critical limit?

2004-01-14 Thread Nick Barr
 -Original Message-
 From: [EMAIL PROTECTED] [mailto:pgsql-performance-[EMAIL PROTECTED]] On Behalf Of Jón Ragnarsson
 Sent: 14 January 2004 13:44
 Cc: [EMAIL PROTECTED]
 Subject: Re: [PERFORM] 100 simultaneous connections, critical limit?
 
 Ok, connection pooling was the thing that I thought of first, but I
 haven't found any docs regarding pooling with PHP+Postgres.
 OTOH, I designed the application to be as independent from the DB as
 possible. (No stored procedures or other Postgres specific stuff)
 Thanks,
 J.

As far as I know PHP supports persistent connections to a PG database.
See pg_pconnect instead of pg_connect. Each of the db connections is
tied to a particular Apache process and will stay open for the life of
that process. So basically make sure your Apache config file
(httpd.conf) and PG config file (postgresql.conf) agree on the maximum
number of connections, otherwise some pages will not be able to connect
to your database.

This may not be a problem for small sites, but it is for large sites
with heavy loads and large numbers of concurrent users. For example,
consider a site that must support 500 concurrent users. If persistent
connections are used, at least 500 concurrent connections to PG would
be required, which I guess is probably not recommended.
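
To make the arithmetic concrete: with pg_pconnect each Apache child holds
its own connection(s), keyed on the connection string, so roughly
MaxClients (httpd.conf) times connections-per-child must stay below
max_connections (postgresql.conf). A minimal sketch of the usual pattern
(names and parameters are placeholders, not from the thread):

  <?php
  // One shared helper so every page uses exactly the same connection
  // string; otherwise each Apache child can end up holding several
  // persistent backends at once.
  define('DB_CONN', 'host=localhost dbname=mydb user=www');  // placeholder

  function db()
  {
      static $conn = null;
      if ($conn === null) {
          $conn = pg_pconnect(DB_CONN);
      }
      return $conn;
  }
  ?>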

The way I would like Apache/PHP to work is to have a global pool of
connections to a postgres server, shared among all Apache processes.
This pool could be limited to, say, 50 or 100 connections. Under peak
load, when all of the pooled connections are in use, all that should
happen is that requests see a bit of a delay.

Hope that (almost) makes sense,


Kind Regards,

Nick Barr
WebBased Ltd.


 Christopher Browne wrote:
 
  Clinging to sanity, [EMAIL PROTECTED] (Jón Ragnarsson) mumbled into her beard:
 
 I am writing a website that will probably have some traffic.
 Right now I wrap every .php page in pg_connect() and pg_close().
 Then I read somewhere that Postgres only supports 100 simultaneous
 connections (default). Is that a limitation? Should I use some other
 method when writing code for high-traffic website?
 
 
  I thought the out-of-the-box default was 32.
 
  If you honestly need a LOT of connections, you can configure the
  database to support more.  I upped the limit on one system to have
  512 the other week; certainly supportable, if you have the RAM for it.
 
  It is, however, quite likely that the connect()/close() cuts down on
  the efficiency of your application.  If PHP supports some form of
  connection pooling, you should consider using that, as it will cut
  down _dramatically_ on the amount of work done establishing/closing
  connections, and should let your apps use somewhat fewer connections
  more effectively.
 
 





Re: [PERFORM] 100 simultaneous connections, critical limit?

2004-01-14 Thread scott.marlowe
On Wed, 14 Jan 2004, Jón Ragnarsson wrote:

 I am writing a website that will probably have some traffic.
 Right now I wrap every .php page in pg_connect() and pg_close().
 Then I read somewhere that Postgres only supports 100 simultaneous
 connections (default). Is that a limitation? Should I use some other
 method when writing code for high-traffic website?

A few tips from an old PHP/Apache/Postgresql developer.

1: Avoid pg_pconnect unless you are certain you have load tested the 
system and it will behave properly.  pg_pconnect often creates as many 
issues as it solves.

2: While php has pretty mediocre run time performance, its startup / 
shutdown / cleanup is quite fast, and it caches previously executed 
pages.  Thus, if your pages are relatively small, code-wise, the time 
it takes to execute them will be tiny compared with the time the user 
spends reading the output.  So, you can likely handle many hundreds of 
users before hitting any limit on the database end.

3: Apache can only run so many children too.  The default for the 1.3 
branch is 150.  If you decrease that to 50 or so, you are quite unlikely 
to ever run out of connections to the database.

4: Postgresql can handle thousands of connections if the server and 
postgresql itself are properly configured, so don't worry so much about 
that.  You can always increase the max should you need to later.

5: Database connection time in a php script is generally a non-issue.  
pg_connect on a fast machine, hitting a local pgsql database, generally 
runs in about 1/10,000th of a second.  Persistent connects get this down 
to about 1/1,000,000th of a second.  Either way, a typical script takes 
milliseconds to run, i.e. 1/100th of a second or longer, so the actual 
difference between a pg_pconnect and a pg_connect just isn't worth 
worrying about in 99% of all circumstances.  (See the timing sketch 
after this list.)

6: Profile your users' actions and the time it takes the server versus how 
long it takes them to make the next click.  Even the fastest user is 
usually much slower than your server, so it takes a whole bunch of them to 
start bogging the system down.

7: Profile your machine under parallel load.  Note that machine simos 
(simulated users, i.e. the kind you get from the ab utility) generally 
represent about 10 to 20 real people.  I.e. if your machine runs well 
with 20 machine simos, you can bet on it handling 100 or more real 
people with ease.
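
A throwaway benchmark in the spirit of point 5 (connection parameters are
placeholders, microtime(true) is PHP 5 syntax, and the absolute numbers
will vary a lot by hardware):

  <?php
  $dsn = "host=localhost dbname=mydb user=www";   // placeholder

  // Fresh connection every time (PGSQL_CONNECT_FORCE_NEW stops PHP from
  // silently reusing the handle within this one script).
  $t0 = microtime(true);
  for ($i = 0; $i < 100; $i++) {
      $c = pg_connect($dsn, PGSQL_CONNECT_FORCE_NEW);
      pg_close($c);
  }
  $t1 = microtime(true);

  // Persistent connection: only the first iteration pays the setup cost.
  for ($i = 0; $i < 100; $i++) {
      $c = pg_pconnect($dsn);
  }
  $t2 = microtime(true);

  printf("pg_connect:  %.6f s per connection\n", ($t1 - $t0) / 100);
  printf("pg_pconnect: %.6f s per connection\n", ($t2 - $t1) / 100);
  ?>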




Re: [PERFORM] COUNT Pagination

2004-01-14 Thread Rajesh Kumar Mallah




scott.marlowe wrote:

  On Tue, 13 Jan 2004, David Shadovitz wrote:

  
  

  We avert the subsequent execution of count(*) by passing the
value of count(*) as a query parameter through the link in page
numbers.
  

Mallah, and others who mentioned caching the record count:

Yes, I will certainly do this.  I can detect whether the query's filter has 
been changed, or whether the user is merely paging through the results or 
sorting* the results.

I'd love to completely eliminate the cost of the COUNT(*) query, but I guess 
that I cannot have everything.

* My HTML table column headers are hyperlinks which re-execute the query, 
sorting the results by the selected column.  The first click does an ASC 
sort; a second click does a DESC sort.

  
  
another useful trick is to have your script save out the count(*) result 
in a single row table with a timestamp, and every time you grab it, check 
to see if x number of minutes have passed, and if so, update that row with 
a count(*). 


Greetings!

The count(*) can get evaluated with any arbitrary combination of 
conditions in the WHERE clause; how do you plan to store that information?

In a typical application, pagination could be required in any number
of contexts. I would be interested to know more about this trick 
and its applicability in such situations.
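
To make the trick concrete for arbitrary filters, one option is to key the
cached count on a hash of the WHERE clause itself. This is only a sketch of
how I read the suggestion; the count_cache table, the mytable name and the
freshness window are all made up, and the caller is assumed to pass an
already-validated filter string:

  <?php
  // Cache table, created once:
  //   CREATE TABLE count_cache (
  //       filter_hash text PRIMARY KEY,
  //       row_count   bigint NOT NULL,
  //       updated_at  timestamptz NOT NULL DEFAULT now()
  //   );
  function cached_count($conn, $where, $max_age_min = 10)
  {
      $hash = md5($where);   // one cache row per distinct filter
      $res = pg_query($conn, "SELECT row_count FROM count_cache
                              WHERE filter_hash = '$hash'
                                AND updated_at > now() - interval '$max_age_min minutes'");
      if ($res && pg_num_rows($res) == 1) {
          return (int) pg_fetch_result($res, 0, 0);   // fresh enough, reuse it
      }
      // Stale or missing: pay for one count(*) and remember the answer.
      $res   = pg_query($conn, "SELECT count(*) FROM mytable WHERE $where");
      $count = (int) pg_fetch_result($res, 0, 0);
      pg_query($conn, "DELETE FROM count_cache WHERE filter_hash = '$hash'");
      pg_query($conn, "INSERT INTO count_cache (filter_hash, row_count)
                       VALUES ('$hash', $count)");
      return $count;
  }
  ?>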

Offtopic:

Does PostgreSQL optimise repeated execution of similar queries, i.e.
queries on the same table or set of tables (in a join) with the same
WHERE clause and differing only in LIMIT and OFFSET?

I don't know much about MySQL. Does its "Query Cache" achieve
better results in such cases, and do we have anything similar in 
PostgreSQL? I think the most recently accessed tables get loaded 
into shared buffers in PostgreSQL anyway, so they are not read from
the disk. But is the "Query Cache" really different from this?
Can anyone who knows the workings of MySQL's query cache a little 
better throw some light on it?

Regds
Mallah.


   You can even have a cron job do it so your own scripts don't 
incur the cost of the count(*) and delay output to the user.




Re: [PERFORM] 100 simultaneous connections, critical limit?

2004-01-14 Thread Evil Azrael
Hi!




AA scott.marlowe wrote:

A few tips from an old PHP/Apache/Postgresql developer.

1: Avoid pg_pconnect unless you are certain you have load tested the
system and it will behave properly.  pg_pconnect often creates as many
issues as it solves.
  



My experience with persistent connections in PHP is quite similar to
Scott Marlowe's. There are some nasty effects if something is not
working. The most harmless of these are probably warnings from
transactions that were never closed, since PHP seems to always send a
BEGIN; ROLLBACK; when reusing a connection.



AA I share the above view. I've had little success with persistent 
AA connections. The cost of pg_connect is minimal, pg_pconnect is not a
AA viable solution IMHO. Connections are rarely actually reused.


Still, I think it's a good way to speed things up. The gain is probably
not so much the connection time saved in PHP as the general saving of
processor time: spawning a new backend process can be a very expensive
operation, and if it happens often, it adds up. Perhaps it's simply a
trade of memory for CPU time.

My persistent connections get used very evenly, no matter whether there
are 2 or 10; the CPU usage is distributed very equally among them.



Christoph Nelles


-- 
Kind regards,
Evil Azrael  mailto:[EMAIL PROTECTED]




Re: [PERFORM] 100 simultaneous connections, critical limit?

2004-01-14 Thread scott.marlowe
On Wed, 14 Jan 2004, Adam Alkins wrote:

 scott.marlowe wrote:
 
 A few tips from an old PHP/Apache/Postgresql developer.
 
 1: Avoid pg_pconnect unless you are certain you have load tested the 
 system and it will behave properly.  pg_pconnect often creates as many 
 issues as it solves.
   
 
 
 I share the above view. I've had little success with persistent 
 connections. The cost of pg_connect is minimal, pg_pconnect is not a 
 viable solution IMHO. Connections are rarely actually reused.

I've found that for best performance with pg_pconnect, you need to 
restrict the Apache server to a small number of backends, say 40 or 50, 
extend keep-alive to 60 or so seconds, and use the exact same connection 
string all over the place.  Also, set the maximum persistent connections 
setting (pgsql.max_persistent) in php.ini to 1 or 2.  Note that this 
limit is PER BACKEND, not total, so 1 or 2 should be enough for most 
types of apps.  3 tops.  Then, set up postgresql for 200 connections, so 
you'll never run out.  'Tis better to waste a little shared memory and be 
safe than it is to get the dreaded out-of-connections error from 
postgresql.

If you do all of the above, pg_pconnect can work pretty well, on things 
like dedicated app servers where only one thing is being done and it's 
being done a lot.  On general purpose servers with 60 databases and 120 
applications, it adds little, although extending the keep alive timeout 
helps.  

But if you just start using pg_pconnect without reconfiguring and then 
testing, it's quite likely your site will topple over under load with 
out-of-connections errors.




Re: [PERFORM] COUNT Pagination

2004-01-14 Thread scott.marlowe
On Wed, 14 Jan 2004, Rajesh Kumar Mallah wrote:

 scott.marlowe wrote:
 
 On Tue, 13 Jan 2004, David Shadovitz wrote:
 
   
 
 We avert the subsequent execution of count(*) by passing the
 value of count(*) as a query parameter through the link in page
 numbers.
   
 
 Mallah, and others who mentioned caching the record count:
 
 Yes, I will certainly do this.  I can detect whether the query's filter has 
 been changed, or whether the user is merely paging through the results or 
 sorting* the results.
 
 I'd love to completely eliminate the cost of the COUNT(*) query, but I guess 
 that I cannot have everything.
 
 * My HTML table column headers are hyperlinks which re-execute the query, 
 sorting the results by the selected column.  The first click does an ASC 
 sort; a second click does a DESC sort.
 
 
 
 another useful trick is to have your script save out the count(*) result 
 in a single row table with a timestamp, and every time you grab it, check 
 to see if x number of minutes have passed, and if so, update that row with 
 a count(*). 
 
 
 Greetings!
 
 The count(*) can get evaluated with any arbitrary combination of
 conditions in the WHERE clause; how do you plan to store that information?
 
 In a typical application, pagination could be required in any number
 of contexts. I would be interested to know more about this trick
 and its applicability in such situations.
 
 Offtopic:
 
 Does PostgreSQL optimise repeated execution of similar queries, i.e.
 queries on the same table or set of tables (in a join) with the same
 WHERE clause and differing only in LIMIT and OFFSET?

Yes, and no.

Yes, a previously run query should be faster, if its data fits in the 
kernel cache.  

No, Postgresql doesn't cache any previous results or plans (unless you use 
PREPARE / EXECUTE, and then it only caches the plan, not the query results).

Plus, the design of Postgresql is such that it would have to do a LOT of 
cache checking to see if there were any updates to the underlying data 
between selects.  Since such queries are unlikely to be repeated inside a 
transaction (the only place where you wouldn't have to check for new 
tuples), it's not really worth trying to implement.
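
For the LIMIT/OFFSET case specifically, PREPARE/EXECUTE at least lets you
reuse the plan across page views (a sketch; the table and column names are
invented, and parameters in LIMIT/OFFSET need a reasonably recent
PostgreSQL):

  <?php
  // Plan the pagination query once per connection...
  pg_query($conn, 'PREPARE page_q (int, int) AS
                   SELECT id, title FROM articles
                   ORDER BY id LIMIT $1 OFFSET $2');

  // ...then each page view only pays for execution, not planning.
  $page_size = 20;
  $page_no   = 3;   // e.g. taken from the request, after validation
  $res = pg_query($conn, sprintf('EXECUTE page_q (%d, %d)',
                                 $page_size, $page_size * $page_no));
  while ($row = pg_fetch_assoc($res)) {
      echo $row['id'], ': ', $row['title'], "\n";
  }
  ?>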

Keep in mind most databases can use an index for max() because each 
aggregate is programmed by hand to do one thing.  In Postgresql, you can 
create your own aggregates, and since there's no simple way to make 
aggregates use indexes in the general sense, it's not likely to get 
optimized.  I.e. any optimization for JUST max()/min() is unlikely 
unless it can be used for the other aggregates too.





Re: [PERFORM] 100 simultaneous connections, critical limit?

2004-01-14 Thread Christopher Kings-Lynne
7: Profile your machine under parallel load.  Note that machine simos 
(i.e. the kind you get from the ab utility) generally represent about 10 
to 20 real people.  I.e. if your machine runs well with 20 machine simos, 
you can bet on it handling 100 or more real people with ease.
8.  Use the Turck MMCache - it rocks.  Works absolutely perfectly and 
caches compiled versions of all your PHP scripts - cut the load on our 
server by a factor of 5.



Re: [PERFORM] 100 simultaneous connections, critical limit?

2004-01-14 Thread Andrew McMillan
On Thu, 2004-01-15 at 01:48, Jón Ragnarsson wrote:
 I am writing a website that will probably have some traffic.
 Right now I wrap every .php page in pg_connect() and pg_close().
 Then I read somewhere that Postgres only supports 100 simultaneous
 connections (default). Is that a limitation? Should I use some other
 method when writing code for high-traffic website?

Whether the overhead of pg_connect()/pg_close() has a noticeable effect
on your application depends on what you do in between them.  TBH I never
do that second one myself - PHP will close the connection when the page
is finished.

I have developed some applications which are trying to be
as-fast-as-possible and for which I either use pg_pconnect so you have
one DB connection per Apache process, or I use DBBalancer where you have
a pool of connections, and pg_connect is _actually_ connecting to
DBBalancer in a very low-overhead manner and you have a pool of
connections out the back.  I am the Debian package maintainer for
DBBalancer.

You may also want to consider differentiating based on whether the
application is writing to the database or not.  Pooling and persistent
connections can give weird side-effects if transaction scoping is
bollixed in the application - a second page view re-using an earlier
connection which was serving a different page could find itself in the
middle of an unexpected transaction.  Temp tables are one thing that can
bite you here.
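
For example, a page that reuses a persistent or pooled connection can
defensively roll back anything a previous page left open
(pg_transaction_status needs PHP 5 with a 7.4-or-newer client library;
the connection parameters are placeholders):

  <?php
  $conn = pg_pconnect('host=localhost dbname=mydb user=www');  // placeholder params

  // If the page that last used this connection died mid-transaction,
  // we would otherwise silently continue inside its transaction.
  $status = pg_transaction_status($conn);
  if ($status === PGSQL_TRANSACTION_INTRANS || $status === PGSQL_TRANSACTION_INERROR) {
      pg_query($conn, 'ROLLBACK');
  }
  ?>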

There are a few database pooling solutions out there. Using pg_pconnect
is the simplest of these, DBBalancer fixes some of its issues, and
others go further still.

Another point to consider is that database pooling will give you the
biggest performance increase if your queries are all returning small
datasets.  If you return large datasets it can potentially make things
worse (depending on implementation) through double-handling of the data.

As others have said too: 100 is just a configuration setting in
postgresql.conf - not an implemented limit.

Cheers,
Andrew McMillan.
-
Andrew @ Catalyst .Net .NZ  Ltd,  PO Box 11-053,  Manners St,  Wellington
WEB: http://catalyst.net.nz/ PHYS: Level 2, 150-154 Willis St
DDI: +64(4)916-7201   MOB: +64(21)635-694  OFFICE: +64(4)499-2267
  How many things I can do without! - Socrates
-
