[PERFORM] profiling postgresql queries?
hi, i'm not totally sure i should ask on this mailing list - so if you think i should better ask someplace else, please let me know.

the problem i have is that specific queries (inserts and updates) take a long time to run. of course i do vacuum analyze frequently, and i also use explain analyze on queries. the problem is that both the inserts and updates operate on heavily-triggered tables. and it made me wonder - is there a way to tell how much of the backend's time was spent on triggers, index updates and so on? like:

total query time: 1 second
trigger a: 0.50 second
trigger b: 0.25 second
index update: 0.1 second

something like this. is it possible? will it ever be possible?

hubert

-- 
hubert lubaczewski
Network Operations Center
eo Networks Sp. z o.o.

---(end of broadcast)---
TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED]
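One crude workaround on released versions is to time the statement with and without a given trigger, attributing the difference to that trigger. Since DDL is transactional in PostgreSQL, the whole experiment can be rolled back. This is only a sketch; the table and trigger names are hypothetical:

```sql
-- Rough per-trigger timing on released versions.  Everything here is
-- rolled back at the end, including the DROP TRIGGER.
BEGIN;
EXPLAIN ANALYZE UPDATE heavy_table SET x = x + 1;  -- total time, all triggers firing
DROP TRIGGER trigger_a ON heavy_table;
EXPLAIN ANALYZE UPDATE heavy_table SET x = x + 1;  -- time without trigger_a
ROLLBACK;  -- undoes both updates and restores the trigger
```

Note that the second run benefits from caching by the first, so treat the difference as an estimate, not a measurement.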
Re: [PERFORM] 4 way JOIN using aliases
-----Original Message-----
From: Keith Worthington [mailto:[EMAIL PROTECTED]]
Sent: Monday, April 11, 2005 7:44 PM
To: Neil Conway
Cc: PostgreSQL Perform
Subject: Re: [PERFORM] 4 way JOIN using aliases

> Neil Conway wrote:
>> Keith Worthington wrote:
>>> - Seq Scan on tbl_current (cost=0.00..1775.57 rows=76457 width=31) (actual time=22.870..25.024 rows=605 loops=1)
>>
>> This rowcount is way off -- have you run ANALYZE recently?
> [...]
> I run vacuumdb with the analyze option every morning via a cron job. In my ignorance I do not know if that is the same thing.

Pass it an --analyze option if you aren't already.

__
David B. Held
Software Engineer/Array Services Group
200 14th Ave. East, Sartell, MN 56377
320.534.3637 320.253.7800 800.752.8129
Re: [PERFORM] profiling postgresql queries?
hubert lubaczewski [EMAIL PROTECTED] writes:
> and it made me wonder - is there a way to tell how much time of backend was spent on triggers, index updates and so on?

In CVS tip, EXPLAIN ANALYZE will break out the time spent in each trigger. This is not in any released version, but if you're desperate you could load up a play server with your data and test.

regards, tom lane
Re: [PERFORM] Slow update
"Bendik R. Johansen" [EMAIL PROTECTED] writes:
> I am having a bit of trouble updating a single integer column. My table has around 10 columns and 260 000 records.
> update no.records set uid = 2;
> (uid is an integer. It has a btree index)
> This update takes more than 20 minutes to execute. Is this normal?

Takes about 20 seconds to update a table of that size on my machine... What PG version is this? We used to have some performance issues with very large numbers of equal keys in btree indexes. Does dropping the index make it faster? Are there foreign keys referencing this table from other tables?

regards, tom lane
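Both of Tom's suggestions can be tested without risk inside a transaction. The index name below follows the schema posted later in the thread, and the catalog query is a sketch for the 7.x era, where foreign keys were implemented as RI triggers:

```sql
-- Does the btree index account for the cost?  The DROP is rolled back.
BEGIN;
DROP INDEX records_uid_idx;
EXPLAIN ANALYZE UPDATE records SET uid = 2;
ROLLBACK;  -- restores the index

-- Which other tables have foreign keys referencing this one?
-- (7.x: RI triggers on the referencing tables point here via tgconstrrelid.)
SELECT tgname, tgrelid::regclass AS referencing_table
FROM pg_trigger
WHERE tgconstrrelid = 'records'::regclass;
```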
Re: [PERFORM] profiling postgresql queries?
On Tue, Apr 12, 2005 at 10:18:31AM -0400, Alex Turner wrote:
> Speaking of triggers... Is there any plan to speed up plpgsql triggers? Fairly simple crosstable insert triggers seem to slow my inserts to a crawl.

plpgsql is quite fast actually. if some triggers slow inserts too much, i guess you should be able to speed them up with some performance review of the trigger code.

depesz

-- 
hubert lubaczewski
Network Operations Center
eo Networks Sp. z o.o.
Re: [PERFORM] profiling postgresql queries?
On Tue, Apr 12, 2005 at 12:46:43PM +0200, hubert lubaczewski wrote:
> the problem is that both the inserts and updates operate on heavily-triggered tables. and it made me wonder - is there a way to tell how much time of backend was spent on triggers, index updates and so on? like:
> total query time: 1 second
> trigger a: 0.50 second
> trigger b: 0.25 second
> index update: 0.1 second

EXPLAIN ANALYZE in 8.1devel (CVS HEAD) prints a few statistics for triggers:

EXPLAIN ANALYZE UPDATE foo SET x = 10 WHERE x = 20;
                                                    QUERY PLAN
------------------------------------------------------------------------------------------------------------------
 Index Scan using foo_x_idx on foo  (cost=0.00..14.44 rows=10 width=22) (actual time=0.184..0.551 rows=7 loops=1)
   Index Cond: (x = 20)
 Trigger row_trig1: time=1.625 calls=7
 Trigger row_trig2: time=1.346 calls=7
 Trigger stmt_trig1: time=1.436 calls=1
 Total runtime: 9.659 ms
(6 rows)

8.1devel changes frequently (sometimes requiring initdb) and isn't suitable for production, but if the trigger statistics would be helpful then you could set up a test server and load a copy of your database into it. Just beware that because it's bleeding edge, it might destroy your data and it might behave differently than released versions.

-- 
Michael Fuhr
http://www.fuhr.org/~mfuhr/
Re: [PERFORM] profiling postgresql queries?
On Tue, Apr 12, 2005 at 08:43:59AM -0600, Michael Fuhr wrote:
> 8.1devel changes frequently (sometimes requiring initdb) and isn't suitable for production, but if the trigger statistics would be helpful then you could set up a test server and load a copy of your database into it. Just beware that because it's bleeding edge, it might destroy your data and it might behave differently than released versions.

great. this is exactly what i need. thanks for the hint.

depesz

-- 
hubert lubaczewski
Network Operations Center
eo Networks Sp. z o.o.
Re: [PERFORM] Slow update
"Bendik R. Johansen" [EMAIL PROTECTED] writes:
> Below is the schema for the table i will be using. I tried dropping the index, but it did not help.
> Indexes:
>     records_pkey PRIMARY KEY, btree (id)
>     records_category_idx gist (category)
>     records_cid_idx btree (cid)
>     records_uid_idx btree (uid)

Hmm ... my suspicion would fall first on the GIST index, to tell you the truth. Did you try dropping that one?

regards, tom lane
Re: [PERFORM] Slow update
Yes, I tried dropping it but it did not make a difference. Could the table be corrupt or something?

Well, the important thing is that I now know that this is not typical for PostgreSQL, so I will not have to rethink my whole project.

Thanks, so far.

On Apr 12, 2005, at 17:16, Tom Lane wrote:
> Hmm ... my suspicion would fall first on the GIST index, to tell you the truth. Did you try dropping that one?
>
> regards, tom lane
Re: [PERFORM] Slow update
"Bendik R. Johansen" [EMAIL PROTECTED] writes:
> Yes, I tried dropping it but it did not make a difference. Could the table be corrupt or something?

You didn't directly answer the question about whether there were foreign keys leading to this table. Checking foreign keys could be the problem, particularly if the referencing columns don't have indexes. Also, maybe the table is just bloated? What does VACUUM VERBOSE say about it?

regards, tom lane
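Besides reading the VACUUM VERBOSE output, a quick sanity check for bloat is to compare the table's physical size against what ~260 000 narrow rows ought to occupy (figures come from the planner statistics, so they are only as fresh as the last ANALYZE):

```sql
VACUUM VERBOSE records;  -- note the tuple/page counts it reports

-- Or inspect the stats directly: relpages is in 8 kB pages, so a
-- 260 000-row, ~10-column table occupying tens of thousands of pages
-- would suggest serious bloat.
SELECT relname, relpages, reltuples
FROM pg_class
WHERE relname = 'records';
```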
[PERFORM] performance hit for replication
I'd like to create a fail-over server in case of a problem. Ideally, it would be synchronized with our main database server, but I don't see any major problem with having a delay of up to 4 hours between syncs.

My database is a little shy of 10 Gigs, with much of that data being in an archived log table. Every day a batch job is run which adds 100,000 records over the course of 3 hours (the batch job does a lot of pre/post processing). Doing a restore of the db backup in vmware takes about 3 hours. I suspect a powerful server with a better disk setup could do it faster, but I don't have servers like that at my disposal, so I need to assume a worst-case of 3-4 hours is typical.

So, my question is this: My server currently works great, performance wise. I need to add fail-over capability, but I'm afraid that introducing a stressful task such as replication will hurt my server's performance. Is there any foundation to my fears? I don't need to replicate the archived log data because I can easily restore that in a separate step from the nightly backup if disaster occurs. Also, my database load is largely selects.

My application works great with PostgreSQL 7.3 and 7.4, but I'm currently using 7.3.

I'm eager to hear your thoughts and experiences,

-- 
Matthew Nuzum
[EMAIL PROTECTED]
www.followers.net - Makers of Elite Content Management System
Earn a commission of $100 - $750 by recommending Elite CMS. Visit http://www.elitecms.com/Contact_Us.partner for details.
Re: [PERFORM] performance hit for replication
> So, my question is this: My server currently works great, performance wise. I need to add fail-over capability, but I'm afraid that introducing a stressful task such as replication will hurt my server's performance. Is there any foundation to my fears? [...] I'm eager to hear your thoughts and experiences,

Well with Replicator you are going to take a pretty big hit initially during the full sync, but then you could use batch replication and only replicate every 2-3 hours. I am pretty sure Slony has similar capabilities.

Sincerely,
Joshua D. Drake
Re: [PERFORM] performance hit for replication
>> I'm eager to hear your thoughts and experiences,
>
> Well with replicator you are going to take a pretty big hit initially during the full sync but then you could use batch replication and only replicate every 2-3 hours.
>
> Sincerely,
> Joshua D. Drake

Thanks, I'm looking at your product and will contact you off list for more details soon.

Out of curiosity, does batch mode produce a lighter load? Live updating will provide maximum data security, and I'm most interested in how it affects the server.

-- 
Matthew Nuzum
[EMAIL PROTECTED]
www.followers.net - Makers of Elite Content Management System
Earn a commission of $100 - $750 by recommending Elite CMS. Visit http://www.elitecms.com/Contact_Us.partner for details.
Re: [PERFORM] performance hit for replication
On Tuesday 12 April 2005 09:25, Matthew Nuzum wrote:
> I'd like to create a fail-over server in case of a problem. Ideally, it would be synchronized with our main database server, but I don't see any major problem with having a delay of up to 4 hours between syncs. [...] I'm eager to hear your thoughts and experiences,

Your application sounds like a perfect candidate for Slony-I, http://www.slony.info . Using Slony-I I see about a 5-7% performance hit in terms of the number of insert/update/delete operations per second I can process. Depending on your network connection, DML volume, and the power of your backup server, the replica could be as little as 10 seconds behind the origin. A failover/switchover could occur in under 60 seconds.

-- 
Darcy Buskermolen
Wavefire Technologies Corp.
http://www.wavefire.com
ph: 250.717.0200
fx: 250.763.1759
Re: [PERFORM] performance hit for replication
Matthew Nuzum wrote:
>> Well with replicator you are going to take a pretty big hit initially during the full sync but then you could use batch replication and only replicate every 2-3 hours.
>
> Thanks, I'm looking at your product and will contact you off list for more details soon. Out of curiosity, does batch mode produce a lighter load?

Well, more of a burstier load. You could also do live replication, but Replicator requires some IO which VMWare just isn't that good at :)

Sincerely,
Joshua D. Drake
Re: [PERFORM] 4 way JOIN using aliases
On Tue, 12 Apr 2005 08:41:55 -0500, Dave Held wrote:
>> I run vacuumdb with the analyze option every morning via a cron job. In my ignorance I do not know if that is the same thing.
>
> Pass it an --analyze option if you aren't already.

Here is the command I have in the cron file:

vacuumdb --full --analyze --verbose --username dbuser --dbname ${IPA_DB} > ${IPA_LOG_DIR}/ipavcmdb.log 2>&1

If this performs the analyze as I thought it should, I do not know why the row count is so badly off.

Kind Regards,
Keith
Re: [PERFORM] 4 way JOIN using aliases
-----Original Message-----
From: Keith Worthington [mailto:[EMAIL PROTECTED]]
Sent: Tuesday, April 12, 2005 1:14 PM
To: Dave Held; PostgreSQL Perform
Subject: Re: [PERFORM] 4 way JOIN using aliases

>>> I run vacuumdb with the analyze option every morning via a cron job. In my ignorance I do not know if that is the same thing.
>>
>> Pass it an --analyze option if you aren't already.
>
> Here is the command I have in the cron file:
>
> vacuumdb --full --analyze --verbose --username dbuser --dbname ${IPA_DB} > ${IPA_LOG_DIR}/ipavcmdb.log 2>&1
>
> If this performs the analyze as I thought it should, I do not know why the row count is so badly off.

You may need to increase the statistics target for the relevant columns. Look at:

http://www.postgresql.org/docs/7.4/static/sql-altertable.html

In particular, the SET STATISTICS clause.

__
David B. Held
Software Engineer/Array Services Group
200 14th Ave. East, Sartell, MN 56377
320.534.3637 320.253.7800 800.752.8129
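Concretely, raising the target and re-analyzing looks something like this (the column name is hypothetical; pick whichever column the planner is misestimating):

```sql
-- Default statistics target in this era is 10; 100 samples more rows
-- per column and usually sharpens the row-count estimates.
ALTER TABLE tbl_current ALTER COLUMN some_column SET STATISTICS 100;
ANALYZE tbl_current;  -- the larger sample takes effect only after ANALYZE
```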
Re: [PERFORM] performance hit for replication
[EMAIL PROTECTED] (Joshua D. Drake) writes:
>> So, my question is this: My server currently works great, performance wise. I need to add fail-over capability, but I'm afraid that introducing a stressful task such as replication will hurt my server's performance. Is there any foundation to my fears? [...]
>
> Well with replicator you are going to take a pretty big hit initially during the full sync but then you could use batch replication and only replicate every 2-3 hours. I am pretty sure Slony has similar capabilities.

Yes, similar capabilities, similar pretty big hit. There's a downside to batch replication: some of the data structures grow in size if you have appreciable periods between batches.

-- 
(format nil [EMAIL PROTECTED] cbbrowne acm.org)
http://www.ntlug.org/~cbbrowne/slony.html
Rules of the Evil Overlord #78. I will not tell my Legions of Terror "And he must be taken alive!" The command will be: "And try to take him alive if it is reasonably practical." http://www.eviloverlord.com/
[PERFORM] Many connections lingering
Hi all,

I've just noticed an interesting behaviour with PGSQL. My software is made up of a few different modules that interact through a PGSQL database. Almost every query they do is an individual transaction, and there is a good reason for that: after every query done there is some processing done by those modules, and I didn't want to lock the database in a single transaction while that processing is happening.

Now, the interesting behaviour is this. I ran netstat on the machine where my software is running and I searched for tcp connections to my PGSQL server. What I found was hundreds of lines like this:

tcp        0      0 remus.dstc.monash:43001 remus.dstc.monash:39504 TIME_WAIT
tcp        0      0 remus.dstc.monash:43001 remus.dstc.monash:40720 TIME_WAIT
tcp        0      0 remus.dstc.monash:43001 remus.dstc.monash:39135 TIME_WAIT
tcp        0      0 remus.dstc.monash:43002 remus.dstc.monash:41631 TIME_WAIT
tcp        0      0 remus.dstc.monash:43001 remus.dstc.monash:41119 TIME_WAIT
tcp        0      0 remus.dstc.monash:43001 remus.dstc.monash:41311 TIME_WAIT
tcp        0      0 remus.dstc.monash:8649  remus.dstc.monash:41369 TIME_WAIT
tcp        0      0 remus.dstc.monash:43001 remus.dstc.monash:40479 TIME_WAIT
tcp        0      0 remus.dstc.monash:43001 remus.dstc.monash:39454 TIME_WAIT
tcp        0      0 remus.dstc.monash:43001 remus.dstc.monash:39133 TIME_WAIT
tcp        0      0 remus.dstc.monash:43002 remus.dstc.monash:41501 TIME_WAIT
tcp        0      0 remus.dstc.monash:43001 remus.dstc.monash:39132 TIME_WAIT
tcp        0      0 remus.dstc.monash:43001 remus.dstc.monash:41308 TIME_WAIT
tcp        0      0 remus.dstc.monash:43002 remus.dstc.monash:40667 TIME_WAIT
tcp        0      0 remus.dstc.monash:43001 remus.dstc.monash:41179 TIME_WAIT
tcp        0      0 remus.dstc.monash:43001 remus.dstc.monash:39323 TIME_WAIT
tcp        0      0 remus.dstc.monash:43001 remus.dstc.monash:41434 TIME_WAIT
tcp        0      0 remus.dstc.monash:43002 remus.dstc.monash:40282 TIME_WAIT
tcp        0      0 remus.dstc.monash:43001 remus.dstc.monash:41050 TIME_WAIT
tcp        0      0 remus.dstc.monash:43001 remus.dstc.monash:41177 TIME_WAIT
tcp        0      0 remus.dstc.monash:43001 remus.dstc.monash:39001 TIME_WAIT
tcp        0      0 remus.dstc.monash:43001 remus.dstc.monash:41305 TIME_WAIT
tcp        0      0 remus.dstc.monash:43001 remus.dstc.monash:38937 TIME_WAIT
tcp        0      0 remus.dstc.monash:43001 remus.dstc.monash:39128 TIME_WAIT
tcp        0      0 remus.dstc.monash:43001 remus.dstc.monash:40600 TIME_WAIT
tcp        0      0 remus.dstc.monash:43002 remus.dstc.monash:41624 TIME_WAIT
tcp        0      0 remus.dstc.monash:43002 remus.dstc.monash:39000 TIME_WAIT

Now could someone explain to me what this really means and what effect it might have on the machine (the same machine where I ran this query)? Would there eventually be a shortage of available ports if this kept growing?

The reason I am asking this is because one of my modules was raising an exception saying that a TCP connection could not be established to a server it needed to connect to. This may sound confusing so I'll try to explain. We have this scenario: there is a PGSQL server (postmaster) which is running on machine A. Then there is a custom server called DBServer which is running on machine B. This server accepts connections from a client called an Agent. An Agent may run on any machine out there and it would connect back to the DBServer asking for some information. The communication between these two is in the form of SQL queries. When an Agent sends a query to the DBServer, it passes that query to the machine A postmaster and then passes back the result of the query to that Agent.

The connection problem I mentioned in the paragraph above happens when an Agent tries to connect to the DBServer. So the only question I have here is: would those lingering socket connections above have any effect on the problem I am having? If not, I am sorry for bothering you all with this; if yes, I would like to know what I could do to avoid that.

Any help would be appreciated,

Regards,
Slavisa
Re: [PERFORM] [NOVICE] Many connections lingering
Slavisa Garic [EMAIL PROTECTED] writes:
> ... Now, the interesting behaviour is this. I ran netstat on the machine where my software is running and I searched for tcp connections to my PGSQL server. What i found was hundreds of lines like this:
> tcp        0      0 remus.dstc.monash:43001 remus.dstc.monash:39504 TIME_WAIT
> tcp        0      0 remus.dstc.monash:43001 remus.dstc.monash:40720 TIME_WAIT
> tcp        0      0 remus.dstc.monash:43001 remus.dstc.monash:39135 TIME_WAIT

This is a network-level issue: the TCP stack on your machine knows the connection has been closed, but it hasn't seen an acknowledgement of that fact from the other machine, and so it's remembering the connection number so that it can definitively say "that connection is closed" if the other machine asks. I'd guess that either you have a flaky network or there's something bogus about the TCP stack on the client machine. An occasional dropped FIN packet is no surprise, but hundreds of 'em are suspicious.

> Now could someone explain to me what this really means and what effect it might have on the machine (the same machine where I ran this query)? Would there eventually be a shortage of available ports if this kept growing? The reason I am asking this is because one of my modules was raising exception saying that TCP connection could not be establish to a server it needed to connect to.

That kinda sounds like flaky network to me, but I could be wrong. In any case, you'd have better luck asking kernel or network hackers about this than database weenies ;-)

regards, tom lane
Re: [PERFORM] [NOVICE] Many connections lingering
Tom Lane [EMAIL PROTECTED] writes:
> Slavisa Garic [EMAIL PROTECTED] writes:
>> ... What i found was hundreds of lines like this:
>> tcp        0      0 remus.dstc.monash:43001 remus.dstc.monash:39504 TIME_WAIT
>> [...]
>
> This is a network-level issue: the TCP stack on your machine knows the connection has been closed, but it hasn't seen an acknowledgement of that fact from the other machine, and so it's remembering the connection number so that it can definitively say that connection is closed if the other machine asks. I'd guess that either you have a flaky network or there's something bogus about the TCP stack on the client machine. An occasional dropped FIN packet is no surprise, but hundreds of 'em are suspicious.

No, what Tom's describing is a different pair of states called FIN_WAIT_1 and FIN_WAIT_2. TIME_WAIT isn't waiting for a packet, just a timeout. This is to prevent any delayed packets from earlier in the connection causing problems with a subsequent good connection. Otherwise you could get data from the old connection mixed in with the data for later ones.

> Now could someone explain to me what this really means and what effect it might have on the machine (the same machine where I ran this query)? Would there eventually be a shortage of available ports if this kept growing? The reason I am asking this is because one of my modules was raising exception saying that TCP connection could not be establish to a server it needed to connect to.

What it does indicate is that each query you're making is probably not just a separate transaction but a separate TCP connection. That's probably not necessary. If you have a single long-lived process you could just keep the TCP connection open and issue a COMMIT after each transaction. That's what I would recommend doing.

Unless you have thousands of these TIME_WAIT connections they probably aren't actually directly the cause of your failure to establish connections. But yes it can happen. What's more likely happening here is that you're stressing the server by issuing so many connection attempts that you're triggering some bug, either in the TCP stack or Postgres, that is causing some connection attempts to not be handled properly. I'm skeptical that there's a bug in Postgres, since lots of people do in fact run web servers configured to open a new connection for every page. But this wouldn't happen to be a Windows server, would it? Perhaps the networking code in that port doesn't do the right thing in this case?

-- 
greg
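To keep an eye on how many sockets are stuck in TIME_WAIT, one can count them directly from the Linux TCP table rather than grepping netstat output each time. This is a sketch equivalent to `netstat -ant | grep TIME_WAIT | wc -l`, and it is Linux-specific (it parses `/proc/net/tcp`, where TIME_WAIT is state code hex 06):

```python
# Count sockets currently in TIME_WAIT by parsing /proc/net/tcp (Linux-only).
TIME_WAIT = "06"  # hex state code for TIME_WAIT in /proc/net/tcp

def count_time_wait(path="/proc/net/tcp"):
    count = 0
    with open(path) as f:
        next(f)  # skip the header line
        for line in f:
            fields = line.split()
            # field 3 ("st") holds the connection state as two hex digits
            if len(fields) > 3 and fields[3] == TIME_WAIT:
                count += 1
    return count

if __name__ == "__main__":
    print(count_time_wait())
```

Running this in a loop while the Agents are active would show whether the TIME_WAIT population is actually climbing toward port exhaustion.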
Re: [PERFORM] [NOVICE] Many connections lingering
Greg Stark [EMAIL PROTECTED] writes:
> Tom Lane [EMAIL PROTECTED] writes:
>> This is a network-level issue: the TCP stack on your machine knows the connection has been closed, but it hasn't seen an acknowledgement of that fact from the other machine, and so it's remembering the connection number so that it can definitively say that connection is closed if the other machine asks.
>
> No, what Tom's describing is a different pair of states called FIN_WAIT_1 and FIN_WAIT_2. TIME_WAIT isn't waiting for a packet, just a timeout.

D'oh, obviously it's been too many years since I read Stevens ;-)

So AFAICS this status report doesn't actually indicate any problem, other than massively profligate use of separate connections. Greg's correct that there's some risk of resource exhaustion at the TCP level, but it's not very likely. I'd be more concerned about the amount of resources wasted in starting a separate Postgres backend for each connection. PG backends are fairly heavyweight objects --- if you are at all concerned about performance, you want to get a decent number of queries done in each connection. Consider using a connection pooler.

regards, tom lane
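Tom's suggestion in miniature: rather than opening a fresh backend per query, cache and reuse connections. Below is a toy sketch of the pooling idea only; a real deployment would use an existing pooler, and the stub `FakeConn` class is purely hypothetical, standing in for whatever connection object your driver provides so the example is self-contained:

```python
import threading

class SimplePool:
    """Minimal connection-reuse sketch: hand out cached idle connections
    instead of opening a new one for every query."""

    def __init__(self, connect_fn, maxsize=5):
        self._connect = connect_fn   # factory that opens a new connection
        self._maxsize = maxsize      # cap on idle connections kept around
        self._idle = []
        self._lock = threading.Lock()

    def acquire(self):
        with self._lock:
            if self._idle:
                return self._idle.pop()   # reuse an idle connection
        return self._connect()            # none idle: open a new one

    def release(self, conn):
        with self._lock:
            if len(self._idle) < self._maxsize:
                self._idle.append(conn)   # keep it for the next caller
                return
        conn.close()                      # pool full: really close it

# Stub "connection" so the sketch runs standalone (hypothetical).
class FakeConn:
    opened = 0
    def __init__(self):
        FakeConn.opened += 1
        self.closed = False
    def close(self):
        self.closed = True

pool = SimplePool(FakeConn, maxsize=2)
c1 = pool.acquire()
pool.release(c1)
c2 = pool.acquire()      # reuses c1 instead of opening a second socket
print(FakeConn.opened)   # -> 1
print(c1 is c2)          # -> True
```

The payoff is exactly what Tom describes: many queries amortized over one backend startup, and far fewer sockets cycling through TIME_WAIT.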
Re: [PERFORM] [NOVICE] Many connections lingering
Hi Greg,

This is not a Windows server. Both server and client are the same machine (done for testing purposes) and it is a Fedora RC2 machine. This also happens on a debian server and client, in which case they were two separate machines.

There are thousands (2+) of these waiting around and each one of them disappears after 50-ish seconds. I tried the psql command line and monitored that connection in netstat. After I did a graceful exit (\quit) the connection changed to TIME_WAIT and it was sitting there for around 50 seconds.

I thought I could do what you suggested, having one connection and making each query a full BEGIN/QUERY/COMMIT transaction, but I thought I could avoid that :). This is a serious problem for me as there are multiple users using our software on our server and I would want to avoid having connections open for a long time.

In the scenario mentioned below I haven't explained the magnitude of the communications happening between the Agents and the DBServer. There could possibly be 100 or more Agents per experiment, per user, running on remote machines at the same time, hence we need short transactions/pgsql connections. Agents need a reliable connection because failure to connect could mean a loss of computation results that were gathered over long periods of time.

Thanks for the help by the way :),

Regards,
Slavisa

On 12 Apr 2005 23:27:09 -0400, Greg Stark [EMAIL PROTECTED] wrote:
> What it does indicate is that each query you're making is probably not just a separate transaction but a separate TCP connection. That's probably not necessary. If you have a single long-lived process you could just keep the TCP connection open and issue a COMMIT after each transaction. That's what I would recommend doing.
> [rest of quoted message trimmed]