[PERFORM] profiling postgresql queries?

2005-04-12 Thread hubert lubaczewski
hi
i'm not totally sure i should ask on this mailing list - so if you think
i should ask someplace else, please let me know.

the problem i have is that specific queries (inserts and updates) take a
long time to run.

of course i do vacuum analyze frequently. i also use explain analyze on
queries.

the problem is that both the inserts and the updates operate on
heavily-triggered tables.
and it made me wonder - is there a way to tell how much of the backend's
time was spent on triggers, index updates and so on?
like:
total query time: 1 second
trigger a: 0.50 second
trigger b: 0.25 second
index update: 0.1 second

something like this.

is it possible?
will it ever be possible?

hubert

-- 
hubert lubaczewski
Network Operations Center
eo Networks Sp. z o.o.

---(end of broadcast)---
TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED]


Re: [PERFORM] 4 way JOIN using aliases

2005-04-12 Thread Dave Held
 -Original Message-
 From: Keith Worthington [mailto:[EMAIL PROTECTED]
 Sent: Monday, April 11, 2005 7:44 PM
 To: Neil Conway
 Cc: PostgreSQL Perform
 Subject: Re: [PERFORM] 4 way JOIN using aliases
 
 Neil Conway wrote:
  Keith Worthington wrote:
  
 -  Seq Scan on tbl_current  (cost=0.00..1775.57 rows=76457 width=31) (actual time=22.870..25.024 rows=605 loops=1)
  
  
  This rowcount is way off -- have you run ANALYZE recently?
  [...]
 
 I run vacuumdb with the analyze option every morning via a 
 cron job.  In my ignorance I do not know if that is the same
 thing.

Pass it an --analyze option if you aren't already.
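
(For what it's worth, vacuumdb --analyze just issues the equivalent SQL,
so from psql the same thing would be:

    VACUUM ANALYZE;             -- every table in the database
    VACUUM ANALYZE tbl_current; -- or just one table

so yes, it is the same thing, provided the flag is actually there.)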

__
David B. Held
Software Engineer/Array Services Group
200 14th Ave. East,  Sartell, MN 56377
320.534.3637 320.253.7800 800.752.8129



Re: [PERFORM] profiling postgresql queries?

2005-04-12 Thread Tom Lane
hubert lubaczewski [EMAIL PROTECTED] writes:
 and it made me wonder - is there a way to tell how much of the backend's
 time was spent on triggers, index updates and so on?

In CVS tip, EXPLAIN ANALYZE will break out the time spent in each
trigger.  This is not in any released version, but if you're desperate
you could load up a play server with your data and test.

regards, tom lane



Re: [PERFORM] Slow update

2005-04-12 Thread Tom Lane
Bendik R.Johansen [EMAIL PROTECTED] writes:
 I am having a bit of trouble updating a single integer column.
 My table has around 10 columns and 260 000 records.

 update no.records set uid = 2;
 (uid is an integer. It has a btree index)

 This update takes more than 20 minutes to execute. Is this normal?

Takes about 20 seconds to update a table of that size on my machine...

What PG version is this?  We used to have some performance issues with
very large numbers of equal keys in btree indexes.  Does dropping the
index make it faster?

Are there foreign keys referencing this table from other tables?
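
A quick way to test both theories in one throwaway session (index name is
a guess --- adjust to match your schema; the ROLLBACK undoes everything):

    BEGIN;
    DROP INDEX records_uid_idx;     -- temporarily get rid of the uid index
    EXPLAIN ANALYZE UPDATE no.records SET uid = 2;
    ROLLBACK;                       -- undoes both the DROP and the UPDATE

If it's fast without the index, the btree is the problem; if it's still
slow, look hard at foreign keys referencing the table.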

regards, tom lane



Re: [PERFORM] profiling postgresql queries?

2005-04-12 Thread hubert lubaczewski
On Tue, Apr 12, 2005 at 10:18:31AM -0400, Alex Turner wrote:
 Speaking of triggers...
 Is there any plan to speed up plpgsql triggers?  Fairly simple
 crosstable insert triggers seem to slow my inserts to a crawl.

plpgsql is quite fast actually. if some triggers slow inserts too much,
i guess you should be able to speed them up with a performance review of
the trigger code.
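
for reference, the kind of cross-table insert trigger being discussed
looks something like this (all names made up):

    CREATE FUNCTION copy_to_log() RETURNS trigger AS '
    BEGIN
        -- mirror every insert into a second table
        INSERT INTO insert_log (main_id, logged_at) VALUES (NEW.id, now());
        RETURN NEW;
    END;
    ' LANGUAGE plpgsql;

    CREATE TRIGGER main_copy_trig AFTER INSERT ON main_table
        FOR EACH ROW EXECUTE PROCEDURE copy_to_log();

the usual suspects are queries inside the trigger body that can't use an
index on the table they touch.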

depesz

-- 
hubert lubaczewski
Network Operations Center
eo Networks Sp. z o.o.




Re: [PERFORM] profiling postgresql queries?

2005-04-12 Thread Michael Fuhr
On Tue, Apr 12, 2005 at 12:46:43PM +0200, hubert lubaczewski wrote:
 
 the problem is that both the inserts and the updates operate on
 heavily-triggered tables.
 and it made me wonder - is there a way to tell how much of the backend's
 time was spent on triggers, index updates and so on?
 like:
 total query time: 1 second
 trigger a: 0.50 second
 trigger b: 0.25 second
 index update: 0.1 second

EXPLAIN ANALYZE in 8.1devel (CVS HEAD) prints a few statistics for
triggers:

EXPLAIN ANALYZE UPDATE foo SET x = 10 WHERE x = 20;
                                                   QUERY PLAN
-----------------------------------------------------------------------------------------------------------------
 Index Scan using foo_x_idx on foo  (cost=0.00..14.44 rows=10 width=22) (actual time=0.184..0.551 rows=7 loops=1)
   Index Cond: (x = 20)
 Trigger row_trig1: time=1.625 calls=7
 Trigger row_trig2: time=1.346 calls=7
 Trigger stmt_trig1: time=1.436 calls=1
 Total runtime: 9.659 ms
(6 rows)

8.1devel changes frequently (sometimes requiring initdb) and isn't
suitable for production, but if the trigger statistics would be
helpful then you could set up a test server and load a copy of your
database into it.  Just beware that because it's bleeding edge, it
might destroy your data and it might behave differently than released
versions.

-- 
Michael Fuhr
http://www.fuhr.org/~mfuhr/



Re: [PERFORM] profiling postgresql queries?

2005-04-12 Thread hubert lubaczewski
On Tue, Apr 12, 2005 at 08:43:59AM -0600, Michael Fuhr wrote:
 8.1devel changes frequently (sometimes requiring initdb) and isn't
 suitable for production, but if the trigger statistics would be
 helpful then you could set up a test server and load a copy of your
 database into it.  Just beware that because it's bleeding edge, it
 might destroy your data and it might behave differently than released
 versions.

great. this is exactly what i need. thanks for the hint.

depesz

-- 
hubert lubaczewski
Network Operations Center
eo Networks Sp. z o.o.




Re: [PERFORM] Slow update

2005-04-12 Thread Tom Lane
Bendik R. Johansen [EMAIL PROTECTED] writes:
 Below is the schema for the table I will be using. I tried dropping the
 index, but it did not help.

 Indexes:
  records_pkey PRIMARY KEY, btree (id)
  records_category_idx gist (category)
  records_cid_idx btree (cid)
  records_uid_idx btree (uid)

Hmm ... my suspicion would fall first on the GIST index, to tell you the
truth.  Did you try dropping that one?

regards, tom lane



Re: [PERFORM] Slow update

2005-04-12 Thread Bendik R . Johansen
Yes, I tried dropping it but it did not make a difference.
Could the table be corrupt or something?
Well, the important thing is that I now know that this is not typical 
for PostgreSQL, so I will not have to rethink my whole project.

Thanks, so far.
On Apr 12, 2005, at 17:16, Tom Lane wrote:
> Bendik R. Johansen [EMAIL PROTECTED] writes:
>> Below is the schema for the table I will be using. I tried dropping the
>> index, but it did not help.
>>
>> Indexes:
>>  records_pkey PRIMARY KEY, btree (id)
>>  records_category_idx gist (category)
>>  records_cid_idx btree (cid)
>>  records_uid_idx btree (uid)
>
> Hmm ... my suspicion would fall first on the GIST index, to tell you the
> truth.  Did you try dropping that one?
>
> regards, tom lane



Re: [PERFORM] Slow update

2005-04-12 Thread Tom Lane
Bendik R. Johansen [EMAIL PROTECTED] writes:
 Yes, I tried dropping it but it did not make a difference.
 Could the table be corrupt or something?

You didn't directly answer the question about whether there were foreign
keys leading to this table.  Checking foreign keys could be the problem,
particularly if the referencing columns don't have indexes.

Also, maybe the table is just bloated?  What does VACUUM VERBOSE say
about it?
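
For example (names hypothetical):

    VACUUM VERBOSE records;   -- look at how many dead row versions it reports

    -- and if some other table references this one, index the referencing
    -- column, otherwise each FK check may scan that whole table:
    CREATE INDEX other_table_record_id_idx ON other_table (record_id);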

regards, tom lane



[PERFORM] performance hit for replication

2005-04-12 Thread Matthew Nuzum
I'd like to create a fail-over server in case of a problem. Ideally, it
would be synchronized with our main database server, but I don't see any
major problem with having a delay of up to 4 hours between syncs.

My database is a little shy of 10 Gigs, with much of that data being in an
archived log table. Every day a batch job is run which adds 100,000 records
over the course of 3 hours (the batch job does a lot of pre/post
processing).

Doing a restore of the db backup in vmware takes about 3 hours. I suspect a
powerful server with a better disk setup could do it faster, but I don't
have servers like that at my disposal, so I need to assume worst-case of 3-4
hours is typical.

So, my question is this: My server currently works great, performance wise.
I need to add fail-over capability, but I'm afraid that introducing a
stressful task such as replication will hurt my server's performance. Is
there any foundation to my fears? I don't need to replicate the archived log
data because I can easily restore that in a separate step from the nightly
backup if disaster occurs. Also, my database load is largely selects. My
application works great with PostgreSQL 7.3 and 7.4, but I'm currently using
7.3. 

I'm eager to hear your thoughts and experiences,
-- 
Matthew Nuzum [EMAIL PROTECTED]
www.followers.net - Makers of Elite Content Management System
Earn a commission of $100 - $750 by recommending Elite CMS. Visit
http://www.elitecms.com/Contact_Us.partner for details.





Re: [PERFORM] performance hit for replication

2005-04-12 Thread Joshua D. Drake

> So, my question is this: My server currently works great, performance wise.
> I need to add fail-over capability, but I'm afraid that introducing a
> stressful task such as replication will hurt my server's performance. Is
> there any foundation to my fears? I don't need to replicate the archived log
> data because I can easily restore that in a separate step from the nightly
> backup if disaster occurs. Also, my database load is largely selects. My
> application works great with PostgreSQL 7.3 and 7.4, but I'm currently using
> 7.3.
>
> I'm eager to hear your thoughts and experiences,
 

Well with replicator you are going to take a pretty big hit initially
during the full sync, but then you could use batch replication and only
replicate every 2-3 hours.

I am pretty sure Slony has similar capabilities.
Sincerely,
Joshua D. Drake



Re: [PERFORM] performance hit for replication

2005-04-12 Thread Matthew Nuzum
 I'm eager to hear your thoughts and experiences,
 
 
 Well with replicator you are going to take a pretty big hit initially
 during the full
 sync but then you could use batch replication and only replicate every
 2-3 hours.
 
 Sincerely,
 
 Joshua D. Drake
 

Thanks, I'm looking at your product and will contact you off list for more
details soon.

Out of curiosity, does batch mode produce a lighter load? Live updating will
provide maximum data security, and I'm most interested in how it affects the
server.

-- 
Matthew Nuzum [EMAIL PROTECTED]
www.followers.net - Makers of Elite Content Management System
Earn a commission of $100 - $750 by recommending Elite CMS. Visit
http://www.elitecms.com/Contact_Us.partner for details.




Re: [PERFORM] performance hit for replication

2005-04-12 Thread Darcy Buskermolen
On Tuesday 12 April 2005 09:25, Matthew Nuzum wrote:
 I'd like to create a fail-over server in case of a problem. Ideally, it
 would be synchronized with our main database server, but I don't see any
 major problem with having a delay of up to 4 hours between syncs.

 My database is a little shy of 10 Gigs, with much of that data being in an
 archived log table. Every day a batch job is run which adds 100,000 records
 over the course of 3 hours (the batch job does a lot of pre/post
 processing).

 Doing a restore of the db backup in vmware takes about 3 hours. I suspect a
 powerful server with a better disk setup could do it faster, but I don't
 have servers like that at my disposal, so I need to assume worst-case of
 3-4 hours is typical.

 So, my question is this: My server currently works great, performance wise.
 I need to add fail-over capability, but I'm afraid that introducing a
 stressful task such as replication will hurt my server's performance. Is
 there any foundation to my fears? I don't need to replicate the archived
 log data because I can easily restore that in a separate step from the
 nightly backup if disaster occurs. Also, my database load is largely
 selects. My application works great with PostgreSQL 7.3 and 7.4, but I'm
 currently using 7.3.

 I'm eager to hear your thoughts and experiences,

Your application sounds like a perfect candidate for Slony-I 
http://www.slony.info . Using Slony-I I see about a 5-7% performance hit in 
terms of the number of inserts/updates/deletes per second I can process.

Depending on your network connection, DML volume, and the power of your 
backup server, the replica could be as little as 10 seconds behind the 
origin.  A failover/switchover could occur in under 60 seconds.

-- 
Darcy Buskermolen
Wavefire Technologies Corp.

http://www.wavefire.com
ph: 250.717.0200
fx: 250.763.1759



Re: [PERFORM] performance hit for replication

2005-04-12 Thread Joshua D. Drake
Matthew Nuzum wrote:
>>> I'm eager to hear your thoughts and experiences,
>>
>> Well with replicator you are going to take a pretty big hit initially
>> during the full sync but then you could use batch replication and only
>> replicate every 2-3 hours.
>> Sincerely,
>> Joshua D. Drake
>
> Thanks, I'm looking at your product and will contact you off list for more
> details soon.
> Out of curiosity, does batch mode produce a lighter load?

Well, more of a bursty load. You could also do live replication, but
replicator requires some IO which VMWare just isn't that good at :)

Sincerely,
Joshua D. Drake


Re: [PERFORM] 4 way JOIN using aliases

2005-04-12 Thread Keith Worthington
On Tue, 12 Apr 2005 08:41:55 -0500, Dave Held wrote
  -Original Message-
  From: Keith Worthington [mailto:[EMAIL PROTECTED]
  Sent: Monday, April 11, 2005 7:44 PM
  To: Neil Conway
  Cc: PostgreSQL Perform
  Subject: Re: [PERFORM] 4 way JOIN using aliases
  
  Neil Conway wrote:
   Keith Worthington wrote:
   
  -  Seq Scan on tbl_current  (cost=0.00..1775.57 rows=76457 width=31) (actual time=22.870..25.024 rows=605 loops=1)
   
   
   This rowcount is way off -- have you run ANALYZE recently?
   [...]
  
  I run vacuumdb with the analyze option every morning via a 
  cron job.  In my ignorance I do not know if that is the same
  thing.
 
 Pass it an --analyze option if you aren't already.
 
 __
 David B. Held
 

Here is the command I have in the cron file.

vacuumdb --full --analyze --verbose --username dbuser --dbname ${IPA_DB} > ${IPA_LOG_DIR}/ipavcmdb.log 2>&1

If this performs the analyze as I thought it should, I do not know why the
row count is so badly off.

Kind Regards,
Keith



Re: [PERFORM] 4 way JOIN using aliases

2005-04-12 Thread Dave Held
 -Original Message-
 From: Keith Worthington [mailto:[EMAIL PROTECTED]
 Sent: Tuesday, April 12, 2005 1:14 PM
 To: Dave Held; PostgreSQL Perform
 Subject: Re: [PERFORM] 4 way JOIN using aliases
 
   I run vacuumdb with the analyze option every morning via a 
   cron job.  In my ignorance I do not know if that is the same
   thing.
  
  Pass it an --analyze option if you aren't already.
 
 Here is the command I have in the cron file.
 
 vacuumdb --full --analyze --verbose --username dbuser --dbname ${IPA_DB} > ${IPA_LOG_DIR}/ipavcmdb.log 2>&1
 
 If this performs the analyze as I thought it should, I do not know why
 the row count is so badly off.

You may need to increase the statistics target for the relevant
columns.  Look at:

http://www.postgresql.org/docs/7.4/static/sql-altertable.html

In particular, the SET STATISTICS clause.
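
For example (column name hypothetical -- use whichever column the planner
misestimates):

    ALTER TABLE tbl_current ALTER COLUMN item_id SET STATISTICS 100;
    ANALYZE tbl_current;

The default target is 10; a larger value makes ANALYZE sample more rows
for that column, at the cost of slightly slower planning.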

__
David B. Held
Software Engineer/Array Services Group
200 14th Ave. East,  Sartell, MN 56377
320.534.3637 320.253.7800 800.752.8129



Re: [PERFORM] performance hit for replication

2005-04-12 Thread Chris Browne
[EMAIL PROTECTED] (Joshua D. Drake) writes:
So, my question is this: My server currently works great,
performance wise.  I need to add fail-over capability, but I'm
afraid that introducing a stressful task such as replication will
hurt my server's performance. Is there any foundation to my fears? I
don't need to replicate the archived log data because I can easily
restore that in a separate step from the nightly backup if disaster
occurs. Also, my database load is largely selects. My application
works great with PostgreSQL 7.3 and 7.4, but I'm currently using
7.3.

I'm eager to hear your thoughts and experiences,

 Well with replicator you are going to take a pretty big hit
 initially during the full sync but then you could use batch
 replication and only replicate every 2-3 hours.

 I am pretty sure Slony has similar capabilities.

Yes, similar capabilities, similar pretty big hit.

There's a downside to batch replication: some of the data
structures grow in size if you have appreciable periods between
batches.
-- 
(format nil [EMAIL PROTECTED] cbbrowne acm.org)
http://www.ntlug.org/~cbbrowne/slony.html
Rules of the Evil Overlord #78.  I will not tell my Legions of Terror
And he must  be taken alive! The command will be:  ``And try to take
him alive if it is reasonably practical.''
http://www.eviloverlord.com/



[PERFORM] Many connections lingering

2005-04-12 Thread Slavisa Garic
Hi all,

I've just noticed an interesting behaviour with PGSQL. My software is
made up of a few different modules that interact through a PGSQL database.
Almost every query they do is an individual transaction, and there is a
good reason for that. After every query there is some processing
done by those modules, and I didn't want to lock the database in a
single transaction while that processing is happening. Now, the
interesting behaviour is this. I ran netstat on the machine where
my software is running and I searched for tcp connections to my PGSQL
server. What I found was hundreds of lines like this:

tcp0  0 remus.dstc.monash:43001 remus.dstc.monash:39504 TIME_WAIT   
tcp0  0 remus.dstc.monash:43001 remus.dstc.monash:40720 TIME_WAIT   
tcp0  0 remus.dstc.monash:43001 remus.dstc.monash:39135 TIME_WAIT   
tcp0  0 remus.dstc.monash:43002 remus.dstc.monash:41631 TIME_WAIT   
tcp0  0 remus.dstc.monash:43001 remus.dstc.monash:41119 TIME_WAIT   
tcp0  0 remus.dstc.monash:43001 remus.dstc.monash:41311 TIME_WAIT   
tcp0  0 remus.dstc.monash.:8649 remus.dstc.monash:41369 TIME_WAIT   
tcp0  0 remus.dstc.monash:43001 remus.dstc.monash:40479 TIME_WAIT   
tcp0  0 remus.dstc.monash:43001 remus.dstc.monash:39454 TIME_WAIT   
tcp0  0 remus.dstc.monash:43001 remus.dstc.monash:39133 TIME_WAIT   
tcp0  0 remus.dstc.monash:43002 remus.dstc.monash:41501 TIME_WAIT   
tcp0  0 remus.dstc.monash:43001 remus.dstc.monash:39132 TIME_WAIT   
tcp0  0 remus.dstc.monash:43001 remus.dstc.monash:41308 TIME_WAIT   
tcp0  0 remus.dstc.monash:43002 remus.dstc.monash:40667 TIME_WAIT   
tcp0  0 remus.dstc.monash:43001 remus.dstc.monash:41179 TIME_WAIT   
tcp0  0 remus.dstc.monash:43001 remus.dstc.monash:39323 TIME_WAIT   
tcp0  0 remus.dstc.monash:43001 remus.dstc.monash:41434 TIME_WAIT   
tcp0  0 remus.dstc.monash:43002 remus.dstc.monash:40282 TIME_WAIT   
tcp0  0 remus.dstc.monash:43001 remus.dstc.monash:41050 TIME_WAIT   
tcp0  0 remus.dstc.monash:43001 remus.dstc.monash:41177 TIME_WAIT   
tcp0  0 remus.dstc.monash:43001 remus.dstc.monash:39001 TIME_WAIT   
tcp0  0 remus.dstc.monash:43001 remus.dstc.monash:41305 TIME_WAIT   
tcp0  0 remus.dstc.monash:43001 remus.dstc.monash:38937 TIME_WAIT   
tcp0  0 remus.dstc.monash:43001 remus.dstc.monash:39128 TIME_WAIT   
tcp0  0 remus.dstc.monash:43001 remus.dstc.monash:40600 TIME_WAIT   
tcp0  0 remus.dstc.monash:43002 remus.dstc.monash:41624 TIME_WAIT   
tcp0  0 remus.dstc.monash:43002 remus.dstc.monash:39000 TIME_WAIT 

Now could someone explain to me what this really means and what effect
it might have on the machine (the same machine where I ran this
query)? Would there eventually be a shortage of available ports if
this kept growing? The reason I am asking is that one of my
modules was raising an exception saying that a TCP connection could not be
established to a server it needed to connect to. This may sound
confusing so I'll try to explain.

We have this scenario: there is a PGSQL server (postmaster) which is
running on machine A. Then there is a custom server called DBServer
which is running on machine B. This server accepts connections from a
client called an Agent. An Agent may run on any machine out there, and it
connects back to the DBServer asking for some information. The
communication between these two is in the form of SQL queries. When an
Agent sends a query to the DBServer, it passes that query to the machine A
postmaster and then passes the result of the query back to that Agent.
The connection problem I mentioned in the paragraph above happens when
an Agent tries to connect to the DBServer.

So the only question I have here is: would those lingering socket
connections above have any effect on the problem I am having? If not, I
am sorry for bothering you all with this; if yes, I would like to know
what I could do to avoid it.

Any help would be appreciated,
Regards,
Slavisa



Re: [PERFORM] [NOVICE] Many connections lingering

2005-04-12 Thread Tom Lane
Slavisa Garic [EMAIL PROTECTED] writes:
 ... Now, the
 interesting behaviour is this. I ran netstat on the machine where
 my software is running and I searched for tcp connections to my PGSQL
 server. What I found was hundreds of lines like this:

 tcp0  0 remus.dstc.monash:43001 remus.dstc.monash:39504 TIME_WAIT
 tcp0  0 remus.dstc.monash:43001 remus.dstc.monash:40720 TIME_WAIT
 tcp0  0 remus.dstc.monash:43001 remus.dstc.monash:39135 TIME_WAIT

This is a network-level issue: the TCP stack on your machine knows the
connection has been closed, but it hasn't seen an acknowledgement of
that fact from the other machine, and so it's remembering the connection
number so that it can definitively say that connection is closed if
the other machine asks.  I'd guess that either you have a flaky network
or there's something bogus about the TCP stack on the client machine.
An occasional dropped FIN packet is no surprise, but hundreds of 'em
are suspicious.

 Now could someone explain to me what this really means and what effect
 it might have on the machine (the same machine where I ran this
 query)? Would there eventually be a shortage of available ports if
 this kept growing? The reason I am asking this is because one of my
 modules was raising exception saying that TCP connection could not be
 establish to a server it needed to connect to.

That kinda sounds like flaky network to me, but I could be wrong.
In any case, you'd have better luck asking kernel or network hackers
about this than database weenies ;-)

regards, tom lane



Re: [PERFORM] [NOVICE] Many connections lingering

2005-04-12 Thread Greg Stark

Tom Lane [EMAIL PROTECTED] writes:

 Slavisa Garic [EMAIL PROTECTED] writes:
  ... Now, the
  interesting behaviour is this. I ran netstat on the machine where
  my software is running and I searched for tcp connections to my PGSQL
  server. What I found was hundreds of lines like this:
 
  tcp0  0 remus.dstc.monash:43001 remus.dstc.monash:39504 
  TIME_WAIT
  tcp0  0 remus.dstc.monash:43001 remus.dstc.monash:40720 
  TIME_WAIT
  tcp0  0 remus.dstc.monash:43001 remus.dstc.monash:39135 
  TIME_WAIT
 
 This is a network-level issue: the TCP stack on your machine knows the
 connection has been closed, but it hasn't seen an acknowledgement of
 that fact from the other machine, and so it's remembering the connection
 number so that it can definitively say that connection is closed if
 the other machine asks.  I'd guess that either you have a flaky network
 or there's something bogus about the TCP stack on the client machine.
 An occasional dropped FIN packet is no surprise, but hundreds of 'em
 are suspicious.

No, what Tom's describing is a different pair of states called FIN_WAIT_1 and
FIN_WAIT_2. TIME_WAIT isn't waiting for a packet, just a timeout. This is to
prevent any delayed packets from earlier in the connection causing problems
with a subsequent good connection. Otherwise you could get data from the old
connection mixed in the data for later ones.

  Now could someone explain to me what this really means and what effect
  it might have on the machine (the same machine where I ran this
  query)? Would there eventually be a shortage of available ports if
  this kept growing? The reason I am asking this is because one of my
  modules was raising exception saying that TCP connection could not be
  establish to a server it needed to connect to.

What it does indicate is that each query you're making is probably not just a
separate transaction but a separate TCP connection. That's probably not
necessary. If you have a single long-lived process you could just keep the TCP
connection open and issue a COMMIT after each transaction. That's what I would
recommend doing.
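
That is, instead of connect/query/disconnect for every query, something
like (queries hypothetical):

    -- one long-lived connection, many short transactions:
    BEGIN;
    UPDATE results SET status = 'done' WHERE agent_id = 42;
    COMMIT;

    BEGIN;
    SELECT count(*) FROM results WHERE status = 'done';
    COMMIT;

Each unit of work is still its own transaction, but you pay the TCP and
backend startup cost only once.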


Unless you have thousands of these TIME_WAIT connections they probably aren't
actually directly the cause of your failure to establish connections. But yes
it can happen. 

What's more likely happening here is that you're stressing the server by
issuing so many connection attempts that you're triggering some bug, either in
the TCP stack or Postgres that is causing some connection attempts to not be
handled properly.

I'm skeptical that there's a bug in Postgres since lots of people do in fact
run web servers configured to open a new connection for every page. But this
wouldn't happen to be a Windows server would it? Perhaps the networking code
in that port doesn't do the right thing in this case? 

-- 
greg




Re: [PERFORM] [NOVICE] Many connections lingering

2005-04-12 Thread Tom Lane
Greg Stark [EMAIL PROTECTED] writes:
 Tom Lane [EMAIL PROTECTED] writes:
 This is a network-level issue: the TCP stack on your machine knows the
 connection has been closed, but it hasn't seen an acknowledgement of
 that fact from the other machine, and so it's remembering the connection
 number so that it can definitively say that connection is closed if
 the other machine asks.

 No, what Tom's describing is a different pair of states called FIN_WAIT_1 and
 FIN_WAIT_2. TIME_WAIT isn't waiting for a packet, just a timeout.

D'oh, obviously it's been too many years since I read Stevens ;-)

So AFAICS this status report doesn't actually indicate any problem,
other than massively profligate use of separate connections.  Greg's
correct that there's some risk of resource exhaustion at the TCP level,
but it's not very likely.  I'd be more concerned about the amount of
resources wasted in starting a separate Postgres backend for each
connection.  PG backends are fairly heavyweight objects --- if you
are at all concerned about performance, you want to get a decent number
of queries done in each connection.  Consider using a connection pooler.

regards, tom lane



Re: [PERFORM] [NOVICE] Many connections lingering

2005-04-12 Thread Slavisa Garic
Hi Greg,

This is not a Windows server. Both server and client are the same
machine (done for testing purposes) and it is a Fedora RC2 machine.
This also happens on a Debian server and client, in which case they were
two separate machines.

There are thousands (2+) of these waiting around and each one of them
disappears after 50-ish seconds. I tried the psql command line and
monitored that connection in netstat. After I did a graceful exit
(\quit) the connection changed to TIME_WAIT and it sat there
for around 50 seconds. I thought I could do what you suggested with
having one connection and making each query a full BEGIN/QUERY/COMMIT
transaction, but I thought I could avoid that :).

This is a serious problem for me as there are multiple users using our
software on our server, and I want to avoid having connections
open for a long time. In the scenario mentioned below I haven't
explained the magnitude of the communication happening between Agents
and the DBServer. There could be 100 or more Agents per
experiment, per user, running on remote machines at the same time,
hence we need short transactions/pgsql connections. Agents need a
reliable connection because failure to connect could mean a loss of
computation results that were gathered over long periods of time.

Thanks for the help by the way :),
Regards,
Slavisa

On 12 Apr 2005 23:27:09 -0400, Greg Stark [EMAIL PROTECTED] wrote:
 
 Tom Lane [EMAIL PROTECTED] writes:
 
  Slavisa Garic [EMAIL PROTECTED] writes:
   ... Now, the
   interesting behaviour is this. I ran netstat on the machine where
   my software is running and I searched for tcp connections to my PGSQL
   server. What I found was hundreds of lines like this:
 
   tcp0  0 remus.dstc.monash:43001 remus.dstc.monash:39504 
   TIME_WAIT
   tcp0  0 remus.dstc.monash:43001 remus.dstc.monash:40720 
   TIME_WAIT
   tcp0  0 remus.dstc.monash:43001 remus.dstc.monash:39135 
   TIME_WAIT
 
  This is a network-level issue: the TCP stack on your machine knows the
  connection has been closed, but it hasn't seen an acknowledgement of
  that fact from the other machine, and so it's remembering the connection
  number so that it can definitively say that connection is closed if
  the other machine asks.  I'd guess that either you have a flaky network
  or there's something bogus about the TCP stack on the client machine.
  An occasional dropped FIN packet is no surprise, but hundreds of 'em
  are suspicious.
 
 No, what Tom's describing is a different pair of states called FIN_WAIT_1 and
 FIN_WAIT_2. TIME_WAIT isn't waiting for a packet, just a timeout. This is to
 prevent any delayed packets from earlier in the connection causing problems
 with a subsequent good connection. Otherwise you could get data from the old
 connection mixed in the data for later ones.
 
   Now could someone explain to me what this really means and what effect
   it might have on the machine (the same machine where I ran this
   query)? Would there eventually be a shortage of available ports if
   this kept growing? The reason I am asking this is because one of my
   modules was raising exception saying that TCP connection could not be
   establish to a server it needed to connect to.
 
 What it does indicate is that each query you're making is probably not just a
 separate transaction but a separate TCP connection. That's probably not
 necessary. If you have a single long-lived process you could just keep the TCP
 connection open and issue a COMMIT after each transaction. That's what I would
 recommend doing.
 
 Unless you have thousands of these TIME_WAIT connections they probably aren't
 actually directly the cause of your failure to establish connections. But yes
 it can happen.
 
 What's more likely happening here is that you're stressing the server by
 issuing so many connection attempts that you're triggering some bug, either in
 the TCP stack or Postgres that is causing some connection attempts to not be
 handled properly.
 
 I'm skeptical that there's a bug in Postgres since lots of people do in fact
 run web servers configured to open a new connection for every page. But this
 wouldn't happen to be a Windows server would it? Perhaps the networking code
 in that port doesn't do the right thing in this case?
 
 --
 greg
 

