Re: [PERFORM] insert/update tps slow with indices on table > 1M rows

2008-06-04 Thread Matthew Wakeling

On Tue, 3 Jun 2008, andrew klassen wrote:
> Basically, I have a somewhat constant rate of inserts/updates that go
> into a work queue and then get passed to postgres.
>
> The cpu load is not that high, i.e. plenty of idle cpu. I am running an
> older version of freebsd and the iostat output is not very detailed.


If you're running a work queue architecture, that probably means you 
only have one thread doing all the updates/inserts? It might be worth 
going multi-threaded, and issuing inserts and updates through more than 
one connection. Postgres is designed pretty well to scale performance by 
the number of simultaneous connections.


Matthew

--
Contrary to popular belief, Unix is user friendly. It just happens to be
very selective about who its friends are. -- Kyle Hearn


[PERFORM] RAM / Disk ratio, any rule?

2008-06-04 Thread Mathieu Gilardet

Hello,

I am wondering whether any rule of thumb exists for determining the
amount of RAM according to the disk size (assuming all or nearly all of
the available space will eventually be used) when designing a server.

Of course, the fsm settings imply that as the database grows, more
memory is used for tracking free space. The same applies to RAM cache
effectiveness: the bigger the database, the more RAM is needed.
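For reference, the free space map on 8.x is sized explicitly in
postgresql.conf; a minimal sketch, with illustrative values only (the
right numbers come from your own VACUUM VERBOSE output):

    # illustrative values, not a recommendation; each max_fsm_pages
    # slot costs about six bytes of shared memory
    max_fsm_pages = 2000000
    max_fsm_relations = 10000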


To be more specific, we have a heavily loaded server with SCSI disks
(RAID 0 on a SAS controller), making a total of 500GB. Currently there
are 16GB of RAM, representing about 2.5% of the db size.

Any experiences to share?

Thank you in advance.


Mathieu






[PERFORM] backend pid changing

2008-06-04 Thread Lewis Kapell
I have a Windows application which connects to a Postgres (8.3) database 
residing on our company server.  Most of the application's users work 
from their homes, so the application was developed with a lot of 
security checks.


When a client connects to the database, a random hash is generated and 
sent to the client; this hash is also saved in a Postgres table along 
with the user id and the return value of pg_backend_pid().  When the 
client submits queries, it presents its hash value and the server 
cross-checks this, and the current value of pg_backend_pid(), against 
the values that were stored previously.
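A minimal sketch of that scheme in SQL (the table and column names here
are hypothetical, not the poster's actual schema; pg_backend_pid() is
the only name taken from the message):

    -- at connect time: store the hash alongside the backend pid
    INSERT INTO client_sessions (user_id, session_hash, backend_pid)
    VALUES ($1, $2, pg_backend_pid());

    -- on each request: cross-check the hash and the current backend pid
    SELECT 1
      FROM client_sessions
     WHERE user_id      = $1
       AND session_hash = $2
       AND backend_pid  = pg_backend_pid();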


If there is a mismatch, the client is instructed to obtain a new hash
and begin again.  The information about the mismatch is also recorded
for future inspection.  By examining the logs, I have observed that the
backend pid for a particular client sometimes changes during a session.
This seems to happen about a dozen times a day, total.  Usually this is
not a problem, as the client will get a new hash and keep going.


Sometimes, however, this seems to happen in the middle of an operation.
This happens when the client has sent a large chunk of data that is to
be stored in the database.  The client sends its authorization
information immediately before sending the data, and also with the data
chunk.  On rare occasions, the backend pid somehow seems to change
during the time it takes for the data to be sent.  This causes errors
and loss of time for the user.


I'm sure there are more details that would be needed to give a complete 
picture of what is going on, yet this message is pretty long already.  I 
am going to stop here and ask whether anyone can make sense of this. 
That is, make sense of what I have written, and also of why the backend 
pid would change during an operation as I have described.  Thanks to any 
who can offer information on this.


Lewis



Re: [PERFORM] insert/update tps slow with indices on table > 1M rows

2008-06-04 Thread andrew klassen
I am using multiple threads, but only one worker thread for inserts/updates
to this table. I don't mind trying multiple threads for this table, but my
guess is that it would not help, because the overall tps rate is decreasing
so dramatically. Since the cpu time consumed by the corresponding postgres
server process for my thread is small, it does not seem to be the
bottleneck; there has to be a bottleneck somewhere else. Do you agree, or is
there some flaw in my reasoning?

- Original Message 
From: Matthew Wakeling [EMAIL PROTECTED]
To: andrew klassen [EMAIL PROTECTED]
Cc: pgsql-performance@postgresql.org
Sent: Wednesday, June 4, 2008 5:31:22 AM
Subject: Re: [PERFORM] insert/update tps slow with indices on table > 1M rows

On Tue, 3 Jun 2008, andrew klassen wrote:
> Basically, I have a somewhat constant rate of inserts/updates that go
> into a work queue and then get passed to postgres.
>
> The cpu load is not that high, i.e. plenty of idle cpu. I am running an
> older version of freebsd and the iostat output is not very detailed.

If you're running a work queue architecture, that probably means you 
only have one thread doing all the updates/inserts? It might be worth 
going multi-threaded, and issuing inserts and updates through more than 
one connection. Postgres is designed pretty well to scale performance by 
the number of simultaneous connections.

Matthew

-- 
Contrary to popular belief, Unix is user friendly. It just happens to be
very selective about who its friends are.                -- Kyle Hearn

Re: [PERFORM] backend pid changing

2008-06-04 Thread Matthew Wakeling

On Wed, 4 Jun 2008, Lewis Kapell wrote:
> The client sends its authorization information immediately before
> sending the data, and also with the data chunk.


Well, I have no idea why the backend pid is changing, but here it looks 
like you have a classic concurrency problem caused by checking a variable 
twice. It seems you have client-side error recovery on the initial check, 
but not on the second check. The solution is to eliminate the first check, 
and implement proper error recovery on the second check, so that the 
client can just get a new hash and try again.


Matthew

--
It's one of those irregular verbs - I have an independent mind, You are
an eccentric, He is round the twist.
 -- Bernard Woolley, Yes Prime Minister



Re: [PERFORM] backend pid changing

2008-06-04 Thread Tom Lane
Lewis Kapell [EMAIL PROTECTED] writes:
> ... By examining the logs, I have observed that the backend pid for a
> particular client sometimes changes during a session.

That is just about impossible to believe, unless perhaps you have a
connection pooler in the loop somewhere?

regards, tom lane



[PERFORM] PgPool parallel query performance rules of thumb

2008-06-04 Thread John Beaver

Hi,
   I'm trying to make use of a cluster of 40 nodes that my group has, 
and I'm curious if anyone has experience with PgPool's parallel query 
mode. Under what circumstances could I expect the most benefit from 
query parallelization as implemented by PgPool?




Re: [PERFORM] insert/update tps slow with indices on table > 1M rows

2008-06-04 Thread andrew klassen

I agree that the discs are probably very busy. I do have 2 disks but they are
for redundancy. Would it help to put the data, indexes and xlog on separate 
disk partitions? 
I'll try adding more threads to update the table as you suggest.
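
If the disks were ever split for data rather than redundancy, tablespaces
are the SQL-level tool for that; a sketch with made-up paths and names
(pg_xlog, by contrast, is relocated with a filesystem symlink, not SQL):

    -- illustrative only: move indexes onto a second disk
    CREATE TABLESPACE fast_indexes LOCATION '/disk2/pgdata/indexes';
    ALTER INDEX my_table_pkey SET TABLESPACE fast_indexes;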

- Original Message 
From: Matthew Wakeling [EMAIL PROTECTED]
To: pgsql-performance@postgresql.org
Sent: Wednesday, June 4, 2008 10:10:38 AM
Subject: Re: [PERFORM] insert/update tps slow with indices on table > 1M rows

On Wed, 4 Jun 2008, andrew klassen wrote:
> I am using multiple threads, but only one worker thread for
> inserts/updates to this table. I don't mind trying multiple threads
> for this table, but my guess is that it would not help, because the
> overall tps rate is decreasing so dramatically. Since the cpu time
> consumed by the corresponding postgres server process for my thread is
> small, it does not seem to be the bottleneck; there has to be a
> bottleneck somewhere else. Do you agree, or is there some flaw in my
> reasoning?

There is indeed a flaw in your reasoning - there may be very little CPU 
time consumed, but that just indicates that the discs are busy. Getting 
Postgres to do multiple things at once will cause a more efficient use of 
the disc subsystem, resulting in greater overall throughput. This is 
especially the case if you have multiple discs in your box.

Matthew

-- 
Contrary to popular belief, Unix is user friendly. It just happens to be
very selective about who its friends are.                -- Kyle Hearn

Re: [PERFORM] insert/update tps slow with indices on table > 1M rows

2008-06-04 Thread James Mansion

Matthew Wakeling wrote:
> If you're running a work queue architecture, that probably means you
> only have one thread doing all the updates/inserts? It might be worth
> going multi-threaded, and issuing inserts and updates through more
> than one connection. Postgres is designed pretty well to scale
> performance by the number of simultaneous connections.

That would explain a disappointing upper limit on insert rate, but not
any sort of cliff for the rate.  Nor, really, would it explain any
material slowdown, if the single thread implies that we're stuck on
round-trip latency as a material limiting factor.


James




Re: [PERFORM] insert/update tps slow with indices on table > 1M rows

2008-06-04 Thread James Mansion

andrew klassen wrote:
> I'll try adding more threads to update the table as you suggest.

You could try materially increasing the update batch size too.  As an
exercise you could see what the performance of COPY is by backing out
the data and reloading it from a suitable file.
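
A rough sketch of that experiment (table and file names are made up;
note that server-side COPY requires superuser, while psql's \copy does
not):

    -- dump the table to a flat file, then time the bulk reload
    COPY my_table TO '/tmp/my_table.dat';
    TRUNCATE my_table;
    COPY my_table FROM '/tmp/my_table.dat';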




Re: [PERFORM] insert/update tps slow with indices on table > 1M rows

2008-06-04 Thread andrew klassen
I am using the c-library interface and for these particular transactions
I preload PREPARE statements. Then as I get requests, I issue a BEGIN, 
followed by at most 300 EXECUTES and then a COMMIT. That is the
general scenario. What value beyond 300 should I try? 
Also, how might COPY (which involves file I/O) improve the 
above scenario? 
Thanks.
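
For reference, the batching pattern described above looks roughly like
this (statement and table names are hypothetical):

    -- one transaction batching up to ~300 prepared EXECUTEs
    PREPARE add_row (integer, text) AS
        INSERT INTO work_items (id, payload) VALUES ($1, $2);

    BEGIN;
    EXECUTE add_row(1, 'first');
    EXECUTE add_row(2, 'second');
    -- ... up to ~300 EXECUTEs ...
    COMMIT;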


- Original Message 
From: James Mansion [EMAIL PROTECTED]
To: andrew klassen [EMAIL PROTECTED]
Cc: pgsql-performance@postgresql.org
Sent: Wednesday, June 4, 2008 3:20:26 PM
Subject: Re: [PERFORM] insert/update tps slow with indices on table > 1M rows

andrew klassen wrote:
> I'll try adding more threads to update the table as you suggest.

You could try materially increasing the update batch size too.  As an
exercise you could see what the performance of COPY is by backing out
the data and reloading it from a suitable file.