We have a network application in which many clients will be executing a mix of 
select/insert/update/deletes on a central postgres 7.4.5 database, running on 
Solaris 9 running on dual 2.3 ghz Xeons, with 2 gig of RAM and a RAID 10 disk. 
The test database is about 400 meg in size.

We have tuned the postgresql.conf parameters to the point where we are 
confident we have enough memory for shared buffers and for sorting. We are 
still tuning SQL statements, but we're pretty sure the big wins have been 
achieved.

We are maxing out on the backend with 30 postmaster processes, each taking up 
about 2.5-3% of the CPU. We have tested mounting the whole database in /tmp, 
hence in memory, and it has made no difference in performance, so it seems we 
are purely CPU bound at this point.

About 70% of our time is spent in selects, and another 25% spent in 
inserts/updates of a single table (about 10% out of the selects % is against 
this table).

Now, our application client is not doing nearly enough of it's own caching, so 
a lot the work the database is doing currently is redundant, and we are working 
on the client, but in the meantime we have to squeeze as much as we can from 
the backend.

After that long intro, I have a couple of questions:

1) Given that the data is all cached, what can we do to make sure that postgres 
is generating
the most efficient plans in this case? We have bumped up effective_cache_size, 
but it had no
effect. Also, what would the most efficient plan for in-memory data look like? 
I mean, does one
still look for the normal stuff - index usage, etc., or are seqscans what we 
should be looking for?
I've seen some stuff about updating statistics targets for specific tables, but 
I'm not sure I 
understand it, and don't know if something like that applies in this case. I 
can supply some specific plans, if that would help (this email is already too 
long...).

2) We have SQL test environment where we just run the SQL statements executed 
by the clients (culled from the log file) in psql. In our test environment, the 
same set of SQL statements runs 4X faster that the times achieved in the test 
that generated our source log file. Obviously there was a bigger load on the 
machine in the full test, but I'm wondering if there are any particular 
diagnostics that I should be looking at to ferret out contention. I haven't 
seen anything that looked suspicious in pg_locks, but it's difficult to 
interpret that data when the database is under load (at least for someone of my 
limited experience).

I suspect the ultimate answer to our problem will be:

   1) aggressive client-side caching
   2) SQL tuning
   3) more backend hardware

But I would grateful to hear any tips/anecdotes/experiences that others might 
have from tuning similar applications.

Thanks!

- DAP
----------------------------------------------------------------------------------
David Parker    Tazz Networks    (401) 709-5130
 

---------------------------(end of broadcast)---------------------------
TIP 4: Don't 'kill -9' the postmaster

Reply via email to