Re: [PERFORM] 500 requests per second

2007-05-15 Thread Tarhon-Onu Victor

On Mon, 14 May 2007, Richard Huxton wrote:

1. Is this one client making 500 requests, or 500 clients making one request 
per second?


Up to 250 clients will make up to 500 requests per second.


2. Do you expect the indexes at least to fit in RAM?


not entirely... or not all of them.



Re: [PERFORM] 500 requests per second

2007-05-15 Thread Richard Huxton

Tarhon-Onu Victor wrote:

On Mon, 14 May 2007, Richard Huxton wrote:

1. Is this one client making 500 requests, or 500 clients making one 
request per second?


Up to 250 clients will make up to 500 requests per second.


Well, PG is pretty good at handling multiple clients. But if I'm 
understanding you here, you're talking about potentially 250*500=125000 
updates per second. If each update writes 1KB to disk, that's 125MB/sec 
continuously written. Are these figures roughly correct?



2. Do you expect the indexes at least to fit in RAM?


not entirely... or not all of them.


Hmm - OK. So you're going to have index reads accessing disk as well. 
Exactly how big are you looking at here?

Will it be constantly growing?
Can you partition the large table(s) by date or similar?

Well, the two things I'd expect to slow you down are:
1. Contention updating index blocks
2. Disk I/O as you balance updates and selects.

Since you're constantly updating, you'll want to have WAL on a separate 
set of disks from the rest of your database, battery-backed cache on 
your raid controller etc. Check the mailing list archives for recent 
discussions about good/bad controllers. You'll also want to 
substantially increase checkpoint limits, of course.
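
For illustration only, the sort of postgresql.conf changes that implies on
8.2 might look something like this (the numbers are placeholders to tune
against your own hardware and write volume, not recommendations):

  # illustrative starting points only
  checkpoint_segments = 64      # default is 3; fewer, larger checkpoints
  checkpoint_timeout = 15min    # default is 5min
  checkpoint_warning = 300s     # warn if checkpoints come too close together
  wal_buffers = 1MB             # default is 64kB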


If you can cope with the fact that there's a delay, you might want to 
look at replication (e.g. slony) to have reads happening on a separate 
machine from writes. That may well not be possible in your case.


--
  Richard Huxton
  Archonet Ltd



[PERFORM] Many to many join seems slow?

2007-05-15 Thread Drew Wilson
I'm trying to debug a query that gets all the French translations for  
all US string values. Ultimately, my goal is to rank them all by edit  
distance, and only pick the top N.


However, I cannot get the basic many-to-many join to return all the
results in less than 3 seconds, which seems slow to me. (My competition
is an in-memory Perl hash that runs on client machines, providing
results in around 3 seconds after a 30-second startup time.)


The simplified schema is :
source - translation_pair - translation

The keys are all sequence-generated OIDs. I do wonder whether the
performance would be better if I used the string values as keys, to get
better data distribution. Would this help?
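
For reference, a rough sketch of the shape of those tables, inferred from
the query below (the column types and constraints here are guesses, not the
real DDL):

  CREATE TABLE language (
      language_id  serial PRIMARY KEY,
      name         text UNIQUE
  );

  CREATE TABLE source (
      source_id    serial PRIMARY KEY,
      value        text
  );

  CREATE TABLE translation (
      translation_id  serial PRIMARY KEY,
      language_id     integer REFERENCES language,
      value           text
  );

  CREATE TABLE translation_pair (
      source_id       integer REFERENCES source,
      translation_id  integer REFERENCES translation
  );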


There are 159283 rows in source
There are 1723935 rows in translation, of which 159686 are French

=# explain SELECT s.source_id, s.value AS sourceValue, t.value AS  
translationValue

  FROM
  source s,
  translation_pair tp,
  translation t,
  language l
  WHERE
  s.source_id = tp.source_id
  AND tp.translation_id = t.translation_id
  AND t.language_id = l.language_id
  AND l.name = 'French' ;

                                                         QUERY PLAN
------------------------------------------------------------------------------------------------------------------------------
 Merge Join  (cost=524224.49..732216.29 rows=92447 width=97)
   Merge Cond: (tp.source_id = s.source_id)
   ->  Sort  (cost=524224.49..524455.60 rows=92447 width=55)
         Sort Key: tp.source_id
         ->  Nested Loop  (cost=1794.69..516599.30 rows=92447 width=55)
               ->  Nested Loop  (cost=1794.69..27087.87 rows=86197 width=55)
                     ->  Index Scan using language_name_key on language l  (cost=0.00..8.27 rows=1 width=4)
                           Index Cond: ((name)::text = 'French'::text)
                     ->  Bitmap Heap Scan on translation t  (cost=1794.69..25882.43 rows=95774 width=59)
                           Recheck Cond: (t.language_id = l.language_id)
                           ->  Bitmap Index Scan on translation_language_l_key  (cost=0.00..1770.74 rows=95774 width=0)
                                 Index Cond: (t.language_id = l.language_id)
               ->  Index Scan using translation_pair_translation_id on translation_pair tp  (cost=0.00..5.67 rows=1 width=8)
                     Index Cond: (tp.translation_id = t.translation_id)
   ->  Index Scan using source_pkey on source s  (cost=0.00..206227.65 rows=159283 width=46)
(15 rows)

I'm running Postgres 8.2.3 on the latest Mac OS X 10.4.x. The CPU is a
3GHz dual-core Intel Xeon, with 5GB RAM. The drive is very fast, although
I don't know the configuration (I think it's an XRaid with 3 SAS/SCSI
70GB Seagate drives).


The regular performance configurable values are:
work_mem   32MB
shared_buffers 32MB
max_fsm_pages  204800
max_fsm_relations  1000


Thanks for any advice,

Drew



Re: [PERFORM] Many to many join seems slow?

2007-05-15 Thread Alvaro Herrera
Drew Wilson wrote:

 =# explain SELECT s.source_id, s.value AS sourceValue, t.value AS  
 translationValue
   FROM
   source s,
   translation_pair tp,
   translation t,
   language l
   WHERE
   s.source_id = tp.source_id
   AND tp.translation_id = t.translation_id
   AND t.language_id = l.language_id
   AND l.name = 'French' ;

Please provide an EXPLAIN ANALYZE of the query.

-- 
Alvaro Herrera                                http://www.CommandPrompt.com/
The PostgreSQL Company - Command Prompt, Inc.



Re: [PERFORM] Many to many join seems slow?

2007-05-15 Thread Heikki Linnakangas

Drew Wilson wrote:
Merge Join  (cost=524224.49..732216.29 rows=92447 width=97) (actual 
time=1088.871..1351.840 rows=170759 loops=1)

...
Total runtime: 1366.757 ms


It looks like the query actually runs in less than 3 seconds, but it takes
some time to fetch 170759 rows to the client.


--
  Heikki Linnakangas
  EnterpriseDB   http://www.enterprisedb.com



[PERFORM] bitmap index and IS NULL predicate

2007-05-15 Thread Jason Pinnix

Hello,

I'm running version 8.2 with the bitmap index patch posted on pgsql-hackers.
While selection queries with equality predicates (col = value) are able to
make use of the bitmap index, those with IS NULL predicates (col IS NULL)
are not. The online manuals seem to indicate that IS NULL predicates do not
use indexes by default, but that they can be forced to do so by setting
enable_seqscan to off. Even after setting enable_seqscan to off, the
optimizer still chooses a sequential scan over a bitmap index scan. Below
are various queries, with plans, showing use (and lack of use) of the
bitmap index on a table containing 1500 rows.

I also checked that if I create a btree index on col and set enable_seqscan
to off, the optimizer correctly chooses the btree index for IS NULL queries.
So my question is: is there something fundamentally different about the
bitmap index that precludes its use in IS NULL queries? Does the bitmap
index not store a bit vector for the NULL value (i.e. a bit vector that
contains a 1 for each row with a NULL value and a 0 for other rows)?

Thanks,
Jason

my_db=# explain analyze select * from some_values where col=98;
                                                        QUERY PLAN
---------------------------------------------------------------------------------------------------------------------------
 Bitmap Heap Scan on some_values  (cost=5.01..94.42 rows=97 width=8) (actual time=0.493..0.923 rows=100 loops=1)
   Recheck Cond: (col = 98)
   ->  Bitmap Index Scan on some_values_idx  (cost=0.00..4.98 rows=97 width=0) (actual time=0.475..0.475 rows=0 loops=1)
         Index Cond: (col = 98)
 Total runtime: 1.321 ms
(5 rows)

my_db=# explain analyze select * from some_values where col is null;
                                               QUERY PLAN
---------------------------------------------------------------------------------------------------------
 Seq Scan on some_values  (cost=0.00..184.00 rows=1 width=8) (actual time=0.102..1.966 rows=1 loops=1)
   Filter: (col IS NULL)
 Total runtime: 2.014 ms
(3 rows)

my_db=# set enable_seqscan to off;
SET
my_db=# explain analyze select * from some_values where col is null;
                                               QUERY PLAN
---------------------------------------------------------------------------------------------------------
 Seq Scan on some_values  (cost=1.00..10184.00 rows=1 width=8) (actual time=0.100..1.934 rows=1 loops=1)
   Filter: (col IS NULL)
 Total runtime: 1.976 ms
(3 rows)

   

Re: [PERFORM] bitmap index and IS NULL predicate

2007-05-15 Thread Alexander Staubo

On 5/15/07, Jason Pinnix [EMAIL PROTECTED] wrote:

Does the bitmap
index not store a bit vector for the NULL value (i.e. a bit vector that
contains a 1 for each row with a NULL value and 0 for other rows) ?


You should be able to do this with a conditional index:

 create index ... (col) where col is null;
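
For example, using the some_values/col names from the plans in the original
message (the index name here is made up):

  CREATE INDEX some_values_col_null_idx
      ON some_values (col) WHERE col IS NULL;

  -- the tiny partial index can then be scanned in full for this query
  EXPLAIN ANALYZE SELECT * FROM some_values WHERE col IS NULL;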

Alexander.



Re: [PERFORM] Many to many join seems slow?

2007-05-15 Thread Daniel Cristian Cruz

2007/5/15, Drew Wilson [EMAIL PROTECTED]:

=# explain SELECT s.source_id, s.value AS sourceValue, t.value AS
translationValue
   FROM
   source s,
   translation_pair tp,
   translation t,
   language l
   WHERE
   s.source_id = tp.source_id
   AND tp.translation_id = t.translation_id
   AND t.language_id = l.language_id
   AND l.name = 'French' ;

  QUERY PLAN
------------------------------------------------------------
 Merge Join  (cost=524224.49..732216.29 rows=92447 width=97)


This way you get all word matches for the French language. Shouldn't
it be all matches for a specific word (s.value = 'word' in WHERE)?
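
I.e., something along these lines ('word' is just a placeholder for the
string actually being looked up):

  SELECT s.source_id, s.value AS sourceValue, t.value AS translationValue
    FROM source s, translation_pair tp, translation t, language l
   WHERE s.source_id = tp.source_id
     AND tp.translation_id = t.translation_id
     AND t.language_id = l.language_id
     AND l.name = 'French'
     AND s.value = 'word';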

--
Daniel Cristian Cruz



[PERFORM] [doc patch] a slight VACUUM / VACUUM FULL doc improvement proposal

2007-05-15 Thread Guillaume Cottenceau
Dear all,

After some time spent better understanding how the VACUUM process
works, what problems we had in production and how to improve our
maintenance policy[1], I've come up with a little documentation
patch - basically, I think the documentation underestimates (or
sometimes misses) the benefit of VACUUM FULL for scans, and the
need for VACUUM FULL if routine VACUUM hasn't been done
properly since the database was put in production. Find the patch
against snapshot attached (text not filled, to ease reading). It
might help others in my situation in the future.

--- doc/src/sgml/ref/vacuum.sgml	2007-02-01 00:26:04.0 +0100
+++ /tmp/vacuum.sgml	2007-05-15 18:32:14.0 +0200
@@ -164,8 +164,8 @@
 <para>
  The <option>FULL</option> option is not recommended for routine use,
  but might be useful in special cases.  An example is when you have deleted
-most of the rows in a table and would like the table to physically shrink
-to occupy less disk space.  <command>VACUUM FULL</command> will usually
+or updated most of the rows in a table and would like the table to physically shrink
+to occupy less disk space and allow faster table scans. <command>VACUUM FULL</command> will usually
  shrink the table more than a plain <command>VACUUM</command> would.
  The <option>FULL</option> option does not shrink indexes; a periodic
  <command>REINDEX</> is still recommended.  In fact, it is often faster
--- doc/src/sgml/maintenance.sgml	2007-05-03 17:47:48.0 +0200
+++ /tmp/maintenance.sgml	2007-05-15 18:29:29.0 +0200
@@ -157,7 +157,8 @@
 command. This uses a more aggressive algorithm for reclaiming the
 space consumed by dead row versions. Any space that is freed by
 <command>VACUUM FULL</command> is immediately returned to the
-operating system. Unfortunately, this variant of the
+operating system, and the table data is physically compacted on
+the disk. Unfortunately, this variant of the
 <command>VACUUM</command> command acquires an exclusive lock on
 each table while <command>VACUUM FULL</command> is processing
 it. Therefore, frequently using <command>VACUUM FULL</command> can
@@ -168,12 +169,15 @@
 <para>
 The standard form of <command>VACUUM</> is best used with the goal
 of maintaining a fairly level steady-state usage of disk space. If
-you need to return disk space to the operating system you can use
+you need to return disk space to the operating system, you can use
 <command>VACUUM FULL</> &mdash; but what's the point of releasing disk
 space that will only have to be allocated again soon?  Moderately
 frequent standard <command>VACUUM</> runs are a better approach
 than infrequent <command>VACUUM FULL</> runs for maintaining
-heavily-updated tables.
+heavily-updated tables. However, if some heavily-updated tables
+have gone too long with infrequent <command>VACUUM</>, you can
+use <command>VACUUM FULL</> to get performance back (it is much
+slower to scan a table containing almost only dead rows).
 </para>

 <para>

Ref: 
[1] http://archives.postgresql.org/pgsql-performance/2006-08/msg00419.php
http://archives.postgresql.org/pgsql-performance/2007-05/msg00112.php

-- 
Guillaume Cottenceau, MNC Mobile News Channel SA, an Alcatel-Lucent Company
Av. de la Gare 10, 1003 Lausanne, Switzerland - direct +41 21 317 50 36



Re: [PERFORM] Many to many join seems slow?

2007-05-15 Thread Drew Wilson
Yes, I'll be filtering by string value. However, I just wanted to see  
how long it takes to scan all translations in a particular language.


Drew

On May 15, 2007, at 9:00 AM, Daniel Cristian Cruz wrote:


2007/5/15, Drew Wilson [EMAIL PROTECTED]:

=# explain SELECT s.source_id, s.value AS sourceValue, t.value AS
translationValue
   FROM
   source s,
   translation_pair tp,
   translation t,
   language l
   WHERE
   s.source_id = tp.source_id
   AND tp.translation_id = t.translation_id
   AND t.language_id = l.language_id
   AND l.name = 'French' ;

  QUERY PLAN
------------------------------------------------------------
 Merge Join  (cost=524224.49..732216.29 rows=92447 width=97)


This way you get all word matches for the French language. Shouldn't
it be all matches for a specific word (s.value = 'word' in WHERE)?

--
Daniel Cristian Cruz





Re: [PERFORM] Many to many join seems slow?

2007-05-15 Thread Drew Wilson
You're right. If I redirect output to /dev/null, the query completes  
in 1.4s.


# \o /dev/null
# SELECT s.source_id, s.value as sourceValue, t.value as  
translationValue...

...
Time: 1409.557 ms
#

That'll do for now.

Thanks,

Drew

On May 15, 2007, at 7:17 AM, Heikki Linnakangas wrote:


Drew Wilson wrote:
Merge Join  (cost=524224.49..732216.29 rows=92447 width=97)  
(actual time=1088.871..1351.840 rows=170759 loops=1)

...
Total runtime: 1366.757 ms


It looks like the query actually runs in less than 3 seconds, but it
takes some time to fetch 170759 rows to the client.


--
  Heikki Linnakangas
  EnterpriseDB   http://www.enterprisedb.com





Re: [PERFORM] [doc patch] a slight VACUUM / VACUUM FULL doc improvement proposal

2007-05-15 Thread Michael Stone

On Tue, May 15, 2007 at 06:43:50PM +0200, Guillaume Cottenceau wrote:

patch - basically, I think the documentation underestimates (or
sometimes misses) the benefit of VACUUM FULL for scans, and the
need for VACUUM FULL if routine VACUUM hasn't been done
properly since the database was put in production.


It's also possible to overestimate the benefit of VACUUM FULL, leading
people to run VACUUM FULL almost constantly and then complain about the
performance hit from the associated overhead. I think there have been more
people on this list whose performance problems were caused by unnecessary
full vacuums than people whose performance problems were caused by
insufficient full vacuums.


Mike Stone



[PERFORM] Disk Fills Up and fsck Compresses it

2007-05-15 Thread Y Sidhu

Has anyone seen PG filling up a 66 GB partition from, say, 40-ish percent to
60-ish percent in a matter of minutes? When I run a 'fsck' the disk usage
comes down to 40-ish percent. That's about 10+ GB of variance.

This is a FreeBSD 6.2 RC2 box, 4GB memory, 3.2 GHz Xeon with '4' of the '8'
CPUs in use - dual CPU, dual core, with HTT turned off in the sense that the
other 4 CPUs have been masked out. The drive is a Western Digital 70 GB SATA.

--
Yudhvir Singh Sidhu
408 375 3134 cell


[PERFORM] How to Run a pg_stats Query

2007-05-15 Thread Y Sidhu

I turned on all the stats in the conf file (below) and restarted the server.
The question is: what's the name of the database these stats live in, and
how do I run a simple select query against them?

stats_start_collector = true
stats_command_string = true
stats_block_level = true
stats_row_level = true
stats_reset_on_server_start = true

--
Yudhvir Singh Sidhu
408 375 3134 cell


Re: [PERFORM] How to Run a pg_stats Query

2007-05-15 Thread Alvaro Herrera
Y Sidhu wrote:
 I turned on all the stats in the conf file (below) and restarted the server.
 Question is, what's the name of the database and how do I run a simple
 select query?
 
 stats_start_collector = true
 stats_command_string = true
 stats_block_level = true
 stats_row_level = true
 stats_reset_on_server_start = true

Stats are present on all databases.  As for the name of the tables, try
pg_stat_user_tables and pg_stat_activity for starters.  There are a lot
more; check the documentation or a \d pg_stat* in psql.
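
For example (column names as of 8.2):

  -- per-table activity counters
  SELECT relname, seq_scan, idx_scan, n_tup_ins, n_tup_upd, n_tup_del
    FROM pg_stat_user_tables
   ORDER BY relname;

  -- what each backend is doing right now (needs stats_command_string = true)
  SELECT procpid, usename, current_query FROM pg_stat_activity;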

-- 
Alvaro Herrera                                http://www.CommandPrompt.com/
PostgreSQL Replication, Consulting, Custom Development, 24x7 support



Re: [PERFORM] 500 requests per second

2007-05-15 Thread Jim C. Nasby
On Tue, May 15, 2007 at 11:47:29AM +0100, Richard Huxton wrote:
 Tarhon-Onu Victor wrote:
 On Mon, 14 May 2007, Richard Huxton wrote:
 
 1. Is this one client making 500 requests, or 500 clients making one 
 request per second?
 
 Up to 250 clients will make up to 500 requests per second.
 
 Well, PG is pretty good at handling multiple clients. But if I'm 
 understanding you here, you're talking about potentially 250*500=125000 
 updates per second. If each update writes 1KB to disk, that's 125MB/sec 
 continuously written. Are these figures roughly correct?
 
I'm guessing it's 500TPS overall, not per connection. It'd be rather
challenging just to do 125,000 network round trips per second.

 2. Do you expect the indexes at least to fit in RAM?
 
 not entirely... or not all of them.
 
 Hmm - OK. So you're going to have index reads accessing disk as well. 
 Exactly how big are you looking at here?
 Will it be constantly growing?
 Can you partition the large table(s) by date or similar?
 
 Well, the two things I'd expect to slow you down are:
 1. Contention updating index blocks
 2. Disk I/O as you balance updates and selects.
 
 Since you're constantly updating, you'll want to have WAL on a separate 
 set of disks from the rest of your database, battery-backed cache on 
 your raid controller etc. Check the mailing list archives for recent 
 discussions about good/bad controllers. You'll also want to 
 substantially increase checkpoint limits, of course.
 
 If you can cope with the fact that there's a delay, you might want to 
 look at replication (e.g. slony) to have reads happening on a separate 
 machine from writes. That may well not be possible in your case.

Just as a data point, I've worked with some folks who are doing ~250 TPS
on a disk array with around 20-30 drives. IIRC a good amount of their
working set did fit into memory, but not all of it.

Your biggest constraint is really going to be I/O operations per second.
If 90% of your data is in cache then you'll need a minimum of 50 IOPS to
cover the misses at 500 TPS (realistically you'd probably have to double
that). If only 50% of your working set fits in cache you'd be looking at
250 IOPS, which is a pretty serious rate.

I very strongly encourage you to do benchmarking to get a feel for how
your system performs on a given set of hardware so that you have some
idea of where you need to get to. You should also be looking hard at
your application and system architecture for ways to cut down on your
throughput. There may be some things you can do that would reduce the
amount of database hardware you need to buy.
-- 
Jim Nasby  [EMAIL PROTECTED]
EnterpriseDB  http://enterprisedb.com  512.569.9461 (cell)



Re: [PERFORM] pg_stats how-to?

2007-05-15 Thread Jim C. Nasby
On Mon, May 14, 2007 at 08:20:49PM -0400, Tom Lane wrote:
 Y Sidhu [EMAIL PROTECTED] writes:
  it may be table fragmentation. What kind of tables? We have 2 of them which
  experience lots of adds and deletes only. No updates. So a typical day
  experiences record adds a few dozen times on the order of 2.5 million. And
  deletes once daily. Each of these tables has about 3 btree indexes.
 
 With an arrangement like that you should vacuum once daily, shortly
 after the deletes --- there's really no point in doing it on any other
 schedule.  Note "shortly" not "immediately" --- you want to be sure that
 any transactions old enough to see the deleted rows have ended.

Also, think about ways you might avoid the deletes altogether. Could you
do a truncate instead? Could you use partitioning? If you are using
deletes then look at CLUSTERing the table some time after the deletes
(but be aware that prior to 8.3 CLUSTER doesn't fully obey MVCC).

To answer your original question, a way to take a look at how bloated
your tables are would be to ANALYZE, divide reltuples by relpages from
pg_class (gives how many rows per page you have) and compare that to 8k
/ average row size. The average row size for table rows would be the sum
of avg_width from pg_stats for the table + 24 bytes overhead. For
indexes, it would be the sum of avg_width for all fields in the index
plus some overhead (8 bytes, I think).
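
A rough sketch of that calculation for a single table ('mytable' is a
placeholder; run ANALYZE on it first, and this assumes the table name is
unique across schemas):

  SELECT c.relname,
         c.reltuples / NULLIF(c.relpages, 0) AS actual_rows_per_page,
         8192.0 / ((SELECT sum(avg_width) + 24
                      FROM pg_stats
                     WHERE tablename = c.relname)) AS expected_rows_per_page
    FROM pg_class c
   WHERE c.relname = 'mytable';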

An even simpler alternative would be to install contrib/pgstattuple and
use the pgstattuple function, though IIRC that does read the entire
relation from disk.
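
E.g., after installing the contrib module ('mytable' again a placeholder):

  SELECT * FROM pgstattuple('mytable');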
-- 
Jim Nasby  [EMAIL PROTECTED]
EnterpriseDB  http://enterprisedb.com  512.569.9461 (cell)



Re: [PERFORM] Disk Fills Up and fsck Compresses it

2007-05-15 Thread Jim C. Nasby
I'm guessing you're seeing the effect of softupdates. With those enabled
it can take some time before the space freed by a delete will actually
show up as available.

On Tue, May 15, 2007 at 01:18:42PM -0700, Y Sidhu wrote:
 Anyone seen PG filling up a 66 GB partition from say 40-ish percentage to
 60-ish percentage in a matter of minutes. When I run a 'fsck' the disk usage
 comes down to 40-ish percentage. That's about 10+ GB's variance.
 
 This is a FreeBSD 6.2 RC2, 4GB memory, Xeon 3.2 GHz '4' of the '8' CPUs in
 use - dual cpu, dual core with HTT turned off in the sense that the other 4
 cpu's have been masked out. The drive is a Western Digital 70 GB SATA.
 
 -- 
 Yudhvir Singh Sidhu
 408 375 3134 cell

-- 
Jim Nasby  [EMAIL PROTECTED]
EnterpriseDB  http://enterprisedb.com  512.569.9461 (cell)



Re: [PERFORM] [PATCHES] Automatic adjustment of bgwriter_lru_maxpages

2007-05-15 Thread Greg Smith

On Tue, 15 May 2007, Jim C. Nasby wrote:


Moving to -performance.


No, really, moved to performance now.


On Mon, May 14, 2007 at 09:55:16PM -0700, daveg wrote:

What is the current thinking on bg_writer settings for systems such as
4 core Opteron with 16GB or 32GB of memory and heavy batch workloads?


First off, the primary purpose of both background writers is to keep the
individual client backends from stalling to wait for disk I/O.  If you're
running a batch workload, and there isn't a specific person waiting for a
response, the background writer isn't as critical to worry about.


As Jim already said, tuning the background writer well really requires a 
look at the usage profile of your buffer pool and some thinking about your 
I/O capacity just as much as it does your CPU/memory situation.


For the first part, I submitted a patch that updates the
contrib/pg_buffercache module to show the usage count information of your
buffer cache.  The LRU writer only writes things with a usage_count of 0,
so taking some snapshots of that data regularly will give you an idea
whether you can usefully use it or whether you'd be better off making the
all scan more aggressive.  It's a simple patch that only affects a contrib
module you can add and remove easily; I would characterize it as pretty
safe to apply even to a production system as long as you're doing the
initial tests off-hours.  The patch is at


http://archives.postgresql.org/pgsql-patches/2007-03/msg00555.php

And the usual summary query I run after installing it in a database is:

select usagecount,count(*),isdirty from pg_buffercache group by 
isdirty,usagecount order by isdirty,usagecount;


As for the I/O side of things, I'd suggest you compute a worst-case
scenario for how many disk writes will happen if every buffer the
background writer comes across is dirty, and base your settings on what
you're comfortable with there.  Say you kept the default interval of 200ms
but increased the maximum pages value to 1000; each writer could
theoretically push 1000 x 8KB x 5/second = 40MB/s worth of data to disk.
Since these are database writes that have to be interleaved with reads,
the sustainable rate here is not as high as you might think.  You might
get a useful performance boost just pushing the max numbers from the
defaults up into the couple-of-hundred range--with the amount of RAM you
probably have dedicated to the buffer cache, even the default small
percentages will cover a lot of ground and might need to be increased.  I
like 250 as a round number because it makes for at most an even 10MB a
second flow out per writer.  I wouldn't go too high on the max writes per
pass unless you're in a position to run some good tests to confirm you're
not actually making things worse.
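
As a purely illustrative sketch of what that might look like in
postgresql.conf on 8.2 (verify against your own tests before relying on it):

  bgwriter_delay = 200ms           # keep the default interval
  bgwriter_lru_maxpages = 250      # at most ~10MB/s from the LRU scan
  bgwriter_all_maxpages = 250      # likewise for the all-buffers scan
  bgwriter_lru_percent = 1.0       # defaults; raise only after checking
  bgwriter_all_percent = 0.333     #   pg_buffercache snapshots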


--
* Greg Smith [EMAIL PROTECTED] http://www.gregsmith.com Baltimore, MD



[PERFORM] New performance documentation released

2007-05-15 Thread Greg Smith
I've been taking notes on what people ask about on this list, mixed that 
up with work I've been doing lately, and wrote some documentation readers 
of this mailing list may find useful.  There are a series of articles now 
at http://www.westnet.com/~gsmith/content/postgresql/ about performance 
testing and tuning.


The 5-minute Introduction to PostgreSQL Performance and the Disk
performance testing articles are aimed at being FAQ-style pieces that
people asking questions here can be pointed toward.


None of the pieces in the Advanced Topics section are finished to my
standards yet, but they may be useful anyway, so I've posted what I've
got so far.


--
* Greg Smith [EMAIL PROTECTED] http://www.gregsmith.com Baltimore, MD
