Re: [PERFORM] Need help in setting optimal configuration for a huge

Steinar H. Gunderson Sun, 23 Oct 2005 10:03:16 -0700

On Sun, Oct 23, 2005 at 09:31:44AM -0700, Craig A. James wrote:
> COUNT() -- There is no good substitute.  What I do is create a new column, 
> "ROW_NUM" with an auto-incrementing sequence.  Every time I insert a row, 
> it gets a new value.  Unfortunately, this doesn't work if you ever delete a 
> row.  The alternative is a more complex pair of triggers, one for insert 
> and one for delete, that maintains the count in a separate one-row table.  
> It's a nuisance, but it's a lot faster than doing a full table scan for 
> every COUNT().


This will sometimes give you wrong results if your transactions ever roll
back, for instance. The correct way to do it is to maintain a table of
deltas, and insert a new positive record every time you insert rows, and a
negative one every time you delete them (using a trigger, of course). Then
you can query it for SUM(). (To keep the table small, run a SUM() in a cron
job or such to combine the deltas.)

There has, IIRC, been talks of supporting fast (index-only) scans on
read-only (ie. archived) partitions of tables, but it doesn't look like this
is coming in the immediate future. I guess others know more than me here :-)

> MIN() and MAX() -- These are surprisingly slow, because they seem to do a 
> full table scan EVEN ON AN INDEXED COLUMN!  I don't understand why, but 
> happily there is an effective substitute:

They are slow because PostgreSQL has generalized aggregates, ie. MAX() gets
fed exactly the same data as SUM() would. PostgreSQL 8.1 (soon-to-be
released) can rewrite a MAX() or MIN() to an appropriate LIMIT form, though,
which solves the problem.

/* Steinar */
-- 
Homepage: http://www.sesse.net/


---------------------------(end of broadcast)---------------------------
TIP 9: In versions below 8.0, the planner will ignore your desire to
       choose an index scan if your joining column's datatypes do not
       match

Re: [PERFORM] Need help in setting optimal configuration for a huge

Reply via email to