Re: [PERFORM] database size growing continuously

2009-10-29 Thread Chris Ernst
Hi Peter,

Sounds like you're experiencing index bloat, which vacuums do nothing to
help.  You can do one of two things to remedy this:

1) The fastest and simplest (but most disruptive) way is to use REINDEX.
 But this will exclusively lock the table while rebuilding the indexes:

  REINDEX TABLE phaseangle;

2) The slower but less disruptive way is to do a concurrent build of
each index and then drop the old ones. For example, to rebuild the i
index:

  CREATE INDEX CONCURRENTLY i_new ON phaseangle (indexed columns);
  DROP INDEX i;
  ALTER INDEX i_new RENAME TO i;
  ANALYZE phaseangle (indexed columns);

Do this regularly to keep the index sizes in check.
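
To confirm the rebuilds are actually helping, you can watch the on-disk
sizes of the table and its indexes over time. A quick check (using your
table name and the index names i and t from your vacuum output) might
look like:

  SELECT pg_size_pretty(pg_relation_size('phaseangle')) AS table_size,
         pg_size_pretty(pg_relation_size('i')) AS index_i_size,
         pg_size_pretty(pg_relation_size('t')) AS index_t_size;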

- Chris

Peter Meszaros wrote:
 Hi All,
 
 I use PostgreSQL 8.3.7 as a huge queue. There is a very simple table
 with six columns and two indices, into which about 6 million records
 are written every day, committed continuously every 10 seconds from 8
 clients. The table holds approximately 120 million records, because a
 daily cron job deletes those older than 20 days. Autovacuum is on, and
 every setting is at its factory default except some unrelated ones
 (listen address, authorization). But my database keeps growing,
 typically by ~600 MB/day, though sometimes much more slowly (e.g.
 10 MB, or even 0!!!).
 
 I've also tried a test on another server running the same PostgreSQL,
 where 300 million records were loaded into a freshly created database
 and 25 million were deleted with a single DELETE command. The 'vacuum
 verbose phaseangle;' command seems to run forever; it has been going
 for hours:
 
 phasor=# vacuum VERBOSE phaseangle;
 INFO:  vacuuming public.phaseangle
 INFO:  scanned index i to remove 2796006 row versions
 DETAIL:  CPU 9.49s/120.30u sec elapsed 224.20 sec.
 INFO:  scanned index t to remove 2796006 row versions
 DETAIL:  CPU 13.57s/105.70u sec elapsed 192.71 sec.
 INFO:  phaseangle: removed 2796006 row versions in 24748 pages
 DETAIL:  CPU 0.65s/0.30u sec elapsed 39.97 sec.
 INFO:  scanned index i to remove 2795924 row versions
 DETAIL:  CPU 9.58s/121.63u sec elapsed 239.06 sec.
 INFO:  scanned index t to remove 2795924 row versions
 DETAIL:  CPU 13.10s/103.59u sec elapsed 190.84 sec.
 INFO:  phaseangle: removed 2795924 row versions in 24743 pages
 DETAIL:  CPU 0.68s/0.28u sec elapsed 40.21 sec.
 INFO:  scanned index i to remove 2796014 row versions
 DETAIL:  CPU 9.65s/117.28u sec elapsed 231.92 sec.
 INFO:  scanned index t to remove 2796014 row versions
 DETAIL:  CPU 13.48s/103.59u sec elapsed 194.49 sec.
 INFO:  phaseangle: removed 2796014 row versions in 24774 pages
 DETAIL:  CPU 0.69s/0.28u sec elapsed 40.26 sec.
 INFO:  scanned index i to remove 2795935 row versions
 DETAIL:  CPU 9.55s/119.02u sec elapsed 226.85 sec.
 INFO:  scanned index t to remove 2795935 row versions
 DETAIL:  CPU 13.09s/102.84u sec elapsed 194.74 sec.
 INFO:  phaseangle: removed 2795935 row versions in 25097 pages
 DETAIL:  CPU 0.67s/0.28u sec elapsed 41.21 sec.
 
 still running...
 
 Are these the very same problem?
 Should I delete more frequently, in smaller chunks? It seems to hit
 some limit...
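 
 For example (just a sketch; 'ts' stands in for whatever the timestamp
 column is actually called), running this hourly from cron instead of
 once a day would keep each chunk to roughly an hour's worth of rows:
 
   -- run hourly: each pass removes ~1 hour's worth of expired rows
   DELETE FROM phaseangle WHERE ts < now() - interval '20 days';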
 
 Thanks 
 
 Peter
 




Re: [PERFORM] More speed counting rows

2009-07-27 Thread Chris Ernst

Developer wrote:

Hello,

I am trying to optimize counting rows when I am using filters
(selecting by one or more row parameters).

In this case I think that PostgreSQL really counts all the rows,
resulting in unacceptable HTTP server response times of 4 seconds.
Triggers plus a stored count do not seem very acceptable in this case,
because I would need to store 1.5 million counting possibilities.

My question is:
Is there any method for an indirect count, like ordered indexes +
quadratic count?
Any module?
Any suggestion?



I had a similar problem where HTTP requests triggered a count(*) over a 
table that was growing rapidly.  The bigger the table got, the longer 
the count took.  In my case, however, the counts only have to be a 
reasonable estimate of the current state, so I solved this problem with 
a count_sums table that gets updated every 30 minutes using a simple 
perl script in a cron job.  The HTTP requests now trigger a very fast 
select from a tiny, 9-row, 2-column table.
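
A rough sketch of the idea (the table and column names here are
invented for illustration; mine are different):

  -- tiny summary table: one pre-computed count per filter value
  CREATE TABLE count_sums (
      filter_value text PRIMARY KEY,
      row_count    bigint NOT NULL
  );

  -- refreshed periodically (e.g. every 30 minutes from cron):
  BEGIN;
  DELETE FROM count_sums;
  INSERT INTO count_sums (filter_value, row_count)
      SELECT some_column, count(*) FROM big_table GROUP BY some_column;
  COMMIT;

  -- the HTTP request then does a cheap lookup instead of a count(*):
  SELECT row_count FROM count_sums WHERE filter_value = 'something';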


How up to date do the counts need to be?  If the count takes 4 
seconds, can you run it every minute and store the counts in a table for 
retrieval by the HTTP requests?  Or does it absolutely have to be the 
exact count at the moment of the request?


If it needs to be more real-time, you could expand on this by adding 
post insert/delete triggers that automatically update the counts table 
to keep it current.  In my case it just wasn't necessary.
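
If you do go that route, a pair of AFTER INSERT/DELETE triggers along
these lines could keep the summary table current (again, all names are
just illustrative):

  CREATE OR REPLACE FUNCTION maintain_count_sums() RETURNS trigger AS $$
  BEGIN
      IF TG_OP = 'INSERT' THEN
          UPDATE count_sums SET row_count = row_count + 1
           WHERE filter_value = NEW.some_column;
      ELSIF TG_OP = 'DELETE' THEN
          UPDATE count_sums SET row_count = row_count - 1
           WHERE filter_value = OLD.some_column;
      END IF;
      RETURN NULL;  -- return value is ignored for AFTER triggers
  END;
  $$ LANGUAGE plpgsql;

  CREATE TRIGGER count_sums_maint
      AFTER INSERT OR DELETE ON big_table
      FOR EACH ROW EXECUTE PROCEDURE maintain_count_sums();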


- Chris
