Re: [PERFORM] When to do a vacuum for highly active table

2005-09-06 Thread Chris Browne
[EMAIL PROTECTED] (Rigmor Ukuhe) writes:

 -Original Message-
 From: [EMAIL PROTECTED] [mailto:pgsql-performance-
 [EMAIL PROTECTED] On Behalf Of Markus Benne
 Sent: Wednesday, August 31, 2005 12:14 AM
 To: pgsql-performance@postgresql.org
 Subject: [PERFORM] When to do a vacuum for highly active table
 
 We have a highly active table that has virtually all
 entries updated every 5 minutes.  Typical size of the
 table is 50,000 entries, and entries have grown fat.
 
 We are currently vacuuming hourly, and towards the end
 of the hour we are seeing degradation, when compared
 to the top of the hour.
 
 Vacuum is slowly killing our system, as it is starting
 to take up to 10 minutes, and load at the time of
 vacuum is 6+ on a Linux box.  During the vacuum, the
 overall system goes unresponsive, then comes back
 once vacuum completes.

 Play with the vacuum_cost_delay option. In our case it made a BIG difference
 (going from very heavy hitting to almost unnoticed vacuuming).

That helps only if the ONLY problem you're having is from the direct
I/O of the vacuum.

If part of the problem is that the table is so large that it takes 4h
for VACUUM to complete, thereby leaving a transaction open for 4h,
thereby causing other degradations, then vacuum_cost_delay will have a
NEGATIVE impact, as it will mean that the vacuum on that table will
take even /more/ than 4h.  :-(

For the above scenario, it is almost certain that the solution comes
in two pieces:

1.  VACUUM FULL / CLUSTER to bring the size down.

The table has grown fat, and no number of repetitions of plain
vacuum will fix this.

2.  Do plain vacuum on the table VASTLY more frequently, probably
every 5 minutes, possibly more often than that.

By doing this, you prevent things from getting so bad again.
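
As a sketch, the two pieces might look like this (the table and index
names are hypothetical, not from the original post; the CLUSTER syntax
shown is the pre-8.3 form current at the time):

```sql
-- 1. One-off repair: rewrite the table to reclaim accumulated bloat.
VACUUM FULL hot_table;
-- or, often faster on a badly bloated table, rewrite it in index order:
CLUSTER hot_table_pkey ON hot_table;

-- 2. Ongoing maintenance: a cheap plain vacuum every few minutes
--    (e.g. driven from cron) keeps the dead-row count small.
VACUUM hot_table;
```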

By the way, in this sort of situation, _ANY_ transaction that runs
more than about 5 minutes represents a serious enemy to performance,
as it will tend to cause the hot table to get fatter.
-- 
(reverse (concatenate 'string gro.gultn @ enworbbc))
http://www.ntlug.org/~cbbrowne/linux.html
TECO Madness: a moment of regret, a lifetime of convenience.
-- Kent Pitman

---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
   choose an index scan if your joining column's datatypes do not
   match


Re: [PERFORM] When to do a vacuum for highly active table

2005-09-05 Thread Rigmor Ukuhe

 -Original Message-
 From: [EMAIL PROTECTED] [mailto:pgsql-performance-
 [EMAIL PROTECTED] On Behalf Of Markus Benne
 Sent: Wednesday, August 31, 2005 12:14 AM
 To: pgsql-performance@postgresql.org
 Subject: [PERFORM] When to do a vacuum for highly active table
 
 We have a highly active table that has virtually all
 entries updated every 5 minutes.  Typical size of the
 table is 50,000 entries, and entries have grown fat.
 
 We are currently vacuuming hourly, and towards the end
 of the hour we are seeing degradation, when compared
 to the top of the hour.
 
 Vacuum is slowly killing our system, as it is starting
 to take up to 10 minutes, and load at the time of
 vacuum is 6+ on a Linux box.  During the vacuum, the
 overall system goes unresponsive, then comes back
 once vacuum completes.

Play with the vacuum_cost_delay option. In our case it made a BIG difference
(going from very heavy hitting to almost unnoticed vacuuming).
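
For reference, the cost-based vacuum delay settings (available from 8.0)
can go in postgresql.conf or be set per-session; the values below are
illustrative starting points, not tuned recommendations:

```sql
-- Per-session, before a manual vacuum:
SET vacuum_cost_delay = 10;   -- sleep 10 ms each time the cost limit is reached
SET vacuum_cost_limit = 200;  -- accumulated page-cost that triggers the sleep
VACUUM mytable;
```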

Hope it helps.

Rigmor Ukuhe

 
 If we run vacuum less frequently, degradation
 continues to the point that we can't keep up with the
 throughput, plus vacuum takes longer anyway.
 
 Becoming quite a pickle. :-)
 
 We are thinking of splitting the table in two: the
 part that updates often, and the part that updates
 infrequently, as we suspect that record size impacts
 vacuum.
 
 Any ideas?
 
 
 Thanks,
 Mark
 




Re: [PERFORM] When to do a vacuum for highly active table

2005-08-30 Thread Tom Lane
Markus Benne [EMAIL PROTECTED] writes:
 We have a highly active table that has virtually all
 entries updated every 5 minutes.  Typical size of the
 table is 50,000 entries, and entries have grown fat.

 We are currently vacuuming hourly, and towards the end
 of the hour we are seeing degradation, when compared
 to the top of the hour.

On something like this, you really need to be vacuuming more often,
not less so; I'd think about how to do it every five or ten minutes
rather than backing off.  With only 50K rows it should really not take
more than a couple of seconds to do the vacuum.  When you wait till
there are 600K dead rows, it's going to take a while, plus you are
suffering across-the-board performance degradation from all the dead
rows.

If you are using PG 8.0, there are some vacuum cost knobs you can
fiddle with to slow down vacuum so it doesn't impose as much I/O load.
Ideally you could get it to where you could run vacuum as often as
you need to without noticing much impact on foreground processing.

If you're not using 8.0 ... maybe it's time to update.

Another thing you might want to do is look at vacuum verbose output,
which will give you some idea of the time spent in each step.  It might
be there are specific aspects that could be improved.
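
A minimal example; VERBOSE reports, per table and index, the pages
scanned and tuples removed, which shows where the time is going:

```sql
VACUUM VERBOSE mytable;
```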

 We are thinking of splitting the table in two: the
 part that updates often, and the part that updates
 infrequently, as we suspect that record size impacts
 vacuum.

You just said that virtually all rows update constantly --- where's
the infrequent part?

regards, tom lane



Re: [PERFORM] When to do a vacuum for highly active table

2005-08-30 Thread mark
On Tue, Aug 30, 2005 at 05:29:17PM -0400, Tom Lane wrote:
 Markus Benne [EMAIL PROTECTED] writes:
  We have a highly active table that has virtually all
  entries updated every 5 minutes.  Typical size of the
  table is 50,000 entries, and entries have grown fat.
 ...
  We are thinking of splitting the table in two: the
  part that updates often, and the part that updates
  infrequently, as we suspect that record size impacts
  vacuum.
 You just said that virtually all rows update constantly --- where's
 the infrequent part?

I think he means splitting it vertically, instead of horizontally, and
it sounds like an excellent idea, if a large enough portion of each
record is in fact mostly fixed. Otherwise, PostgreSQL is copying data
multiple times, only to have the data expire as part of a dead row.
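
A sketch of such a vertical split (all names hypothetical); the hot half
stays narrow, so each update, and each vacuum pass, touches far less data:

```sql
-- Narrow, frequently-updated columns
CREATE TABLE item_hot (
    id      integer PRIMARY KEY,
    status  text,
    updated timestamp
);

-- Wide, rarely-updated columns
CREATE TABLE item_cold (
    id      integer PRIMARY KEY REFERENCES item_hot(id),
    payload text
);

-- Queries needing both halves now pay a join:
--   SELECT h.status, c.payload
--   FROM item_hot h JOIN item_cold c USING (id);
```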

I've already started to notice such issues with postgresql - but more
because I'm using low-end hardware, and I'm projecting the effect for
when our database becomes much larger with much higher demand on the
database.

This is the sort of scenario where a database without transactional
integrity would significantly outperform one designed around it. If
records are fixed sized, and updated in place, these problems would
occur far less often. Is it heresy to suggest MySQL in here? :-)

I switched from MySQL to PostgreSQL several months ago, and haven't
looked back - but they do work differently, and for certain uses, one
can destroy the other. Using a MyISAM table would be the way I would
go with this sort of problem.

Cheers,
mark

-- 
[EMAIL PROTECTED] / [EMAIL PROTECTED] / [EMAIL PROTECTED] 
Mark Mielke  |  Neighbourhood Coder  |  Ottawa, Ontario, Canada

  One ring to rule them all, one ring to find them, one ring to bring them all
   and in the darkness bind them...

   http://mark.mielke.cc/




Re: [PERFORM] When to do a vacuum for highly active table

2005-08-30 Thread Chris Browne
[EMAIL PROTECTED] (Markus Benne) writes:
 We have a highly active table that has virtually all
 entries updated every 5 minutes.  Typical size of the
 table is 50,000 entries, and entries have grown fat.

 We are currently vacuuming hourly, and towards the end
 of the hour we are seeing degradation, when compared
 to the top of the hour.

You're not vacuuming the table nearly often enough.

You should vacuum this table every five minutes, and possibly more
often than that.

[We have some tables like that, albeit smaller than 50K entries, which
we vacuum once per minute in production...]

 We are thinking of splitting the table in two: the part that updates
 often, and the part that updates infrequently, as we suspect that
 record size impacts vacuum.

There's *some* merit to that.

You might discover that there's a hot spot that needs to be vacuumed
once per minute.

But it may be simpler to just hit the table with a vacuum once every
few minutes even though some tuples are seldom updated.
-- 
output = reverse(gro.gultn @ enworbbc)
http://cbbrowne.com/info/spreadsheets.html
Signs  of  a  Klingon  Programmer #3:  By   filing this  TPR you have
challenged the honor of my family. Prepare to die!



Re: [PERFORM] When to do a vacuum for highly active table

2005-08-30 Thread Tom Lane
[EMAIL PROTECTED] writes:
 I think he means splitting it vertically, instead of horizontally, and
 it sounds like an excellent idea, if a large enough portion of each
 record is in fact mostly fixed. Otherwise, PostgreSQL is copying data
 multiple times, only to have the data expire as part of a dead row.

Only up to a point.  Fields that are wide enough to get toasted
out-of-line (multiple Kb) do not get physically copied if there's
a row update that doesn't affect them.  We don't really have enough
information about his table to guess whether there's any point in
manually partitioning the columns, but my leaning would be probably
not --- the overhead in joining the resulting two tables would be
high enough that you'd need a heck of a big improvement to justify it.

regards, tom lane
