Re: [HACKERS] vacuum, performance, and MVCC

Jan Wieck Sun, 25 Jun 2006 10:31:56 -0700

On 6/24/2006 4:10 PM, Hannu Krosing wrote:

One fine day, Sat, 2006-06-24 at 15:44, Jan Wieck wrote:
>> That fixes the symptom, not the problem. The problem is performance
>> steadily degrades over time.
>> No, you got it backwards. The performance degradation is the symptom.
> The problem is that there are too many dead tuples in the table.  There
> is one way to solve that problem -- remove them, which is done by
> running vacuum.

Precisely.

> There are some problems with vacuum itself, that I agree with.  For
> example it would be good if a long-running vacuum wouldn't affect a
> vacuum running in another table because of the long-running transaction
> effect it has.  It would be good if vacuum could be run partially over a
> table.  It would be good if there was a way to speed up vacuum by using
> a dead space map or something.
It would be good if vacuum wouldn't waste time on blocks that don't have
any possible work in them. Vacuum has two main purposes. A) remove dead
rows and B) freeze xids. Once a block has zero deleted rows and all xids
are frozen, there is nothing to do with this block and vacuum should
skip it until a transaction updates that block.

This requires 2 bits per block, which is 32K per 1G segment of a heap.
Clearing the bits is done when the block is marked dirty. This way
vacuum would not waste any time and IO on huge slow changing tables.
That part, sequentially scanning huge tables that didn't change much is
what keeps us from running vacuum every couple of seconds.
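
To make the 2-bits-per-block idea concrete, here is a minimal sketch in Python. It is only an illustration, not PostgreSQL code: the 8K block size is assumed (the default), and the class name VacuumMap, the flag names NO_DEAD_ROWS and ALL_FROZEN, and all method names are hypothetical. A set bit means "nothing to do for this purpose"; dirtying a block clears both bits, as described above.

```python
BLOCK_SIZE = 8192          # assumption: default PostgreSQL block size
SEGMENT_SIZE = 1 << 30     # one 1G heap segment
BITS_PER_BLOCK = 2

# Hypothetical flag names; a set bit means vacuum has no work of that kind.
NO_DEAD_ROWS = 0b01
ALL_FROZEN = 0b10
NOTHING_TO_DO = NO_DEAD_ROWS | ALL_FROZEN


def map_bytes_per_segment():
    """Size of the map for one heap segment, in bytes."""
    blocks = SEGMENT_SIZE // BLOCK_SIZE
    return blocks * BITS_PER_BLOCK // 8


class VacuumMap:
    """Hypothetical 2-bits-per-block map, four blocks packed per byte."""

    def __init__(self, nblocks):
        self.nblocks = nblocks
        # All bits start cleared: every block must be visited once.
        self.bits = bytearray((nblocks * BITS_PER_BLOCK + 7) // 8)

    def _pos(self, block):
        return block // 4, (block % 4) * BITS_PER_BLOCK

    def get(self, block):
        byte, shift = self._pos(block)
        return (self.bits[byte] >> shift) & NOTHING_TO_DO

    def set_flags(self, block, flags):
        """Called by vacuum after it has processed a block."""
        byte, shift = self._pos(block)
        self.bits[byte] |= (flags & NOTHING_TO_DO) << shift

    def mark_dirty(self, block):
        """Called when a block is marked dirty: clear both bits."""
        byte, shift = self._pos(block)
        self.bits[byte] &= ~(NOTHING_TO_DO << shift) & 0xFF

    def can_skip(self, block):
        return self.get(block) == NOTHING_TO_DO

    def blocks_to_vacuum(self):
        return [b for b in range(self.nblocks) if not self.can_skip(b)]
```

With these assumptions, map_bytes_per_segment() gives 131072 blocks * 2 bits / 8 = 32768 bytes, which matches the 32K figure. A vacuum pass would walk blocks_to_vacuum() instead of the whole heap, so a huge table with few changed blocks costs almost nothing to revisit.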
Seems like a plan.
Still, there is another problem which is not solved by the map approach
alone, at least with the current implementation of vacuum.

This is the fact that we need to do full scan over index(es) to clean up
pointers to removed tuples. And huge tables tend to have huge indexes.

Right, now that you say it I remember why this wasn't so easy as it
sounded at the beginning.

Obviously there is no other way to find an index tuple without a
sequential scan other than doing an index scan. So vacuum would have to
estimate based on the bitmaps if it could be beneficial (huge table,
little vacuumable pages) to actually remove/flag single index tuples
before removing the heap tuple. This can be done in advance of removing
the heap tuple because index tuples might not be there to begin with.


However, that is a very costly thing to do and not trivial to implement.
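
The estimate vacuum would have to make could look roughly like the sketch below. This is a toy cost comparison, not anything vacuum does today: the function name choose_index_cleanup, the parameters btree_height and random_io_penalty, and their default values are all assumptions picked for illustration.

```python
def choose_index_cleanup(index_pages, dead_tuples,
                         btree_height=3, random_io_penalty=4.0):
    """Pick a cleanup strategy for one index (hypothetical cost model).

    full scan: read every index page once, sequentially.
    targeted:  one index descent per dead tuple, charged as random IO.
    """
    full_scan_cost = float(index_pages)
    targeted_cost = dead_tuples * btree_height * random_io_penalty
    if targeted_cost < full_scan_cost:
        return "targeted"
    return "full_scan"


# Huge index, only a handful of dead tuples found via the bitmaps:
# 100 * 3 * 4.0 = 1200 page reads versus 1,000,000 for the full scan.
few_dead = choose_index_cleanup(index_pages=1_000_000, dead_tuples=100)

# Small index, many dead tuples: scanning the whole index is cheaper.
many_dead = choose_index_cleanup(index_pages=1_000, dead_tuples=10_000)
```

The dead tuple count would come from looking only at the pages the bitmaps flag as vacuumable. The real cost of a targeted removal also depends on recomputing index keys from the heap tuple, which this sketch ignores and which is part of why the idea is costly.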


Jan

--
#======================================================================#
# It's easier to get forgiveness for being wrong than for being right. #
# Let's break this rule - forgive me.                                  #
#================================================== [EMAIL PROTECTED] #
