On Mon, 2007-01-22 at 12:18 -0500, Bruce Momjian wrote:
> Heikki Linnakangas wrote:
> > In any case, for the statement "Index cleanup is the most expensive part
> > of vacuum" to be true, you're indexes would have to take up 2x as much
> > space as the heap, since the heap is scanned twice. I'm sure there's
> > databases like that out there, but I don't think it's the common case.
> I agree it index cleanup isn't > 50% of vacuum. I was trying to figure
> out how small, and it seems about 15% of the total table, which means if
> we have bitmap vacuum, we can conceivably reduce vacuum load by perhaps
> 80%, assuming 5% of the table is scanned.
Clearly keeping track of what needs vacuuming will lead to a more
efficient VACUUM. Your math applies to *any* design that uses some form
of book-keeping to focus in on the hot spots.
On a separate thread, Heikki has raised a different idea for VACUUM.
Heikki's idea asks an important question: where and how should DSM
information be maintained? Up to now everybody has assumed that it would
be maintained when DML took place and that the DSM would be a
transactional data structure (i.e. on-disk). Heikki's idea requires
similar bookkeeping requirements to the original DSM concept, but the
interesting aspect is that the DSM information is collected off-line,
rather than being an overhead on every statement's response time.
That idea seems extremely valuable to me.
One of the main challenges is how we cope with large tables that have a
very fine spray of updates against them. A DSM bitmap won't help with
that situation, regrettably.
---------------------------(end of broadcast)---------------------------
TIP 9: In versions below 8.0, the planner will ignore your desire to
choose an index scan if your joining column's datatypes do not