On Mon, Jan 22, 2007 at 02:51:47PM +0000, Heikki Linnakangas wrote:
> I've been looking at the way we do vacuums.
>
> The fundamental performance issue is that a vacuum generates
> nheapblocks+nindexblocks+ndirtyblocks I/Os. Vacuum cost delay helps to
> spread the cost like part payment, but the total is the same. In an I/O
> bound system, the extra I/O directly leads to less throughput.
>
> Therefore, we need to do less I/O. Dead space map helps by allowing us
> to skip blocks that don't need vacuuming, reducing the # of I/Os to
> 2*ndirtyblocks+nindexblocks. That's great, but it doesn't help us if the
> dead tuples are spread uniformly.
>
> If we could piggyback the vacuum I/Os to the I/Os that we're doing
> anyway, vacuum wouldn't ideally have to issue any I/O of its own. I've
> tried to figure out a way to do that.
>
> Vacuum is done in 3 phases:
>
> 1. Scan heap
> 2. Vacuum index
> 3. Vacuum heap
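For a rough sense of scale on those two I/O formulas, here is a quick back-of-the-envelope sketch in Python. The block counts and function names are made up for illustration; only the two formulas come from the text above.

```python
def full_vacuum_ios(nheapblocks, nindexblocks, ndirtyblocks):
    """I/Os for a plain vacuum: read the whole heap, scan the index, write dirty blocks."""
    return nheapblocks + nindexblocks + ndirtyblocks


def dsm_vacuum_ios(nindexblocks, ndirtyblocks):
    """I/Os with a dead space map: read and write only the dirty heap blocks."""
    return 2 * ndirtyblocks + nindexblocks


# Hypothetical table: 100,000 heap blocks and 20,000 index blocks.
HEAP_BLOCKS = 100_000
INDEX_BLOCKS = 20_000

# Case 1: dead tuples clustered in 5% of the heap.
clustered_dirty = 5_000
clustered_full = full_vacuum_ios(HEAP_BLOCKS, INDEX_BLOCKS, clustered_dirty)
clustered_dsm = dsm_vacuum_ios(INDEX_BLOCKS, clustered_dirty)

# Case 2: dead tuples spread uniformly, so every heap block is dirty.
uniform_dirty = HEAP_BLOCKS
uniform_full = full_vacuum_ios(HEAP_BLOCKS, INDEX_BLOCKS, uniform_dirty)
uniform_dsm = dsm_vacuum_ios(INDEX_BLOCKS, uniform_dirty)

print("clustered:", clustered_full, "vs", clustered_dsm)
print("uniform:  ", uniform_full, "vs", uniform_dsm)
```

In the clustered case the dead space map cuts the count from 125,000 to 30,000 I/Os. With every block dirty, both formulas come to 220,000, which is the uniform-spread problem described above.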
>
> Instead of doing a sequential scan, we could perform the 1st phase by
> watching the buffer pool, scanning blocks for dead tuples when they're
> in memory and keeping track of which pages we've seen. When all pages
> have been seen, the tid list is sorted and 1st phase is done.
>
> In theory, the index vacuum could also be done that way, but let's
> assume for now that indexes would be scanned like they are currently.
>
> The 3rd phase can be performed similarly to the 1st phase. Whenever a
> page enters the buffer pool, we check the tid list and remove any
> matching tuples from the page. When the list is empty, vacuum is complete.

Is there any real reason to demarcate the start and end of a vacuum? Why
not just make it a continuous process? One possibility is to keep a
list of TIDs for each phase, though that could prove tricky with
multiple indexes.

> A variation of the scheme would be to keep scanning pages that are in
> cache, until the tid list reaches a predefined size, instead of keeping
> track of which pages have already been seen. That would deal better with
> tables with hot and cold spots, but it couldn't advance the relfrozenid
> because there would be no guarantee that all pages are visited. Also, we
> could start 1st phase of the next vacuum, while we're still in the 3rd
> phase of previous one.

What if we tracked freeze status on a per-page basis? Perhaps track the
minimum XID that's on each page. That would allow us to ensure that we
freeze pages that are approaching XID wraparound.
-- 
Jim Nasby                                      [EMAIL PROTECTED]
EnterpriseDB      http://enterprisedb.com      512.569.9461 (cell)
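P.S. To make the per-page idea a bit more concrete, here is a toy Python model of a buffer-pool-driven 1st phase that also records a minimum XID per page. Everything in it is hypothetical: the class, the method names, and the tuple layout are mine, not anything in the backend, and XIDs are compared as plain integers, which ignores the modular arithmetic real XID comparison needs.

```python
class PiggybackScan:
    """Toy model of phase 1 driven by buffer-pool traffic (all names hypothetical)."""

    def __init__(self, nblocks):
        self.nblocks = nblocks
        self.seen = set()     # block numbers already inspected
        self.dead_tids = []   # (block, offset) pairs collected so far
        self.min_xid = {}     # block -> smallest XID among live tuples on the page

    def on_page_in_memory(self, blkno, tuples):
        """Inspect one heap page; tuples is an iterable of (offset, xid, is_dead)."""
        if blkno in self.seen:
            return  # each page is inspected only once per pass
        self.seen.add(blkno)
        live_xids = []
        for offset, xid, is_dead in tuples:
            if is_dead:
                self.dead_tids.append((blkno, offset))
            else:
                live_xids.append(xid)
        if live_xids:
            self.min_xid[blkno] = min(live_xids)

    def phase1_done(self):
        """Phase 1 ends once every page of the table has been seen."""
        return len(self.seen) == self.nblocks

    def sorted_tids(self):
        """The tid list, sorted as phase 1 requires before the index vacuum."""
        return sorted(self.dead_tids)

    def pages_needing_freeze(self, cutoff_xid):
        """Pages whose oldest live XID is below the cutoff (no wraparound math)."""
        return sorted(b for b, x in self.min_xid.items() if x < cutoff_xid)


# Pages arrive in whatever order normal activity happens to load them.
scan = PiggybackScan(nblocks=3)
scan.on_page_in_memory(2, [(1, 900, False), (2, 950, True)])
scan.on_page_in_memory(0, [(1, 100, False), (3, 120, True)])
scan.on_page_in_memory(1, [(1, 500, False)])

print(scan.phase1_done())
print(scan.sorted_tids())
print(scan.pages_needing_freeze(cutoff_xid=200))
```

The per-page minimum is what would let a freeze pass target only the pages that are actually getting old, instead of depending on a full pass over the table.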