This is the second proposal for Dead Space Map (DSM). Here is the previous discussion: http://archives.postgresql.org/pgsql-hackers/2006-12/msg01188.php
I'll post the next version of the Dead Space Map patch to -patches. I've implemented 2bits/page bitmap and new vacuum commands. Memory management and recovery features are not done yet. I think it's better to get DSM and HOT together. DSM is good at complex updated cases but not at heavily updated cases. HOT has opposite aspects, as far as I can see. I think they can cover each other. 2bits/page bitmap ----------------- Each heap pages have 4 states for dead space map; HIGH, LOW, UNFROZEN and FROZEN. VACUUM uses the states to reduce the number of target pages. - HIGH : High priority to vacuum. Maybe many dead tuples in the page. - LOW : Low priority to vacuum Maybe few dead tuples in the page. - UNFROZEN : No dead tuples, but some unfrozen tuples in the page. - FROZEN : No dead nor unfrozen tuples in the page. If we do UPDATE a tuple, the original page containing the tuple is marked as HIGH and the new page where the updated tuple is placed is marked as LOW. When we commit the transaction, the updated tuples needs only FREEZE. That's why the after-page is marked as LOW. However, If we rollback, the after-page should be vacuumed, so we should mark the page LOW, not UNFROZEN. We don't know the transaction will commit or rollback at the UPDATE. If we combine this with the HOT patch, pages with HOT tuples are probably marked as UNFROZEN because we don't bother vacuuming HOT tuples. They can be removed incrementally and doesn't require explicit vacuums. In future work, we can do index-only-scan for tuples that is in UNFROZEN or FROZEN pages. (currently not implemented) VACUUM commands --------------- VACUUM now only scans the pages that possibly have dead tuples. VACUUM ALL, a new syntax, behaves as the same as before. - VACUUM FULL : Not changed. scans all pages and compress them. - VACUUM ALL : Scans all pages; Do the same behavior as previous VACUUM. - VACUUM : Scans only HIGH pages usually, but also LOW and UNFROZEN pages on vacuums in the cases for preventing XID wraparound. The commitment of oldest XID for VACUUM is not changed. There should not be tuples that XIDs are older than (Current XID - vacuum_freeze_min_age) after VACUUM. If the VACUUM can guarantee the commitment, it scans only HIGH pages. Otherwise, it scans HIGH, LOW and UNFROZEN pages for FREEZE. Performance issues ------------------ * Enable/Disable DSM tracking per tables DSM requires more or less additional works. If we know specific tables where DSM does not work well, ex. heavily updated small tables, we can disable DSM for it. The syntax is: ALTER TABLE name SET (dsm=true/false); * Dead Space State Cache The DSM management module is guarded using one LWLock, DeadSpaceLock. Almost all accesses to DSM requires only shared lock, but the frequency of shared lock was very high (tied with BufMappingLock) in my research. To avoid the lock contention, I added a cache of dead space state in BufferDesc flags. Backends see the flags first, and avoid locking if no need to * Agressive freezing We will freeze tuples in dirty pages using OldestXmin but FreezeLimit. This is for making FROZEN pages but not UNFROZEN pages as far as possible in order to reduce works in XID wraparound vacuums. Memory management ----------------- In current implementation, DSM allocates a bunch of memory at start up and we cannot modify it in running. It's maybe enough because DSM consumes very little memory -- 32MB memory per 1TB database. There are 3 parameters for FSM and DSM. - max_fsm_pages = 204800 - max_fsm_relations = 1000 (= max_dsm_relations) - max_dsm_pages = 4096000 I'm thinking to change them into 2 new paramaters. We will allocates memory for DSM that can hold all of estimated_database_size, and for FSM 50% or something of the size. Is this reasonable? - estimated_max_relations = 1000 - estimated_database_size = 4GB (= about max_fsm_pages * 8KB * 2) Recovery -------- I've already have a recovery extension. However, it can recover DSM but not FSM. Do we also need to restore FSM? If we don't, unreusable pages might be left in heaps. Of cource it could be reused if another tuple in the page are updated, but VACUUM will not find those pages. Comments and suggestions are really appreciated. Regards, --- ITAGAKI Takahiro NTT Open Source Software Center ---------------------------(end of broadcast)--------------------------- TIP 1: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly