This is a WIP patch based on the recent posting by Simon and the
discussions thereafter. We are trying to do one piece at a time, and the
intention is to post the work ASAP so that we can get early and
continuous feedback from the community. We can then incorporate those
suggestions in the next WIP patch.

To start with, this patch implements HOT-update for the simple case
where there is enough free space in the same block to accommodate the
new version of the tuple. A necessary condition for doing a HOT-update
is that none of the index columns is changed. The old version is marked
as HEAP_UPDATE_ROOT and the new version is marked as HEAP_ONLY_TUPLE.
If a tuple is HOT-updated, no new index entry is added.

When fetching a tuple using an index, if the root tuple is not visible
to the given snapshot, the ctid chain is followed until a visible tuple
is found or the end of the HOT-update chain is reached. The
prior_xmax/next_xmin chain is validated while following the ctid chain.

This patch is generated against the current CVS head. It passes all the
regression tests, but I haven't measured any performance impact since
that's not the goal of posting this early version.

There are several things that are not yet implemented, and there are a
few unresolved issues for which I am looking for community help and
feedback.

Open Issues:
------------------

- CREATE INDEX needs more work in the HOT context. The existing HOT
  tuples may require chilling for CREATE INDEX to work correctly. There
  are concerns about the crash-safety of the chilling operation. A few
  suggestions were posted in this regard. We need to conclude that
  discussion and post a working design/patch.

- We need to find a way to handle DEAD root tuples, either by converting
  them into stubs or by overwriting them with a new version. We can also
  perform pointer swinging from the index. Again, there are concerns
  about crash-safety and about concurrent index scans working properly.
  We don't have a community consensus on any of the suggestions in this
  regard, but hopefully we will converge on some design soon.

- Retail VACUUM. We need to implement block-level vacuum for UPDATEs to
  find enough free space in the block to do a HOT-update.
  Though we are still discussing how to handle the dead root tuples, we
  should be able to remove any intermediate dead tuples in the
  HOT-update chain safely. If we do so without fixing the root tuple,
  the prior_xmax/next_xmin chain would be broken. A similar problem
  exists with freezing HOT tuples.

What's Next:
-----------------

In the current implementation, a HOT-updated tuple cannot be vacuumed
because it might be in the middle of the access path to the visible
heap-only tuple. This can cause the table to grow rapidly even if
autovacuum is turned on. The HOT-update chain also keeps growing as long
as there is enough free space in the block. I am thinking of
implementing some sort of HOT-update chain squeezing logic so that
intermediate dead tuples can be retired and vacuumed away. This would
also help us keep the HOT-update chain short enough that following it
does not become unduly costly.

I am thinking of squeezing the HOT-update chain while following it in
the index fetch. If the root tuple is dead, we follow the chain until
the first LIVE or RECENTLY_DEAD tuple is found. The ctid pointer in the
root tuple is made to point to that first LIVE or RECENTLY_DEAD tuple.
All the intermediate DEAD tuples are marked ~HEAP_UPDATE_ROOT (that is,
the HEAP_UPDATE_ROOT flag is cleared) so that they can be vacuumed in
the next cycle. We hold an exclusive lock on the page while doing so.
That should avoid any race conditions. This infrastructure should also
help us retail vacuum the block later.

Please let me know your comments.

Thanks,
Pavan

--
EnterpriseDB http://www.enterprisedb.com
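
For readers following along, the index-fetch path and the proposed chain
squeezing described above can be modeled in a few lines of C. This is
only a toy sketch under stated assumptions, not code from the patch:
ToyTuple, the TOY_* flags, toy_hot_fetch and toy_hot_squeeze are
hypothetical names invented for illustration, array slots stand in for
ctid pointers within one block, and the "visible" and "status" fields
stand in for the real snapshot and vacuum visibility checks.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Toy model only: names and flag values are illustrative assumptions. */
#define TOY_HEAP_UPDATE_ROOT 0x01u  /* tuple was HOT-updated; follow next */
#define TOY_HEAP_ONLY_TUPLE  0x02u  /* tuple has no index entry of its own */
#define TOY_NO_NEXT          (-1)

typedef enum { TOY_DEAD, TOY_RECENTLY_DEAD, TOY_LIVE } ToyStatus;

typedef struct ToyTuple {
    uint32_t  xmin;     /* inserting transaction */
    uint32_t  xmax;     /* updating transaction, 0 if none */
    int       next;     /* slot of the next version in the same block */
    unsigned  flags;
    ToyStatus status;   /* stand-in for a vacuum-style visibility check */
    bool      visible;  /* stand-in for a snapshot visibility check */
} ToyTuple;

/* Follow the chain from the root; return the visible slot or TOY_NO_NEXT. */
static int toy_hot_fetch(const ToyTuple *page, size_t ntuples, int root)
{
    uint32_t prior_xmax = 0;
    int      cur = root;
    size_t   hops;

    assert(page != NULL);
    for (hops = 0; cur >= 0 && (size_t) cur < ntuples && hops < ntuples; hops++)
    {
        const ToyTuple *tup = &page[cur];

        /* Validate the link: this xmin must match the previous xmax. */
        if (cur != root && tup->xmin != prior_xmax)
            return TOY_NO_NEXT;
        if (tup->visible)
            return cur;
        if (!(tup->flags & TOY_HEAP_UPDATE_ROOT))
            return TOY_NO_NEXT;         /* end of the HOT-update chain */
        prior_xmax = tup->xmax;
        cur = tup->next;
    }
    return TOY_NO_NEXT;
}

/*
 * If the root is dead, point it at the first non-DEAD version and clear
 * TOY_HEAP_UPDATE_ROOT on the DEAD tuples that were skipped. Returns the
 * number of tuples released. The caller is assumed to hold an exclusive
 * lock on the page. The root's xmax is deliberately left untouched.
 */
static int toy_hot_squeeze(ToyTuple *page, size_t ntuples, int root)
{
    int    released = 0;
    int    cur;
    size_t hops = 0;

    assert(page != NULL && root >= 0 && (size_t) root < ntuples);
    if (page[root].status != TOY_DEAD ||
        !(page[root].flags & TOY_HEAP_UPDATE_ROOT))
        return 0;

    cur = page[root].next;
    while (cur >= 0 && (size_t) cur < ntuples && hops++ < ntuples &&
           page[cur].status == TOY_DEAD &&
           (page[cur].flags & TOY_HEAP_UPDATE_ROOT))
    {
        int next = page[cur].next;

        page[cur].flags &= ~TOY_HEAP_UPDATE_ROOT;   /* now vacuumable */
        released++;
        cur = next;
    }
    page[root].next = cur;
    return released;
}
```

Note that in this model, once toy_hot_squeeze has shortened the chain,
toy_hot_fetch rejects the new root link because the root's xmax no
longer matches the xmin of the tuple it points to. That is the same
prior_xmax/next_xmin breakage listed under the open issues; the sketch
does not attempt to resolve it.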