On Wed, 2007-08-01 at 14:36 +0530, Pavan Deolasee wrote: > Please see the attached version 11 of HOT patch > > The concept of marking the pruned tuples with LP_DELETE and > reusing such tuples for subsequent UPDATEs has been removed > and replaced with a simpler mechanism of repairing the page > fragmentation as and when possible.
OK, sounds good. Some comments: BufferIsLockedForCleanup() should be named BufferIsAvilableForCleanup(). There is no cleanup mode, what we mean is that there is only one pin; the comments say "If we are lucky enough to have already acquired the cleanup lock", which made me look for the point where we had executed LockBufferForCleanup(), which we never do. heap_page_prune() needs to comment what prune_hard means. I think you mean true=prune page, false=prune just this hot chain. It might be better to have a prune type so that the code reads as PRUNE_HOT_CHAIN or PRUNE_WHOLE_PAGE. Does it means the same thing as in heap_prune_tuplechain()? I'm concerned that HeapTupleSatisfiesAbort() may be very expensive and yet almost never return true. Do we need this code at this point? It would be good to put some explanatory notes somewhere. Do all calls to PageGetFreeSpace() get replaced by PageHeapGetFreeSpace()? To answer some of your XXXs: XXX Should we be a bit conservative here ? At the place you ask this scan->rs_pageatatime is true, so we're going to scan the whole page. So any new updates would be setting hint bits anyway, so we are going to end up with a dirty page whether you do this or not, so I think yes, prune the whole page. XXX Should we set PageSetPrunable on this page ? If the transaction aborts, there will be a prunable tuple in this page. Think you need to explain a little more...but we probably shouldn't tune for aborts. XXX Should we set hint on the newbuf as well ? If the transaction aborts, there would be a prunable tuple in the newbuf. Think you need to explain a little more...but we probably shouldn't tune for aborts. XXX Should we update the FSM information of this page ? (after pruning) No. Updating FSM would only take place if updates are taking place on this block. We would make this page a target for other INSERTs and UPDATEs, so would effectively bring more contention onto those pages. Bad Thing. XXX We may want to tweak the percentage value below: (1.2 * RelationGetAvgFSM(relation) Not sure what difference the 1.2 makes... So that means I agree with all of your XXXs apart from the last one. > The logic of chain pruning has been simplified a lot. In addition, > there > are fewer new WAL log records. We rather reuse the existing WAL > record types to log the operations. I can't find any mention of XLogInserts for the new record types. e.g. XLOG_HEAP2_PRUNEHOT is only mentioned on one line in this patch. I this a mistake, or could you explain? > Few 4 hour DBT2 runs have confirmed that this simplification hasn't > taken away any performance gains, rather we are seeing better > performance > with the changes. The gain can be attributed to the fact that now > more > HOT updates are possible even if the tuple length changes between > updates (since we do the complete page defragmentation) > > The changes are mostly isolated in the above area apart from some > stray bug fixes. It would be good to see the results... -- Simon Riggs EnterpriseDB http://www.enterprisedb.com ---------------------------(end of broadcast)--------------------------- TIP 7: You can help support the PostgreSQL project by donating at http://www.postgresql.org/about/donate