On Thu, 2006-11-09 at 18:28 -0500, Tom Lane wrote:
> "Simon Riggs" <[EMAIL PROTECTED]> writes:
> > As more UPDATEs take place these tuple chains would grow, making
> > locating the latest tuple take progressively longer.
>
> This is the part that bothers me --- particularly the random-access
> nature of the search.  I wonder whether you couldn't do something
> involving an initial table fill-factor of less than 50%, and having
> the first updated version living on the same heap page as its parent.
> Only when the active chain length was more than one (which you
> hypothesize is rare) would it actually be necessary to do a random
> access into the overflow table.

That's appropriate sometimes, not others, but I'll investigate this
further so that it's possible to take advantage of non-zero fillfactors
when they exist.

There are a number of distinct use-cases here:

If you have a very small, heavily updated table it makes a lot of sense
to use lower fillfactors as well.

If you have a larger table, using fillfactor 50% immediately doubles the
size of the table. If the updates are uneven, as they mostly are because
of the Benfold distribution/Pareto principle, then it has been found
that leaving space on each block doesn't help the heavily updated
portions of a table, whereas it hinders the lightly updated portions of
a table.

TPC-C and TPC-B both have uniformly distributed UPDATEs, so it's easy to
use the fillfactor to great advantage there.

> More generally, do we need an overflow table at all, rather than having
> these overflow tuples living in the same file as the root tuples?  As
> long as there's a bit available to mark a tuple as being this special
> not-separately-indexed type, you don't need a special location to know
> what it is.  This might break down in the presence of seqscans though.

HOT currently attempts to place a subsequent UPDATE on the same page of
the overflow relation, but this doesn't happen (yet) for placing
multiple versions on the same page. IMHO it could, but I will think
about it.

> > This allows the length of a typical tuple chain to be extremely short in
> > practice. For a single connection issuing a stream of UPDATEs the chain
> > length will no more than 1 at any time.
>
> Only if there are no other transactions being held open, which makes
> this claim a lot weaker.

True, but Nikhil has run tests that clearly show HOT outperforming the
current situation in the case of long running transactions. The need to
optimise HeapTupleSatisfiesVacuum() and avoid long chains does still
remain a difficulty for both HOT and the current situation.
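
To put rough numbers on the fillfactor trade-off discussed above, here is
a small back-of-envelope sketch in Python. It is not PostgreSQL code. The
function names, the figure of 100 tuples per full page, and the
simplification that reserved slots are never reclaimed between updates
are all assumptions made purely for illustration.

```python
import math
from typing import Sequence


def tuples_loaded_per_page(tuples_per_full_page: int, fillfactor: int) -> int:
    """Tuples placed on each page at initial load (hypothetical model)."""
    if not 1 <= fillfactor <= 100:
        raise ValueError("fillfactor must be between 1 and 100")
    return max(1, tuples_per_full_page * fillfactor // 100)


def pages_needed(n_tuples: int, tuples_per_full_page: int, fillfactor: int) -> int:
    """Pages needed to load n_tuples at the given fillfactor."""
    per_page = tuples_loaded_per_page(tuples_per_full_page, fillfactor)
    return math.ceil(n_tuples / per_page)


def updates_absorbed(updates_per_page: Sequence[int], free_slots: int) -> int:
    """Updates whose new version fits on the same page as its parent.

    Assumption: each page has `free_slots` reserved slots and nothing
    frees them again, so a page absorbs at most `free_slots` updates.
    """
    return sum(min(u, free_slots) for u in updates_per_page)


if __name__ == "__main__":
    # Fillfactor 50 doubles the page count compared with fillfactor 100.
    print(pages_needed(1_000_000, 100, 100), pages_needed(1_000_000, 100, 50))

    # Same 50 updates over 10 pages, 5 reserved slots per page.
    uniform = [5] * 10          # spread evenly
    skewed = [50] + [0] * 9     # all on one hot page
    print(updates_absorbed(uniform, 5), updates_absorbed(skewed, 5))
```

Under these assumptions the uniform workload keeps every new version on
its parent's page, while the skewed one overflows almost immediately on
the hot page and leaves the reserved space on the cold pages unused.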

> > HOT can only work in cases where a tuple does not modify one of the
> > columns defined in an index on the table, and when we do not alter the
> > row length of the tuple.
>
> Seems like "altering the row length" isn't the issue, it's just "is
> there room on the page for the new version".  Again, a generous
> fillfactor would give you more flexibility.

The copy-back operation can only work if the tuple fits in the same
space as the root tuple. If it doesn't, you end up with a tuple
permanently in the overflow relation. That might not worry us, I guess.

Also, my understanding was that an overwrite operation could not vary
the length of a tuple (at least according to code comments).

> > [We'll be able to do that more efficiently when
> > we have plan invalidation]
>
> Uh, what's that got to do with it?

Currently the HOT code dynamically tests to see if the index columns
have been touched. If we had plan invalidation, that could be assessed
more easily at planning time, in cases where there is no BEFORE trigger.

-- 
  Simon Riggs
  EnterpriseDB   http://www.enterprisedb.com
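
For readers following the thread, the two conditions discussed above (no
indexed column touched, and the new version fitting in the root tuple's
space for copy-back) can be sketched as a small Python model. This is not
the HOT patch's code. `classify_update`, the dict-based rows, the result
strings, and reading "fits in the same space" as "no longer than the root
tuple" are all assumptions for illustration; the stricter reading from
the code comments would require equal lengths.

```python
from typing import Any, Mapping, Set


def touches_indexed_column(
    old_row: Mapping[str, Any],
    new_row: Mapping[str, Any],
    indexed_columns: Set[str],
) -> bool:
    """True if the UPDATE changes the value of any indexed column."""
    return any(old_row[col] != new_row[col] for col in indexed_columns)


def classify_update(
    old_row: Mapping[str, Any],
    new_row: Mapping[str, Any],
    indexed_columns: Set[str],
    root_len: int,
    new_len: int,
) -> str:
    """Classify an UPDATE under the (hypothetical) rules sketched here."""
    if touches_indexed_column(old_row, new_row, indexed_columns):
        # An index column changed, so new index entries are required.
        return "regular"
    if new_len <= root_len:
        # Assumption: "fits" means not longer than the root tuple.
        return "hot, copy-back possible"
    # Still no index change, but the version cannot be copied back.
    return "hot, stays in overflow"


if __name__ == "__main__":
    old = {"id": 1, "balance": 10}
    print(classify_update(old, {"id": 1, "balance": 20}, {"id"}, 40, 40))
    print(classify_update(old, {"id": 2, "balance": 10}, {"id"}, 40, 40))
    print(classify_update(old, {"id": 1, "balance": 20}, {"id"}, 40, 48))
```

The column comparison here is done per UPDATE at run time, which mirrors
the dynamic test mentioned above; a planning-time version would decide
once from the statement's target columns instead.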