Hi Tom, (Please correct me where I'm wrong)
Is it possible to reduce the performance impact of dead tuples esp when the index is used? Right now performance goes down gradually till we vacuum (something like a 1/x curve). My limited understanding of current behaviour is the search for a valid row's tuple goes from older tuples to newer ones via forward links (based on some old docs[1]). How about searching from newer tuples to older tuples instead, using backward links? My assumption is newer tuples are more likely to be wanted than older ones - and so the number of tuples to search through will be less this way. **If index update is ok. If a tuple is inserted, the index record is updated to point to inserted tuple, and the inserted tuple is made to point to a previous tuple. e.g. Index-> old tuple->older tuple->oldest tuple Index-> New tuple->old tuple->older tuple->oldest tuple **if index update not desirable Index points to first tuple (valid or not). If a tuple is inserted, the first tuple is updated to point to inserted tuple, and the inserted tuple is made to point to a previous tuple. e.g. Index-> first tuple->old tuple->older tuple->oldest tuple Index-> first tuple-> New tuple->old tuple->older tuple->oldest tuple If this is done performance might not deterioriate as much when using index scans right? I'm not sure if a backward links would help for sequential scans, which are usually best done forward. Regards, Link. [1] http://developer.postgresql.org/pdf/transactions.pdf Tuple headers contain: • xmin: transaction ID of inserting transaction • xmax: transaction ID of replacing/ deleting transaction (initially NULL) • forward link: link to newer version of same logical row, if any Basic idea: tuple is visible if xmin is valid and xmax is not. "Valid" means "either committed or the current transaction". If we plan to update rather than delete, we first add new version of row to table, then set xmax and forward link in old tuple. Forward link will be needed by concurrent updaters (but not by readers). At 10:53 AM 5/1/02 -0400, Tom Lane wrote: >estimates. [ thinks... ] Actually I think we might just be >double-counting if we did. The dead tuples surely should not count as >part of the number of returned rows. We already do account for the >I/O effort to read them (because I/O is estimated based on the total ---------------------------(end of broadcast)--------------------------- TIP 4: Don't 'kill -9' the postmaster