On Tue, Aug 2, 2016 at 5:51 AM, Amit Kapila <amit.kapil...@gmail.com> wrote:
> Why we need to add a record in all indexes if only the key
> corresponding to one of indexes is updated?  Basically, if the tuple
> can fit on same page, why can't we consider it as HOT (or HPT - heap
> partial tuple or something like that), unless it updates all the keys
> for all the indexes.  Now, we can't consider such tuple versions for
> pruning as we do for HOT.  The downside of this could be that we might
> need to retain some of the line pointers for more time (as we won't be
> able to reuse the line pointer till it is used in any one of the
> indexes and those could be reused once we make next non-HOT update).
> However, this should allow us not to update the indexes for which the
> corresponding column in tuple is not updated.  I think it is a basic
> premise that if any index column is updated then the update will be
> considered as non-HOT, so there is a good chance that I might be
> missing something here.
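To make the difference concrete, here is a rough sketch of which indexes would get a new entry under today's rule versus the partial variant described above. This is a toy model, not PostgreSQL code: the `Index` class, the `indexes_needing_entries` function, and the `partial` flag are all names I made up for illustration, and it deliberately ignores pruning and line-pointer reuse, which is where the real cost of the idea would show up.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Index:
    # Hypothetical stand-in for an index definition: a name plus the
    # set of table columns the index covers.
    name: str
    columns: frozenset


def indexes_needing_entries(indexes, changed_columns, fits_on_same_page,
                            partial=False):
    """Return the names of indexes that need a new entry for an update.

    partial=False models the current rule: the update is HOT only if the
    new version fits on the same page and no indexed column changed;
    otherwise every index gets a new entry.
    partial=True models the proposed variant: when the new version fits
    on the same page, only indexes whose columns changed get an entry.
    """
    changed = set(changed_columns)
    all_names = [ix.name for ix in indexes]
    touched = [ix.name for ix in indexes if ix.columns & changed]

    if not fits_on_same_page:
        # New version lands on another page: every index needs an entry.
        return all_names
    if not touched:
        # Classic HOT update: no index traffic at all.
        return []
    if partial:
        # Proposed behavior: skip indexes whose columns are unchanged.
        return touched
    # Current behavior: any indexed column change makes it non-HOT.
    return all_names


if __name__ == "__main__":
    idx = [Index("idx_a", frozenset({"a"})), Index("idx_b", frozenset({"b"}))]
    print(indexes_needing_entries(idx, {"a"}, True))
    print(indexes_needing_entries(idx, {"a"}, True, partial=True))
```

The sketch only captures the index-traffic side of the proposal; it says nothing about when the old tuple versions and their line pointers can be reclaimed.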

Well, I think that the biggest advantage of a HOT update is the fact that it enables HOT pruning.  In other words, we're not primarily trying to minimize index traffic; we're trying to make cleanup of the heap cheaper.  So this could certainly be done, but I'm not sure it would buy us enough to be worth the engineering effort involved.

Personally, I think that incremental surgery on our current heap format to try to fix this is not going to get very far.  If you look at the history of this, 8.3 was a huge release for timely cleanup of dead tuples.  There was also significant progress in 8.4 as a result of 5da9da71c44f27ba48fdad08ef263bf70e43e689.  As far as I can recall, we then made no progress at all in 9.0 - 9.4.  We made a very small improvement in 9.5 with 94028691609f8e148bd4ce72c46163f018832a5b, but that's pretty niche.  In 9.6, we have "snapshot too old", which I'd argue is potentially a large improvement, but it was big and invasive and will no doubt pose code maintenance hazards in the years to come; also, many people won't be able to use it or won't realize that they should use it.

I think it is likely that further incremental improvements here will be quite hard to find, and the amount of effort will be large relative to the amount of benefit.  I think we need a new storage format where the bloat is cleanly separated from the data rather than intermingled with it; every other major RDBMS works that way.  Perhaps this is a case of "the grass is greener on the other side of the fence", but I don't think so.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company