Re: In-place index updates and HOT (Was: [HACKERS] Patch: Write Amplification Reduction Method (WARM))
Claudio Freire wrote:
>> README.HOT says that that cost is not worth the benefit of preventing a new index write, but I think that it ought to take into account that not all index writes are equal. There is an appreciable difference between inserting a new tuple, and updating one in-place. We can remove the cost (hurting new snapshots by making them go through old heap pages) while preserving most of the benefits (no logically unnecessary index bloat).
>
> It's a neat idea.

Thanks. I think it's important to both prevent index bloat, and to make sure that only the latest version is pointed to within indexes. There are only so many ways that that can be done. I've tried to come up with a way of doing those two things that breaks as little of heapam.c as possible. As a bonus, some kind of super-pruning of many linked HOT chains may be enabled, which is something that an asynchronous process can do when triggered by a regular prune within a user backend.

This is a kind of micro-vacuum that is actually much closer to VACUUM than the kill_prior_tuple stuff, or traditional pruning, in that it potentially kills index entries (just those that were not subsequently updated in place, because the new values for the index differed), and then kills heap tuples, all together, without even keeping around a stub itemId in the heap. And, chaining together HOT chains also lets us chain together pruning. Retail index tuple deletion from pruning needs to be crash safe, unlike LP_DEAD setting.

> And, well, now that you mention, you don't need to touch indexes at all. You can create the new chain, and "update" the index to point to it, without ever touching the index itself, since you can repoint the old HOT chain's start line pointer to point to the new HOT chain, create a new pointer for the old one and point to it in the new HOT chain's t_tid. Existing index tuples thus now point to the right HOT chain without having to go into the index and make any changes.
> You do need the new HOT chain to live in the same page for this, however.

That seems complicated. The idea that I'm trying to preserve here is the idea that the beginning of a HOT-chain (a definition that includes a "potential HOT chain" -- a single heap tuple that could later receive a HOT UPDATE) unambiguously signals a need for physical changes to indexes in all cases. The idea that I'm trying to move away from is that those physical changes need to be new index insertions (new insertions should only happen when it is logically necessary, because indexed values changed).

Note that this can preserve the kill_prior_tuple stuff, I think, because if everything is dead within a single HOT chain (a HOT chain by our current definition -- not a chain of HOT chains) then nobody can need the index tuple. This does require adding complexity around aborted transactions, whose new (potential) HOT chain t_tid "backpointer" is still needed; we must revise the definition of a HOT chain being all_dead to accommodate that. But for the most part, we preserve HOT chains as a thing that garbage collection can independently reason about, process with single page atomic operations while still being crash safe, etc.

As far as microvacuum style garbage collection goes, at a high level, HOT chains seem like a good choke point to do clean-up of both heap tuples (pruning) and index tuples. The complexity of doing that seems manageable. And by chaining together HOT chains, you can really aggressively microvacuum many HOT chains on many pages within an asynchronous process as soon as the long running transaction goes away. We lean on temporal locality for garbage collection.

There are numerous complications that I haven't really acknowledged but am at least aware of. For one, when I say "update in place", I don't necessarily mean it literally. It's probably possible to literally update in place with unique indexes.
For secondary indexes, which should still have heap TID as part of their keyspace (once you go implement that, Claudio), we probably need an index insertion immediately followed by an index deletion, often within the same leaf page.

I hope that this design, such as it is, will be reviewed as a thought experiment. What would be good or bad about a design like this in the real world, particularly as compared to alternatives that we know about? Is *some* "third way" design desirable and achievable, if not this one? By "third way" design, I mean a design that is much less invasive than adopting UNDO for MVCC, that still addresses the issues that we currently have with certain types of UPDATE-heavy workloads, especially when there are long running transactions, etc. I doubt that WARM meets this standard, unfortunately, because it doesn't do anything for cases that suffer only due to a long running xact.

I don't accept that there is a rigid dichotomy between Postgres style MVCC, and using UNDO for MVCC, and I most certainly don't accept that garbage
Re: In-place index updates and HOT (Was: [HACKERS] Patch: Write Amplification Reduction Method (WARM))
On Fri, Jul 28, 2017 at 8:32 PM, Peter Geoghegan wrote:
> README.HOT says that that cost is not worth the benefit of preventing a new index write, but I think that it ought to take into account that not all index writes are equal. There is an appreciable difference between inserting a new tuple, and updating one in-place. We can remove the cost (hurting new snapshots by making them go through old heap pages) while preserving most of the benefits (no logically unnecessary index bloat).

It's a neat idea.

And, well, now that you mention, you don't need to touch indexes at all. You can create the new chain, and "update" the index to point to it, without ever touching the index itself, since you can repoint the old HOT chain's start line pointer to point to the new HOT chain, create a new pointer for the old one and point to it in the new HOT chain's t_tid. Existing index tuples thus now point to the right HOT chain without having to go into the index and make any changes.

You do need the new HOT chain to live in the same page for this, however.
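The line pointer swap described above can be sketched as a toy model. This is only an illustration under loud assumptions, not PostgreSQL code: `HeapPage`, `add_tuple`, `supersede`, and the dict-based tuples are hypothetical names invented here, a whole HOT chain is collapsed into a single tuple, and visibility, locking, and WAL are ignored.

```python
class HeapPage:
    """Toy heap page: a line pointer array mapping slots to tuples."""

    def __init__(self):
        self.line_pointers = {}  # slot number -> tuple (a plain dict)
        self.next_slot = 1

    def _new_slot(self):
        slot = self.next_slot
        self.next_slot += 1
        return slot

    def add_tuple(self, value):
        """Store a tuple that starts a new (potential) HOT chain."""
        slot = self._new_slot()
        self.line_pointers[slot] = {"value": value, "t_tid": None}
        return slot

    def supersede(self, root_slot, new_value):
        """Swap a new chain in under root_slot without touching any index."""
        old_tuple = self.line_pointers[root_slot]
        # Create a new line pointer for the old chain ...
        moved_slot = self._new_slot()
        self.line_pointers[moved_slot] = old_tuple
        # ... and repoint the original slot at the new chain, whose t_tid
        # leads back to the old chain's new slot.
        self.line_pointers[root_slot] = {"value": new_value, "t_tid": moved_slot}
        return moved_slot


page = HeapPage()
index = {"key": page.add_tuple("v1")}  # the index entry stores the root slot
old_slot = page.supersede(index["key"], "v2")

latest = page.line_pointers[index["key"]]
print(latest["value"])                               # newest version
print(page.line_pointers[latest["t_tid"]]["value"])  # superseded version
```

The index dictionary is never written after the initial insert, which is the point of the trick. It also shows why the new chain has to live in the same page: the indirection only exists within one page's line pointer array.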
In-place index updates and HOT (Was: [HACKERS] Patch: Write Amplification Reduction Method (WARM))
Pavan Deolasee wrote:
> One good thing is that the patch is ready and fully functional. So that allows those who are keen to run real performance tests and see the actual impact of the patch.

Very true.

> I see your point. But I would like to think this way: does the technology significantly help many common use cases, that are currently not addressed by HOT? It probably won't help all workloads, that's given. Also, we don't have any credible alternative while this patch has progressed quite a lot. Maybe Robert will soon present the pluggable storage/UNDO patch and that will cover everything and more that is currently covered by HOT/WARM. That will probably make many other things redundant.

Well, I don't assume that it will; again, I just don't know. I agree with your general assessment of things, which is that WARM, EDB's Z-Heap/UNDO project, and things like IOTs have significant overlap in terms of the high-level problems that they fix. While it's hard to say just how much overlap exists, it's clearly more than a little. And, you are right that we don't have a credible alternative in this general category right now. The WARM patch is available today.

As you may have noticed, in recent weeks I've been very vocal about the role of index bloat in cases where bloat has a big impact on production workloads. I think that it has an under-appreciated role in workloads that deteriorate over time, as bloat accumulates. Perhaps HOT made such a big difference to workloads 10 years ago not just because it prevented creating new index entries. It also reduced fragmentation of the keyspace in indexes, by never inserting duplicates in the first place.

I have some rough ideas related to this, and to the general questions you're addressing. I'd like to run these by you.
In-place index updates + HOT

Maybe we could improve things markedly in this general area by "chaining together HOT chains", and updating index heap pointers in place, to point to the start of the latest HOT chain in that chain of chains (provided the index tuple was "logically unchanged" -- otherwise, you'd need to have both sets of indexed values at once, of course). Index tuples therefore always point to the latest HOT chain, favoring recent MVCC snapshots over older ones.

Pruning
-------

HOT pruning is great because you can remove heap bloat without worrying about there being index entries with heap item pointers pointing to what is removed. But isn't that limitation as much about what is in the index as it is about what is in the heap?

Under this scheme, you don't even have to keep around the old ItemId stub when pruning, if it's a sufficiently old HOT chain that no index points to the corresponding TID. That may not seem like a lot of bloat to have to keep around, but it accumulates within a page until VACUUM runs, ultimately limiting the effectiveness of pruning for certain workloads.

Old snapshots/row versions
--------------------------

Superseding HOT chains have their last heap tuple's t_tid point to the start of the preceding/superseded HOT chain (not their own TID, as today, which is redundant), which may or may not be on the same heap page. That's how old snapshots go backwards to get old versions, without needing their own "logically redundant" index entries. So with UPDATE heavy workloads that are essentially HOT-safe today, performance doesn't tank due to a long running transaction that obstructs pruning within a heap page, and thus necessitates the insertion of new index tuples. That's the main justification for this entire design.
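To make the "chain of HOT chains" idea concrete, here is a minimal sketch. Everything in it is an assumption made for illustration: `HotChain`, `Table`, and `fetch` are hypothetical names, each HOT chain is reduced to a single row version, the indexed key never changes (the "logically unchanged" case), and visibility is simplified to "a snapshot sees versions whose creating xid is not newer than the snapshot", which is much cruder than real MVCC rules.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass
class HotChain:
    """One HOT chain, collapsed to a single row version for illustration."""
    value: str
    xmin: int                          # xid that created this chain
    prev: Optional["HotChain"] = None  # back-pointer to the superseded chain


class Table:
    def __init__(self) -> None:
        # One index entry per key; it always points at the latest chain.
        self.index: Dict[str, HotChain] = {}

    def insert(self, key: str, value: str, xid: int) -> None:
        self.index[key] = HotChain(value, xid)

    def update(self, key: str, value: str, xid: int) -> None:
        # Repoint the existing index entry in place; no new entry is added.
        self.index[key] = HotChain(value, xid, prev=self.index[key])

    def fetch(self, key: str, snapshot_xid: int) -> Optional[str]:
        # Recent snapshots stop at the first chain; older snapshots walk
        # backwards through the superseded chains.
        chain = self.index.get(key)
        while chain is not None and chain.xmin > snapshot_xid:
            chain = chain.prev
        return None if chain is None else chain.value


table = Table()
table.insert("k", "v1", xid=10)
table.update("k", "v2", xid=20)
table.update("k", "v3", xid=30)

print(table.fetch("k", snapshot_xid=35))  # a recent snapshot
print(table.fetch("k", snapshot_xid=15))  # an old snapshot walks back twice
```

The index holds exactly one entry for the key no matter how many updates happen, and the cost of reaching older versions falls on older snapshots rather than on recent ones.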
It's also possible that pruning can be taught that since only one index update was logically necessary when the to-be-pruned HOT chain was created, it's worth doing a "retail index tuple deletion" against the index tuple that was logically necessary, then completely obliterating the HOT chain, stub item pointer and all.

Bloat and locality
------------------

README.HOT argues against HOT chains that span pages, which this is a bit like, on the grounds that it's bad news that every recent snapshot has to go through the old heap page. That makes sense, but only because the temporal locality there is horrible, which would not be the case here. README.HOT says that that cost is not worth the benefit of preventing a new index write, but I think that it ought to take into account that not all index writes are equal. There is an appreciable difference between inserting a new tuple, and updating one in-place. We can remove the cost (hurting new snapshots by making them go through old heap pages) while preserving most of the benefits (no logically unnecessary index bloat).

The benefit of HOT is clearly more bloat prevention than not having to visit indexes at all. InnoDB secondary index updates update the index twice: the first time, during the update itself, and the second time, by the purge thread, once the xact commits. Clearly they care about