On Fri, Jun 23, 2006 at 03:10:39PM -0400, Mark Woodward wrote:
> This is NOT an "in-place" update. The whole MVCC strategy of keeping old
> versions around doesn't change. The only thing that does change is one
> level of indirection. Rather than keep references to all versions of all
> rows in indexes, keep only a reference to the first or "key" row of each
> row, and have the first version of a row form the head of a linked list to
> subsequent versions of each row. The list will be in descending order.
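The quoted proposal can be sketched roughly as follows. This is a hedged illustration only: the struct and function names, the fixed payload, and the simplistic visibility test are all invented for the sketch and are not PostgreSQL's actual tuple format or tqual logic. The key (first) version sits at the head, and each later version hangs off the chain, so an unmodified row needs no extra hop and a modified row needs one or more.

```c
#include <assert.h>
#include <stddef.h>

typedef unsigned int TransactionId;

/* Hypothetical version-chain node: the index points only at the key
 * (first) version; newer versions are reached via the chain. */
typedef struct RowVersion {
    TransactionId xmin;       /* xact that created this version */
    TransactionId xmax;       /* xact that superseded it, 0 if current */
    struct RowVersion *next;  /* next (newer) version, or NULL */
    int payload;              /* stand-in for the tuple contents */
} RowVersion;

/* Walk from the key version to the version visible to snapshot `snap`.
 * Visibility here is deliberately naive: created before the snapshot
 * and not yet superseded as of the snapshot. */
static RowVersion *
find_visible(RowVersion *key, TransactionId snap)
{
    for (RowVersion *v = key; v != NULL; v = v->next)
        if (v->xmin <= snap && (v->xmax == 0 || v->xmax > snap))
            return v;
    return NULL;              /* no version visible to this snapshot */
}
```

Note how the cost model in the reply falls out of the walk: an old snapshot stops at the head, while a recent snapshot on a heavily updated row must follow several links.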
I thought of another issue with this. If you move away from storing each
row in the indexes, you can pretty much forget bitmap index scans. They
rely on every row being represented, not just a subset. Until you go to
the heap you don't know whether a tuple will match, and that heap visit
is precisely what the bitmap scan is trying to avoid. You could follow
the links, but that destroys the nice sequential access properties.

> In the vast majority of cases, the overhead of this action will be
> trivial. In an unmodified row, you're there. In a modified row, you have
> one extra lookup. In extreme cases, you may have to go back a few
> versions, but I don't see that as a common behavior.

I wonder if looking at old versions is really all that uncommon. A large
reporting query that runs for hours will probably be looking at a lot of
old versions, and those are the queries that will be hit the hardest.

If you're trying to avoid index bloat, I wonder if it wouldn't be better
to tackle this from the other end: in indexes, allow a row to carry
multiple CTIDs. That way a new version only requires adding six more
bytes, rather than a whole new index tuple.

Have a nice day,
-- 
Martijn van Oosterhout <email@example.com>   http://svana.org/kleptog/
> From each according to his ability. To each according to his ability to
> litigate.
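To make the multiple-CTID suggestion concrete, here is a minimal sketch. Everything in it is illustrative, not PostgreSQL's on-disk format: the struct names, the fixed cap, and the layout are assumptions. The only grounded fact is that a CTID (an ItemPointer) is six bytes of payload, a block number plus a line-pointer offset, which is why appending one is far cheaper than inserting a whole new index tuple.

```c
#include <assert.h>
#include <stdint.h>

/* A CTID: 6 bytes of payload identifying a heap tuple.  (The real
 * ItemPointerData is packed differently; this is a sketch.) */
typedef struct ItemPointer {
    uint32_t block;    /* heap block number */
    uint16_t offset;   /* line pointer within the block */
} ItemPointer;

#define MAX_CTIDS 4    /* arbitrary cap chosen for the sketch */

/* Hypothetical index entry carrying one CTID per row version. */
typedef struct IndexEntry {
    int nkey;                     /* indexed value (stand-in) */
    int nctids;                   /* how many versions are recorded */
    ItemPointer ctids[MAX_CTIDS]; /* CTIDs of the row's versions */
} IndexEntry;

/* Record a new row version against an existing entry: costs six more
 * bytes of payload instead of a whole new index tuple. */
static int
add_version(IndexEntry *e, uint32_t block, uint16_t offset)
{
    if (e->nctids >= MAX_CTIDS)
        return 0;      /* full: would need a fresh entry after all */
    e->ctids[e->nctids].block = block;
    e->ctids[e->nctids].offset = offset;
    e->nctids++;
    return 1;
}
```

One design consequence worth noting: because every version still has its own CTID in the index, a bitmap index scan keeps working, unlike the chained-versions scheme criticized above.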