On 18.09.2013 22:55, Jeff Janes wrote:
> On Mon, Sep 16, 2013 at 6:59 AM, Heikki Linnakangas <hlinnakan...@vmware.com> wrote:
>> Here's a rebased version of the patch, including the above-mentioned
>> fixes. Nothing else new.
>
> I've applied this to 0892ecbc015930d, the last commit to which it applies
> cleanly.
>
> When I test this by repeatedly incrementing a counter in a randomly chosen
> row, then querying the whole table and comparing the results to what my
> driver knows they should be, I get discrepancies.

Ok, I found the bug. When a HOT chain began with a dead tuple and the page was frozen, the dead tuple was not removed, but the xmin of the live tuple in the chain was replaced with FrozenXid. That breaks the HOT-chain following code, which checks that the xmin of the next tuple in the chain matches the xmax of the previous tuple.
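To make the failure concrete, here is a toy Python model of the chain-following check. It is only a sketch, not the actual PostgreSQL C code: the `Tup` class, the `follow_hot_chain` function, and the use of 2 as a stand-in for FrozenXid are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import List, Optional

FROZEN_XID = 2  # assumed stand-in for FrozenXid in this toy model


@dataclass
class Tup:
    xmin: int                 # XID that created this tuple version
    xmax: Optional[int]       # XID that updated/deleted it, None if still current
    next: Optional[int]       # index of the next chain member on the page, if any
    dead: bool = False        # True if this version is dead to everyone


def follow_hot_chain(page: List[Tup], start: int) -> Optional[int]:
    """Return the index of the first non-dead tuple in the chain,
    or None if the chain ends or the xmin/xmax link check fails."""
    prev_xmax: Optional[int] = None
    idx: Optional[int] = start
    while idx is not None:
        tup = page[idx]
        # The link check: the next tuple's xmin must equal the previous xmax.
        if prev_xmax is not None and tup.xmin != prev_xmax:
            return None  # chain looks broken, stop following it
        if not tup.dead:
            return idx
        prev_xmax = tup.xmax
        idx = tup.next
    return None


# A chain whose first member is dead and whose second member is live.
page = [Tup(xmin=100, xmax=101, next=1, dead=True),
        Tup(xmin=101, xmax=None, next=None)]
assert follow_hot_chain(page, 0) == 1

# Freezing only the live tuple overwrites its xmin, so the link check fails.
page[1].xmin = FROZEN_XID
assert follow_hot_chain(page, 0) is None
```

In this model the live tuple becomes unreachable through the chain once its xmin is overwritten while the dead tuple ahead of it is left in place, which is the kind of lost row the counter test would detect.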

I fixed that by simply not freezing a page which contains any dead tuples. That's OK because the page will be visited by vacuum before it becomes old enough to be "mature". However, it means that the invariant that a page can only contain XIDs within one XID-LSN range, determined by the LSN, is no longer true. AFAICS everything still works, but I would feel more comfortable if we could uphold that invariant, for debugging reasons if nothing else. Will have to give that some more thought.
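The rule can be sketched in a few lines of Python. This is a simplified model under stated assumptions, not the patch itself: `Tup`, `try_freeze_page`, and the value 2 standing in for FrozenXid are hypothetical names chosen for illustration, and the real code works on heap pages in C.

```python
from dataclasses import dataclass
from typing import List

FROZEN_XID = 2  # assumed stand-in for FrozenXid in this toy model


@dataclass
class Tup:
    xmin: int            # XID that created this tuple version
    dead: bool = False   # True if the tuple is dead and awaiting vacuum


def try_freeze_page(page: List[Tup]) -> bool:
    """Freeze every tuple on the page, unless the page holds a dead tuple.

    Returns True if the page was frozen, False if it was skipped.
    """
    if any(t.dead for t in page):
        # Leave the whole page untouched; vacuum is expected to visit it later.
        return False
    for t in page:
        t.xmin = FROZEN_XID
    return True


clean = [Tup(xmin=100), Tup(xmin=105)]
assert try_freeze_page(clean) is True
assert all(t.xmin == FROZEN_XID for t in clean)

with_dead = [Tup(xmin=100, dead=True), Tup(xmin=101)]
assert try_freeze_page(with_dead) is False
assert [t.xmin for t in with_dead] == [100, 101]  # old XIDs remain on the page
```

The second case shows the cost mentioned above: a skipped page keeps its old XIDs, so it may fall outside the XID range that its LSN would otherwise imply.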

Thanks for the testing! I just posted an updated version of the patch elsewhere in this thread.

- Heikki


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
