On 06/24/2014 01:27 AM, Andres Freund wrote:
On 2014-06-23 13:00:11 -0700, Jeff Davis wrote:
On Fri, 2014-06-20 at 09:10 +0000, Andres Freund wrote:
To fix, simply move all the all-visible handling outside the critical
section. Doing so means that the PD_ALL_VISIBLE on the page won't be
included in the full page image of the HEAP2_CLEAN record anymore. But
that's fine, the flag will be set by the HEAP2_VISIBLE logged later.
Trying to follow along. I read the discussion about bug #10533.
Before 2a8e1ac5, a block of actions -- visibility test, VM test, set VM
bit, set all-visible -- were inside the critical section and after the
WAL insert.
2a8e1ac5 left the block of actions in the critical section but moved
them before the WAL insert. The commit message seems to indicate that
there's a real problem setting the all-visible flag after the WAL
insert, but I couldn't identify a serious problem there.
Yes, that was confusing. Imo it made the state worse, not better. I'd
asked for clarification somewhere... Heikki?
Hmm. I don't see any live bug caused before 2a8e1ac5 anymore either. I
think I missed the fact that replaying the XLOG_HEAP2_VISIBLE record
always sets the bit in the heap page.
Does your change still make
sense to you and do you see problem with the current state (as of ecac0e2b)?
Hmm, in the current state, it's again possible that the full-page image
doesn't contain the all-visible flag, even though the page in the buffer
does. I guess that's OK, because replaying XLOG_HEAP2_VISIBLE always
sets the flag. My page-comparison tool will complain, so it would be
nice to not do that, but it's a false alarm.
Or we could call heap_page_is_all_visible() before entering the critical
section, if we passed heap_page_is_all_visible() the list of dead tuples
we're removing so that it could ignore them.
I'm a little concerned that we're approaching the WAL rules in an ad-hoc
manner. I don't see an actual problem right now, but we've been fixing
problems with PD_ALL_VISIBLE since the VM was made crash safe almost
exactly three years ago (503c7305). Do we finally feel confident in the
design and its implications?
I'm not particularly happy about all this either, but this section of
code is actually much newer. It was added in fdf9e21196a6/9.3.
I'm not happy with the way we deviate from the normal WAL rules with in
the visibility map either. In the long run, we may be better off to just
take the hit and WAL-log the VM bits like any other update. Basically,
always behave as if checksums were enabled. But for now we just have to
fix bugs as they're found.
- Heikki
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers