On Thu, Mar 10, 2016 at 9:18 AM, Tomas Vondra <tomas.von...@2ndquadrant.com> wrote: > I've looked at this patch today, mostly to educate myself, so this > probably should not count as a full review. Anyway, the patch seems in > excellent shape - it'd be great if all patches (including those written > by me) had this level of comments/docs.
Thanks. I try. This patch is something I've been working on slowly and steadily for over 2 years. It was time to see it through. >> Note that no function currently checks that the index is consistent >> with the heap, which would be very useful (that's probably how I'd >> eventually target the heapam, actually). > > I'm afraid I don't understand what "target the heapam" means. Could you > elaborate? Just that that's how I'd make amcheck verify that the heap was sane, if I was going to undertake that project. Or, I'd start there. > I agree. It'd be nice to have a tool that also checks for data > corruption the a lower level (e.g. that varlena headers are not > corrupted etc.) but that seems like a task for some other tool. I'm not sure that that's a task for another tool. I think that this tool is scoped at detecting corruption, and that could work well for the heap, too. There might be fewer interesting things to test about the heap when indexes aren't involved. Following HOT chains through an index, and verifying things like that about the heap as we go could detect a lot of problematic cases. > Can we come up with names that more clearly identify the difference > between those two functions? I mean, _parent_ does not make it > particularly obvious that the second function acquires exclusive lock > and performs more thorough checks. Dunno about that. It's defining characteristic is that it checks child pages against their parent IMV. Things are not often defined in terms of their locking requirements. > This also applies to the name of the contrib module - it's not > particularly obvious what "amcheck" unless you're familiar with the > concept of access methods. Which is quite unlikely for regular users. > Maybe something like "check_index" would be more illustrative? I think that given the heap may be targeted in the future, amcheck is more future-proof. I think Robert might have liked that name a year ago or something. In general, I'm not too worried about what the contrib module is called in the end. > Well, I wouldn't count myself as an expert here, but do we actually need > such protection? I mean, we only do such checks when holding an > exclusive lock on the parent relation, no? And even if we don't vacuum > can only remove entries from the pages - why should that cause > violations of any invariants? I think that a more worked out explanation for why the ExclusiveLock is needed is appropriate. I meant to do that. Basically, a heavier lock is needed because of page deletion by VACUUM, which is the major source of complexity (much more than page splits, I'd say). In general, the key space can be consolidated by VACUUM in a way that breaks child page checking because the downlink we followed from our target page is no longer the current downlink. Page deletion deletes the right sibling's downlink, and the deleted page's downlink is used for its sibling. You could have a race, where there was a concurrent page deletion of the left sibling of the child page, then a concurrent insertion into the newly expanded keyspace of the parent. Therefore, the downlink in the parent (which is the "target", to use the patch's terminology) would not be a lower bound on items in the page. That's a very low probability race, because it involves deletion a page representing a range in the keyspace that has no live items, but then immediately getting one just in time for our check. But, I'm pretty determined that false positives like that need to be impossible (or else it's a bug). I have a paranoid feeling that there is a similar very low probability race with the left-right keyspace checks, which don't have relation ExclusiveLock protection (IOW, I think that that might be buggy). I need to think about that some more, but current thinking is that it would hardly matter if we used the highkey from right page rather than the first data item, which definitely would be correct. And it would be simpler. > A few minor comments: Thanks for catching those issues. Will fix. -- Peter Geoghegan -- Sent via pgsql-hackers mailing list (firstname.lastname@example.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers