On Tue, Jan 1, 2019 at 2:11 AM Andrey Borodin <[email protected]> wrote:
> > This isn't consistent with what you do within verify_nbtree.c, which
> > deliberately avoids ever holding more than a single buffer lock at a
> > time, on general principle. That isn't necessarily a reason why you
> > have to do the same, but it's not clear why you do things that way.
> > Why isn't it enough to have a ShareLock on the relation? Maybe this is
> > a sign that it would be a good idea to always operate on a palloc()'d
> > copy of the page, by introducing something equivalent to
> > palloc_btree_page(). (That would also be an opportunity to do very
> > basic checks on every page.)
> If we unlock parent page before checking child, some insert can adjust tuple
> on parent, sneak into child and insert new tuple.
> This can trigger false positive. I'll think about it more.
I think that holding a buffer lock on an internal pages for as long as
it takes to check all of the child pages is a non-starter. If you
can't think of a way of not doing that that's race free with a
relation-level AccessShareLock, then a relation-level ShareLock (which
will block VACUUM) seems necessary.
I think that you should duplicate some of what's in
bt_index_check_internal() -- lock the heap table as well as the index,
to avoid deadlocks. We might not do this with contrib/pageinspect, but
that's not really intended to be used in production in the same way
this will be.
> > * Maybe gist_index_check() should be called gist_index_parent_check(),
> > since it's rather like the existing verification function
> > bt_index_parent_check().
> >
> > * Alternatively, you could find a way to make your function only need
> > an AccessShareLock -- that would make gist_index_check() an
> > appropriate name. That would probably require careful thought about
> > VACUUM.
>
> I've changed lock level to AccessShareLock. IMV scan is just as safe as
> regular GiST index scan.
> There is my patch with VACUUM on CF, it can add deleted pages. I'll update
> one of these two patches accordingly, if other is committed.
Maybe you should have renamed it to gist_index_parent_check() instead,
and kept the ShareLock. I don't think that this business with buffer
locks is acceptable. And it is mostly checking parent/child
relationships. Right?
You're not really able to check GiST tuples against anything other
than their parent, unlike with B-Tree (you can only do very simple
things, like the IndexTupleSize()/lp_len cross-check). Naming the
function gist_index_parent_check() seems like the right way to go,
even if you could get away with an AccessShareLock (which I now tend
to doubt). It was way too optimistic to suppose that there might be a
clever way of locking down race conditions that allowed you to not
couple buffer locks, but also use an AccessShareLock. After all, even
the B-Tree amcheck code doesn't manage to do this when verifying
parent/child relationships (it only does something clever with the
sibling page cross-check).
> > * Why is it okay to do this?:
> >
> >> + if (GistTupleIsInvalid(idxtuple))
> >> + ereport(LOG,
> >> + (errmsg("index \"%s\" contains an inner tuple marked
> >> as invalid",
> >> + RelationGetRelationName(rel)),
> >> + errdetail("This is caused by an incomplete page split
> >> at crash recovery before upgrading to PostgreSQL 9.1."),
> >> + errhint("Please REINDEX it.")));
> >
> > You should probably mention the gistdoinsert() precedent for this.
> This invalid tuple will break inserts, but will not affect select. I do not
> know, should this be error or warning in amcheck?
It should be an error. Breaking inserts is a serious problem.
--
Peter Geoghegan