Re: new heapcheck contrib module

Peter Geoghegan Wed, 13 May 2020 15:30:25 -0700

On Wed, May 13, 2020 at 3:10 PM Alvaro Herrera <[email protected]> wrote:
> Hmm.  I think we should (try to?) write code that avoids all crashes
> with production builds, but not extend that to assertion failures.


Assertions are only a problem at all because Mark would like to write
tests that involve a selection of truly corrupt data. That's a new
requirement, and one that I have my doubts about.

> > I'll stick with your example. You're calling
> > TransactionIdDidCommit() from check_tuphdr_xids(), which will
> > interrogate the commit log and pg_subtrans. It's just not under your
> > control.
>
> in a production build this would just fail with an error that the
> pg_xact file cannot be found, which is fine -- if this happens in a
> production system, you're not disturbing any other sessions.  Or maybe
> the file is there and the byte can be read, in which case you would get
> the correct response; but that's fine too.

I think that this is fine, too, since I don't consider assertion
failures with corrupt data all that important. I'd make some effort to
avoid them, but not too much, and not at the expense of a useful general
purpose assertion that could catch bugs in many different contexts.

I would be willing to make a larger effort to avoid crashing a
backend, since that affects production. I might go to some effort to
not crash with downright adversarial inputs, for example. But it seems
inappropriate to take extreme measures just to avoid a crash with
extremely contrived inputs that will probably never occur. My sense is
that this is subject to sharply diminishing returns. Completely
nailing down hard crashes from corrupt data seems like the wrong
priority, at the very least. Pursuing that objective over other
objectives sounds like zero-risk bias.

-- 
Peter Geoghegan
