On Fri, Aug 11, 2023 at 11:23 AM Andres Freund <and...@anarazel.de> wrote:
> > Couldn't you say the same thing about defensive "can't happen" ERRORs?
> > They are essentially a form of assertion that isn't limited to
> > assert-enabled builds.
>
> Yes. A lot of them I hate them with the passion of a thousand suns ;). "Oh,
> our transaction state machinery is confused. Yes, let's just continue going
> through the same machinery again, that'll resolve it.".

I am not unsympathetic to Ashutosh's point about conventional ERRORs
being easier to deal with when debugging your own code, during initial
development work. But that seems like a problem with the tooling in
other areas.

For example, dealing with core dumps left behind by the regression
tests can be annoying. Don't you also hate it when there's a
regression.diffs that just shows 20k lines of subtractions? Perhaps
you don't -- perhaps your custom setup makes it quick and easy to get
relevant information about what actually went wrong. But it seems like
that sort of thing could be easier to deal with by default, without
using custom shell scripts or anything -- particularly for those of us
that haven't been Postgres hackers for eons.

> There've been people arguing in the past that it's good for robustness to do
> stuff like that. I think it's exactly the opposite.
>
> Now, don't get me wrong, there are plenty cases where a "this shouldn't
> happen" elog(ERROR) is appropriate...

Right. They're not bad -- they just need to be backed up by some kind
of reasoning, which will be particular to each case. The default
approach should be to crash whenever any invariants are violated,
because all bets are off at that point.

> What are you imagining? Basically something that generates an elog(ERROR) with
> the stringified expression as part of the error message?

I'd probably start with a new elevel, that implied an assertion
failure in assert-enabled builds but otherwise acted just like ERROR.
I remember multiple cases where I added an "Assert(false)" right after
a "can't happen" error, which isn't a great approach.

It might also be useful to offer something along the lines you've
described, which I was sort of thinking of myself. But now that I've
thought about it a little more, I think that such an approach might
end up being overused. If you're going to add what amounts to a "can't
happen" ERROR then you should really be obligated to write a halfway
reasonable error message. As I said, you should have to own the fact
that you think it's better to not just crash for this one particular
callsite, based on some fairly specific chain of reasoning. Ideally
it'll be backed up by practical experience/user reports.

-- 
Peter Geoghegan


Reply via email to