On 21 June 2018 at 01:15, Andres Freund <and...@anarazel.de> wrote:

> On 2018-06-20 13:10:57 -0400, Robert Haas wrote:
> > On Wed, Jun 20, 2018 at 12:10 PM, Andres Freund <and...@anarazel.de>
> wrote:
> > > If we instead had a backtrace enabled for all PANICs and some FATALs by
> > > default (and perhaps a SIGSEGV handler too), we'd be in a better
> > > place. That'd obviously only work when compiled with support for
> > > libraries, on platforms where we added support for that. But I think
> > > that'd be quite the improvement already.
> >
> > I think doing it on PANIC would be phenomenal.  SIGSEGV would be great
> > if we can make it safe enough, which I'm not sure about, but then I
> > suppose we're crashing anyway.
>
> Yea, I think that's pretty much why it'd be ok.


Yep. At worst we crash again while trying to generate a bt. We're not doing
anything particularly exciting, and it should be sensible enough.

> > Instead of making the ERROR behavior conditional on
> > log_error_verbosity as Craig has it now, how about doing it whenever
> > the error code is ERRCODE_INTERNAL_ERROR?  That's pretty much the
> > cases that aren't supposed to happen, so if we see those happening a
> > lot, it's either a bug we need to fix or we should supply a better
> > error code.  Also, a lot of those messages are duplicated in many
> > places and/or occur inside fairly generic functions inside
> > lsyscache.c, so the actual error message text tends not to be enough
> > to know what happened.
>
> I don't think that's ok. It's perfectly possible to hit
> ERRCODE_INTERNAL_ERROR at a high frequency in some situations, and
> there's plenty of cases that aren't ERRCODE_INTERNAL_ERROR where we'd want
> this.


Perhaps we should fix those, but it might be a game of whack-a-mole as the
code changes, and inevitably you'll want to generate stacks for some other
errcode while getting frustrated at all the ERRCODE_INTERNAL_ERROR. Not
sure it's worth it.

However, maybe a GUC like

log_error_stacks_errcodes = 'XX000, 55000'

would work. It'd be somewhat expensive to evaluate, but we'd only be doing
it where we'd already decided to emit an error. And it'd fit in even if we
later added smarter selective logging.
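
To make the cost concrete, here is a rough sketch of the kind of check such a
GUC would need at error time. It is standalone C, not code from the patch:
sqlstate_in_list() is a hypothetical helper, and it assumes the setting is
kept as the raw comma-separated string and the error's SQLSTATE is available
as five characters. A real implementation would more likely parse the list
once when the GUC is assigned and compare the packed integer codes.

```c
#include <ctype.h>
#include <stdbool.h>
#include <string.h>

/*
 * Hypothetical helper, not part of PostgreSQL: return true if the
 * five-character SQLSTATE appears in a comma-separated list such as
 * "XX000, 55000". Whitespace around entries is ignored.
 */
bool
sqlstate_in_list(const char *list, const char *sqlstate)
{
    const char *p = list;

    if (strlen(sqlstate) != 5)
        return false;

    while (*p != '\0')
    {
        const char *start;
        size_t      len;

        /* Skip separators and surrounding whitespace. */
        while (*p == ',' || isspace((unsigned char) *p))
            p++;

        /* Measure the entry up to the next separator or whitespace. */
        start = p;
        while (*p != '\0' && *p != ',' && !isspace((unsigned char) *p))
            p++;
        len = (size_t) (p - start);

        /* Only an exact five-character match counts. */
        if (len == 5 && strncmp(start, sqlstate, 5) == 0)
            return true;
    }

    return false;
}
```

With the example setting above, sqlstate_in_list("XX000, 55000", "55000") is
true and the same call with "42601" is false. The scan is linear in the length
of the setting, which is why it only makes sense on the error path.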

BTW, it's worth noting that these backtraces are very limited. They don't
report arguments or locals. So it's still no substitute for suitable
errcontext callbacks, and sometimes it's still necessary to fall back to
gdb or messing around with perf userspace tracepoints.
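
To illustrate the limitation, here is a minimal sketch of a stack dump using
backtrace() and backtrace_symbols() from <execinfo.h> (glibc, also available
on some other platforms). That choice of API is an assumption made for the
example, and print_stack() is a hypothetical name; the patch under discussion
may well use a different unwinding library. Either way, all you get back is a
list of return addresses, resolved to symbol names where possible.

```c
#include <execinfo.h>
#include <stdio.h>
#include <stdlib.h>

#define MAX_FRAMES 32

/*
 * Hypothetical helper: write the current call stack to 'out', one frame
 * per line, and return the number of frames captured (0 on failure).
 */
int
print_stack(FILE *out)
{
    void   *frames[MAX_FRAMES];
    int     nframes = backtrace(frames, MAX_FRAMES);
    char  **symbols = backtrace_symbols(frames, nframes);
    int     i;

    /* backtrace_symbols() allocates with malloc() and can fail. */
    if (symbols == NULL)
        return 0;

    /* Each entry is an address plus a symbol name where one can be found. */
    for (i = 0; i < nframes; i++)
        fprintf(out, "  #%d %s\n", i, symbols[i]);

    free(symbols);
    return nframes;
}
```

Nothing in that output says what arguments a function was called with or what
its locals held. Also note that backtrace_symbols() calls malloc(), so this
exact form is not something to run from a signal handler.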

-- 
 Craig Ringer                   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services
