On Fri, Oct 8, 2021 at 11:40 PM Bruce Momjian <br...@momjian.us> wrote:
>
> On Fri, Oct  8, 2021 at 05:28:37PM +0200, Thomas Kellerer wrote:
> >
> > We typically use the AWR reports as a post-mortem analysis tool if
> > something goes wrong in our application (=customer specific projects)
> >
> > E.g. if there was a slowdown "last monday" or "saving something took
> > minutes yesterday morning", then we usually request an AWR report from
> > the time span in question. Quite frequently this already reveals the
> > culprit. If not, we ask them to poke in more detail into
> > v$session_history.
> >
> > So in our case it's not really used for active monitoring, but for
> > finding the root cause after the fact.
> >
> > I don't know how representative this usage is though.
>
> OK, that's a good use case, and something that certainly would apply to
> Postgres.  Don't you often need more than just wait events to find the
> cause, like system memory usage, total I/O, etc?

You usually need a variety of metrics to be able to find what is
actually causing $random_incident, so the more you can aggregate in
your performance tool the better.  Wait events are an important piece
of that puzzle.

As a quick example for wait events, I recently had to diagnose a
performance issue, which turned out to be a process reaching the
64-subtransaction limit, with the well-known consequences.  I had
pg_wait_sampling aggregated metrics available, so it was really easy to
see that the slowdown was due to that.  Knowing exactly which
application reached those 64 subtransactions is another story.
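
To make "aggregated metrics" a bit more concrete, here is a minimal
Python sketch of that kind of aggregation.  It assumes the samples have
already been fetched as (event_type, event, count) rows, for instance
from the pg_wait_sampling_profile view.  The helper name
top_wait_events and the sample rows are made up for illustration, and
wait event names vary across PostgreSQL versions.

```python
from collections import Counter


def top_wait_events(rows, limit=3):
    """Sum sample counts per (event_type, event), largest first."""
    totals = Counter()
    for event_type, event, count in rows:
        totals[(event_type, event)] += count
    return totals.most_common(limit)


# Hypothetical sample rows, not real measurements.
rows = [
    ("LWLock", "SubtransSLRU", 420),
    ("IO", "DataFileRead", 35),
    ("LWLock", "SubtransSLRU", 380),
    ("Client", "ClientRead", 90),
]

for (event_type, event), count in top_wait_events(rows):
    print(f"{event_type:8} {event:16} {count}")
```

With rows like these, one event dominating the totals is the signal to
look at; it does not say which application caused it.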

