On Tue, Mar 15, 2016 at 12:57 AM, Robert Haas <robertmh...@gmail.com> wrote:

> On Mon, Mar 14, 2016 at 4:42 PM, Andres Freund <and...@anarazel.de> wrote:
> > On 2016-03-14 16:16:43 -0400, Robert Haas wrote:
> >> > I have already shown [0, 1] the overhead of measuring timings in linux on
> >> > representative workload. AFAIK, these tests were the only one that showed
> >> > any numbers. All other statements about terrible performance have been and
> >> > remain unconfirmed.
> >>
> >> Of course, those numbers are substantial regressions which would
> >> likely make it impractical to turn this on on a heavily-loaded
> >> production system.
> >
> > A lot of people operating production systems are fine with trading a <=
> > 10% impact for more insight into the system; especially if that
> > configuration can be changed without a restart.  I know a lot of systems
> > that use pg_stat_statements, track_io_timing = on, etc; just to get
> > that. In fact there's people running perf more or less continuously in
> > production environments; just to get more insight.
> >
> > I think it's important to get as much information out there without
> > performance overhead, so it can be enabled by default. But I don't think
> > it makes sense to not allow features in that cannot be enabled by
> > default, *if* we tried to make them cheap enough beforehand.
> Hmm, OK.  I would have expected you to be on the other side of this
> question, so maybe I'm all wet.  One point I am concerned about is
> that, right now, we have only a handful of types of wait events.  I'm
> very interested in seeing us add more, like I/O and client wait.  So
> any overhead we pay here is likely to eventually be paid in a lot of
> places - thus it had better be extremely small.

OK. Let's start to produce light, not heat.

As I understand it, we have two features that we suspect of introducing
overhead:
1) Recording the parameters of wait events, which requires some kind of
synchronization protocol.
2) Recording the time of wait events, because time measurements might be
expensive on some platforms.
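
To make the second point concrete, here is a rough sketch of how the cost of
reading the clock could be estimated on a given machine. It is only an
illustration in Python, not the method used in the tests cited in this thread:
the function names, the sample count and the assumption of two clock reads per
wait event (one at the start, one at the end) are all hypothetical, and the
interpreter's loop overhead is included in the result, so it is an upper bound
at best. PostgreSQL ships pg_test_timing for measuring this properly.

```python
import time


def timing_overhead_ns(samples=100_000):
    """Estimate the average cost of one clock read, in nanoseconds.

    The measured time includes the Python loop itself, so treat the
    result as an upper bound, not as the raw cost of the system call.
    """
    start = time.perf_counter_ns()
    for _ in range(samples):
        time.perf_counter_ns()  # the call whose cost is being estimated
    elapsed = time.perf_counter_ns() - start
    return elapsed / samples


def relative_overhead(per_call_ns, waits_per_second, reads_per_wait=2):
    """Fraction of one CPU-second spent reading the clock.

    reads_per_wait=2 is an assumption: one read when the wait starts
    and one when it ends.
    """
    return per_call_ns * reads_per_wait * waits_per_second / 1e9
```

For example, relative_overhead(timing_overhead_ns(), 100_000) gives the share
of one CPU-second that a backend hitting 100,000 wait events per second would
spend on clock reads alone, under the assumptions above.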

At the same time, there are machines and workloads where neither of these
features produces measurable overhead.  And we're not talking about toy
databases.  Vladimir is a DBA at Yandex, which is among the top 20 internet
companies in the world by traffic.  They run both of these features in a
production high-load database without noticing any overhead from them.

It would be great progress if we decided that we could add both of these
features, controlled by a GUC (off by default).

If we decide so, then let's start working on this.  First, we should
construct a list of machines and workloads for testing.  No such list will
be comprehensive, but let's find something that would be enough for testing
GUC-controlled, off-by-default features.  Then we can turn our conversation
from theoretical thoughts to particular benchmarks, which would be objective
and convincing to everybody.

Otherwise, let's just add these features to the list of unwanted
functionality and close this question.

Alexander Korotkov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company
