On Sat, Mar 12, 2016 at 2:45 AM, Andres Freund <and...@anarazel.de> wrote:
> On 2016-03-12 02:24:33 +0300, Alexander Korotkov wrote:
> > The idea of individually timing every wait event met criticism
> > because it might have high overhead.
> Right. And that's actually one of the points which I meant with "didn't
> listen to criticism". There've been a lot of examples, on and off list,
> where taking timings triggers significant slowdowns. Yes, in some
> bare-metal environments, with a coherent tsc, the overhead can be
> low. But that doesn't make it ok to have a high overhead on a lot of
> other systems.
> Just claiming that that's not a problem will only lead to your position
> not being taken seriously.
> > This is really so at least for Windows.
> Measuring timing overhead for a simplistic workload on a single system
> doesn't mean that. Try doing such a test on a vmware esx virtualized
> windows machine, on a multi-socket server; in a lot of instances you'll
> see two-three orders of magnitude longer average times; with peaks going
> into 4-5 orders of magnitude. And, as sad as it is, realistically most
> postgres instances will run in virtualized environments.
> > But accessing only current values wouldn't be very useful. We
> > need to gather some statistics anyway. Gathering them by sampling would be
> > both more expensive and less accurate for the majority of systems. This is
> > why I proposed hooks to make platform-dependent extensions possible. Robert
> > rejects hooks because he is "not a big fan of hooks as a way of resolving
> > disagreements about the design".
> I think I agree with Robert here. Providing hooks into very low level
> places tends to lead to problems in my experience; tight control over
> what happens is often important - I certainly don't want any external
> code to run while we're waiting for an lwlock.
So, I get the following.
1) Detailed wait monitoring might cause high overhead on some systems.
2) We want wait monitoring to be always on. And we don't want options to
enable additional features of wait monitoring.
3) We don't want hooks for wait events to be exposed.
Can I conclude that we reject detailed wait monitoring by design?
If that's so, and not only Robert thinks so, then let's just admit it and add
it to the FAQ, etc.
> > Besides that is actually not design issues but platform issues...
> I don't see how that's the case.
> > Another question is wait parameters. We want to expose wait events with
> > some parameters. Robert rejects that because it *might* add additional
> > overhead. When I proposed to fit something useful into the hard-won
> > 4 bytes, Robert claimed that it is "too clever".
> I think stopping to treat this as "Robert/EDB vs. pgpro" would be a good
> first step to make progress here.
> It seems entirely possible to extend the current API in an incremental
> fashion, either allowing to disable the individual pieces, or providing
> sufficient measurements that it's not needed.
> > So, situation looks like dead-end. I have no idea how to convince Robert
> > about any kind of advanced functionality of wait monitoring to
> > I'm thinking about implementing a sampling extension over the current
> > infrastructure just to make the community see that it sucks. Andres, it would
> > be very nice if you have any idea how to move this situation forward.
> I've had my share of conflicts with Robert. But if I were in his shoes,
> targeted by this kind of rhetoric, I'd be very tempted to just ignore
> any further arguments from the origin. So I think the way forward is
> for everyone to cool off, and to see how we can incrementally make
> progress from here on.
> > Another aspect is that EnterpriseDB offers waits monitoring in
> > fork.
So, we'll end up with every company providing a fork with detailed wait
monitoring. While community PostgreSQL resists from providing such
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company