On Fri, Oct 19, 2012 at 11:52 AM, Satoshi Nagayasu <sn...@uptime.jp> wrote:
> I agree that such a performance instrument needs to be improved if it
> has a critical performance impact on production environments. So, I'm
> still looking for a better implementation to decrease the performance
> impact.
Frankly, I think the approach of simply providing an "off" switch is
probably a good one. I admit that it would be nice if we could run with
instrumentation like this all the time - but until very fast time-lookups
become ubiquitous, we can't. Another architectural problem here is that I
believe this will increase the size of the stats file, which I think is
going to cause pain for some people. I suspect that's going to be an issue
even if we have an "off" switch. I think somebody's going to have to
figure out a way to split that file up somehow.

> However, the most important question here is: how can we understand
> PostgreSQL behavior without looking into tons of source code and
> having hacking skills?
>
> Recently, I've been having lots of conversations with database
> specialists (sales guys and DBAs) trying to use PostgreSQL instead of
> a commercial database, and they are always struggling to understand
> PostgreSQL behavior, because no one can observe and/or explain it.

Agreed.

>> Sadly, the situation on Windows doesn't look so good. I don't
>> remember the exact numbers, but I think it was something like 40 or
>> 60 or 80 times slower on the Windows box one of my colleagues tested
>> than it is on Linux. And it turns out that that overhead really is
>> measurable and does matter if you do it in a code path that gets run
>> frequently. Of course I am enough of a Linux geek that I don't use
>> Windows myself and curse my fate when I do have to use it, but the
>> reality is that we have a huge base of users who only use PostgreSQL
>> at all because it runs on Windows, and we can't just throw those
>> people under the bus. I think that older platforms like HP-UX likely
>> have problems in this area as well, although I confess to not having
>> tested.
>
> Do you mean my stat patch should have more performance testing on
> other platforms? Yes, I agree with that.

Yes.

>> That having been said, if we're going to do this, this is probably
>> the right approach, because it only calls gettimeofday() in the case
>> where the lock acquisition is contended, and that is a lot cheaper
>> than calling it in all cases. Maybe it's worth finding a platform
>> where pg_test_timing reports that timing is very slow and then
>> measuring how much impact this has on something like a pgbench or
>> pgbench -S workload. We might find that it is in fact negligible.
>> I'm pretty certain that it will be almost if not entirely negligible
>> on Linux, but that's not really the case we need to worry about.
>
> Thanks for suggesting a better implementation. As I mentioned above,
> I'm always looking for better ideas and solutions to meet our purpose.

Actually, I meant that your existing implementation seemed to be making
some good decisions.

> Here, I want to share my concern with you again. PostgreSQL is getting
> more complicated in order to improve performance and stability, and I
> think that may be good news. But it is also getting more difficult to
> understand without deep knowledge nowadays, and that is bad news.
>
> From my point of view, that's a huge hurdle to educating DBAs and
> expanding the PostgreSQL user base.

Yes, there is definitely more work to be done here.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
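
[Editor's note: to make the contended-path-only timing idea discussed above
concrete, here is a minimal sketch of the pattern. This is not the actual
patch: my_lock, try_acquire_lock(), wait_for_lock(), and record_lock_wait()
are made-up names used purely for illustration, and PostgreSQL's real lock
manager differs in detail. The point is only that the uncontended fast path
never calls gettimeofday(), so the timing cost is paid just when a lock
wait actually occurs.]

    #include <sys/time.h>
    #include <stdbool.h>

    typedef struct my_lock my_lock;                 /* placeholder lock type */

    extern bool try_acquire_lock(my_lock *lock);    /* hypothetical fast path */
    extern void wait_for_lock(my_lock *lock);       /* hypothetical slow path */
    extern void record_lock_wait(my_lock *lock, long usecs);  /* hypothetical stats hook */

    static void
    acquire_lock_with_wait_timing(my_lock *lock)
    {
        struct timeval wait_start, wait_end;

        /* Uncontended fast path: no timing calls at all. */
        if (try_acquire_lock(lock))
            return;

        /* Contended path only: bracket the wait with gettimeofday(). */
        gettimeofday(&wait_start, NULL);
        wait_for_lock(lock);
        gettimeofday(&wait_end, NULL);

        /* Accumulate the elapsed wait, in microseconds, into per-lock stats. */
        record_lock_wait(lock,
                         (wait_end.tv_sec - wait_start.tv_sec) * 1000000L +
                         (wait_end.tv_usec - wait_start.tv_usec));
    }

On a platform where pg_test_timing reports slow timing, comparing pgbench -S
throughput with and without such instrumentation should show whether even
this reduced number of gettimeofday() calls is measurable in practice.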