On Fri, Sep 23, 2016 at 7:17 AM, Tomas Vondra wrote:
> On 09/23/2016 03:20 AM, Robert Haas wrote:
>> On Thu, Sep 22, 2016 at 7:44 PM, Tomas Vondra
>> <tomas.von...@2ndquadrant.com> wrote:
>>> I don't dare to suggest rejecting the patch, but I don't see how
>>> we could commit any of the patches at this point. So perhaps
>>> "returned with feedback" and resubmitting in the next CF (along
>>> with analysis of improved workloads) would be appropriate.
>> I think it would be useful to have some kind of theoretical analysis
>> of how much time we're spending waiting for various locks. So, for
>> example, suppose we do one run of these tests with various client
>> counts - say, 1, 8, 16, 32, 64, 96, 128, 192, 256 - and we run
>> "select wait_event from pg_stat_activity" once per second throughout
>> the test. Then we see how many times we get each wait event,
>> including NULL (no wait event). Now, from this, we can compute the
>> approximate percentage of time we're spending waiting on
>> CLogControlLock and every other lock, too, as well as the percentage
>> of time we're not waiting for any lock. That, it seems to me, would give
>> us a pretty clear idea what the maximum benefit we could hope for
>> from reducing contention on any given lock might be.
> Yeah, I think that might be a good way to analyze the locks in general, not
> just for these patches. A 24h run with per-second samples should give us about
> 86400 samples (well, multiplied by number of clients), which is probably
> good enough.
> We also have LWLOCK_STATS, which might be interesting too, but I'm not sure
> how much it affects the behavior (and AFAIK it also only dumps the data to
> the server log).
Right, I think LWLOCK_STATS gives us the count of how many times we have
blocked on a particular lock, as in the sample below, where *blk* gives that
count:
PID 164692 lwlock main 11: shacq 2734189 exacq 146304 blk 73808
spindelay 73 dequeue self 57241
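Since this output only lands in the server log, a small script could pull the counters out for comparison across runs. The following is a rough sketch, not an existing tool: the field layout is assumed from the single sample line above (other builds may print it differently), and the "blk_ratio" figure (blocks per acquisition) is my own derived number, not something LWLOCK_STATS reports.

```python
import re

# Field layout assumed from the one sample line above; adjust if your
# build prints LWLOCK_STATS differently.
LINE_RE = re.compile(
    r"PID (?P<pid>\d+) lwlock (?P<tranche>\S+) (?P<id>\d+): "
    r"shacq (?P<shacq>\d+) exacq (?P<exacq>\d+) blk (?P<blk>\d+) "
    r"spindelay (?P<spindelay>\d+) dequeue self (?P<dequeue_self>\d+)"
)

def parse_lwlock_stats(line):
    """Parse one LWLOCK_STATS log line into a dict, or return None."""
    m = LINE_RE.search(line)
    if m is None:
        return None
    # Everything except the tranche name is a numeric counter.
    rec = {k: (v if k == "tranche" else int(v)) for k, v in m.groupdict().items()}
    # Hypothetical derived metric: how often an acquisition had to block.
    acquisitions = rec["shacq"] + rec["exacq"]
    rec["blk_ratio"] = rec["blk"] / acquisitions if acquisitions else 0.0
    return rec

# The sample line from above, joined back into a single log line.
sample = ("PID 164692 lwlock main 11: shacq 2734189 exacq 146304 blk 73808 "
          "spindelay 73 dequeue self 57241")
rec = parse_lwlock_stats(sample)
print(rec["tranche"], rec["id"], rec["blk"], f"{100 * rec['blk_ratio']:.2f}%")
```

Summing these records per lock over all backends would give a per-lock block count to compare between patched and unpatched runs.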
I think doing some experiments with both techniques can help us to
take a call on these patches.
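For the sampling technique, the arithmetic on the collected samples is simple. Here is a minimal sketch, assuming the once-per-second "select wait_event from pg_stat_activity" results have already been gathered into a flat list, with None standing for a NULL wait_event. The function name and the sample values are made up for illustration; they are not measured data.

```python
from collections import Counter

def wait_event_shares(samples):
    """Return {wait_event: percent of samples}; None means 'not waiting'."""
    counts = Counter("(not waiting)" if s is None else s for s in samples)
    total = sum(counts.values())
    if total == 0:
        return {}
    # most_common() orders the result from the most to least frequent event.
    return {event: 100.0 * n / total for event, n in counts.most_common()}

# Made-up samples, purely illustrative (not from any real run).
samples = ["CLogControlLock"] * 3 + ["ProcArrayLock"] * 1 + [None] * 6
shares = wait_event_shares(samples)
for event, pct in shares.items():
    print(f"{event:20s} {pct:5.1f}%")
```

The share for a given lock is then a rough upper bound on what removing contention on that lock alone could buy us, which is the number Robert is asking for.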
Do we want these experiments on different kernel versions, are we
okay with the current version on cthulhu (3.10), or do we want to
consider only the results with the latest kernel?
>> Now, we could also try that experiment with various patches. If we
>> can show that some patch reduces CLogControlLock contention without
>> increasing TPS, it might still be worth committing for that
>> reason. Otherwise, you could have a chicken-and-egg problem. If
>> reducing contention on A doesn't help TPS because of lock B and
>> vice versa, then does that mean we can never commit any patch to
>> reduce contention on either lock? Hopefully not. But I agree with you
>> that there's certainly not enough evidence to commit any of these
>> patches now. To my mind, these numbers aren't convincing.
> Yes, the chicken-and-egg problem is why the tests were done with unlogged
> tables (to work around the WAL lock).
Yeah, but I suspect there was still an impact due to ProcArrayLock.
Sent via pgsql-hackers mailing list (email@example.com)