Re: [HACKERS] [PATCH] pgbench --throttle (submission 7 - with lag measurement)

Greg Smith Wed, 17 Jul 2013 19:59:04 -0700

On 7/17/13 9:16 PM, Tatsuo Ishii wrote:

Now suppose we have 3 transactions and each has following values:


d(0) = 10
d(1) = 20
d(2) = 30

t(0) = 100
t(1) = 110
t(2) = 120

That says pgbench expects the duration 10 for each
transaction. Actually, the first transaction runs slowly for some
reason and the lag = 100 - 10 = 90. However, tx(1) and tx(2) are
finished on schedule because they spend only 10 (110-100 = 10, 120-110
= 10). So the expected average lag would be 90/3 = 30.

The clients are not serialized here in any significant way, even when
they share a single process/thread. I did many rounds of tracing
through this code with timestamps on each line, and the sequence of
events here will look like this:


client 0:  send "SELECT..." to server.  yield to next client.
client 1:  send "SELECT..." to server.  yield to next client.
client 2:  send "SELECT..." to server.  yield to next client.
select():  wait for the first response from any client.
client 0:  receive response.  complete transaction, compute lag.
client 1:  receive response.  complete transaction, compute lag.
client 2:  receive response.  complete transaction, compute lag.

There is nothing here that is queuing the clients one after the other.
If (0) takes 100ms before its reply comes back, (1) and (2) can receive
their reply back and continue forward at any time. They are not waiting
for (0); it has yielded control while waiting for a response. All three
times are independent once you reach the select() point where all are
active.
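
The sequence above can be imitated outside pgbench. What follows is only
a rough sketch in Python, not pgbench's actual C code: fake_server,
run_clients and the delay values are hypothetical names and numbers made
up for illustration. Each "client" sends its query and moves on, then a
single select-style wait collects replies in whatever order they arrive.

```python
import selectors
import socket
import threading
import time


def fake_server(conn, delay):
    """Hypothetical stand-in for a backend: read one query, reply after `delay` seconds."""
    conn.recv(1024)
    time.sleep(delay)
    conn.sendall(b"ok")
    conn.close()


def run_clients(delays):
    """Send one query per client, then wait on all of them at once.

    Returns the client ids in the order their replies arrived.
    """
    sel = selectors.DefaultSelector()
    threads = []
    for client_id, delay in enumerate(delays):
        client_end, server_end = socket.socketpair()
        t = threading.Thread(target=fake_server, args=(server_end, delay))
        t.start()
        threads.append(t)
        client_end.sendall(b"SELECT 1")  # send, then yield to the next client
        sel.register(client_end, selectors.EVENT_READ, client_id)

    finished = []
    while len(finished) < len(delays):
        for key, _ in sel.select():  # wait for a reply from any client
            key.fileobj.recv(1024)
            finished.append(key.data)
            sel.unregister(key.fileobj)
            key.fileobj.close()
    for t in threads:
        t.join()
    sel.close()
    return finished


if __name__ == "__main__":
    # Client 0 is the slow one; clients 1 and 2 do not wait behind it.
    print(run_clients([0.3, 0.05, 0.1]))
```

With these made-up delays, client 0's reply is the last to arrive even
though its query was sent first, which is the point: nothing forces
(1) and (2) to wait behind (0).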

In this situation, if the server gets stuck doing something such that it
takes 100ms before any client receives a response, it is correct to
penalize every client for that latency. All three clients could have
received the information earlier if the server had any to send them. If
they did not, they all were suffering from some sort of lag.
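
To put numbers on the two readings, here is a small sketch using the
values from the quoted example. It assumes lag means actual time minus
scheduled time, which is an assumption made for illustration and not a
restatement of the patch's code; the function names are hypothetical.

```python
# Numbers from the quoted example. The lag definition used here
# (actual time minus scheduled time) is an assumption for illustration.
scheduled = [10, 20, 30]
actual = [100, 110, 120]


def independent_lags(scheduled, actual):
    """Every client is charged its own full delay against its schedule."""
    return [a - s for s, a in zip(scheduled, actual)]


def serialized_lags(scheduled, actual):
    """The quoted reading: delay already charged to an earlier transaction is not counted again."""
    lags = []
    charged = 0
    for s, a in zip(scheduled, actual):
        lag = (a - s) - charged
        lags.append(lag)
        charged += lag
    return lags


def mean(values):
    return sum(values) / len(values)


if __name__ == "__main__":
    print(independent_lags(scheduled, actual), mean(independent_lags(scheduled, actual)))
    print(serialized_lags(scheduled, actual), mean(serialized_lags(scheduled, actual)))
```

Under the serialized reading the lags are 90, 0, 0 and the average is
30, as in the quoted message. If every client is charged for the stall,
the lags are 90, 90, 90 and the average is 90.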

I'm not even sure why you spaced the start times out at intervals of 10.
If I were constructing an example like this, I'd have them start at
times of 0, 1, 2--as fast as the CPU can fire off statements
basically--and then start waiting from that point. If client 1 takes 10
units of time to send its query out before client 2 runs, and the rate
goal requires 10 units of time, the rate you're asking for is impossible.

For sorting out what's going on with your two systems, I would recommend
turning on debugging output with "-d" and looking at the new
per-transaction latency numbers that the feature reports. If your
theory that the lag is going up as the test proceeds is true, that
should show up in the individual latency numbers too.
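
One way to check for that kind of growth, once the per-transaction
numbers have been pulled out of the output by whatever means, is to fit
a straight line through them and look at the slope. This is a sketch
only: latency_slope is a hypothetical helper, the sample values are made
up, and parsing the debug output is not shown because its format is not
described here.

```python
def latency_slope(latencies):
    """Least-squares slope of latency against transaction index.

    A clearly positive slope suggests latency grows as the test proceeds;
    a slope near zero suggests it does not.
    """
    n = len(latencies)
    if n < 2:
        raise ValueError("need at least two samples")
    mean_x = (n - 1) / 2
    mean_y = sum(latencies) / n
    cov = sum((i - mean_x) * (y - mean_y) for i, y in enumerate(latencies))
    var = sum((i - mean_x) ** 2 for i in range(n))
    return cov / var


if __name__ == "__main__":
    # Made-up sample values, in transaction order.
    print(latency_slope([12.0, 11.5, 12.2, 11.9, 12.1]))
```

Comparing the mean of the first half of the run against the second half
would answer the same question with less arithmetic.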

Based on what I saw during weeks of testing here, I would be more
suspicious that there's a system level difference between your two
servers than to blame the latency calculation. I saw a *lot* of weird
system issues myself when I started looking that carefully at sustained
throughput. The latency reports from the perspective of Fabien's code
were always reasonable though. When something delays every client, it
counts that against every active client's lag, and that's the right
thing to do.


--
Greg Smith   2ndQuadrant US    g...@2ndquadrant.com   Baltimore, MD
PostgreSQL Training, Services, and 24x7 Support www.2ndQuadrant.com


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
