On 03/17/2016 10:14 PM, Fabien COELHO wrote:

I would have suggested using the --latency-limit option to filter out
very slow queries, otherwise if the system is stuck it may catch up
later, but then this is not representative of "sustainable" performance.

When pgbench is running under a target rate, in both runs the
transaction distribution is expected to be the same, around 5000 tps,
and the green run looks pretty ok with respect to that. The magenta one
shows that about 25% of the time, things are not good at all, and the
higher figures just show the catching up, which is not really
interesting if you asked for a web page and it is finally delivered 1
minute later.

Maybe. But that'd only increase the stress on the system, possibly
causing more issues, no? And the magenta line is the old code, thus it
would only increase the improvement of the new code.

Yes and no. I agree that it stresses the system a little more, but
the fact that you have 5000 tps in the end does not show that you can
really sustain 5000 tps with reasonable latency. I find this latter
information more interesting than knowing that you can get 5000 tps
on average, thanks to some catching up. Moreover the non throttled
runs already showed that the system could do 8000 tps, so the
bandwidth is already there.

Sure, but thanks to the tps charts we *do know* that for the vast majority of the intervals (each second) the number of completed transactions is very close to 5000. And that wouldn't be possible if a large part of the latencies were close to the maximums.

With 5000 tps and 32 clients, that means the average latency should be less than 6ms, otherwise the clients couldn't make ~160 tps each. But we do see that the maximum latency for most intervals is way higher. Only ~10% of the intervals have max latency below 10ms, for example.
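To spell out that arithmetic, here is a quick sketch in Python. It assumes each client runs one transaction at a time and ignores the time a client spends sleeping for the throttle, so the result is only an upper bound on the average latency:

```python
# Back-of-the-envelope check, using the numbers quoted above.
target_tps = 5000   # throttled rate requested from pgbench
clients = 32        # concurrent pgbench clients

# Each client has to complete this many transactions per second.
per_client_tps = target_tps / clients

# Assumption: one transaction in flight per client, no idle time,
# so the average latency cannot exceed 1 / per_client_tps.
max_avg_latency_ms = 1000.0 / per_client_tps

print(f"{per_client_tps:.2f} tps per client")
print(f"average latency must stay below {max_avg_latency_ms:.1f} ms")
```

That gives roughly 156 tps per client and a bound of about 6.4 ms, consistent with the "~160 tps" and "less than 6ms" figures above.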

Notice the max latency is in microseconds (as logged by pgbench),
so according to the "max latency" charts the latencies are below
10 seconds (old) and 1 second (new) about 99% of the time.

AFAICS, the max latency is aggregated by second, but then it does
not say much about the distribution of individual latencies in the
interval, that is whether they were all close to the max or not.
Having the same chart with median or average might help. Also, with
the stddev chart, the percentages do not correspond with those of the
latency one, so it may be that the latency is high but the stddev is
low, i.e. all transactions are equally bad on the interval, or not.
So I must admit that I'm not clear at all how to interpret the max
latency & stddev charts you provided.
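A small illustration of that point, with made-up latencies (in ms) for two one-second intervals. The values are hypothetical and not taken from the benchmark; they only show that the same maximum can hide very different distributions:

```python
import statistics

# Hypothetical latencies (ms) for two intervals with the same maximum.
one_outlier = [2, 2, 2, 2, 900]            # a single slow transaction
all_slow = [880, 890, 895, 900, 900]       # every transaction is slow

for name, sample in (("one_outlier", one_outlier), ("all_slow", all_slow)):
    print(
        name,
        "max:", max(sample),
        "median:", statistics.median(sample),
        "stddev:", round(statistics.pstdev(sample), 1),
    )
```

Both intervals report max = 900, but the medians are 2 and 895, and the interval where everything is slow is the one with the low stddev.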

You're right, those charts describe distributions of the aggregated metrics, not of the individual latencies. And it's not particularly simple to deduce information about the source statistics, for example because all the intervals have the same "weight" although the number of transactions that completed in each interval may be different.
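To make the "weight" issue concrete, a minimal sketch with invented per-second aggregates (the tuples below are hypothetical, not benchmark data). Averaging the per-interval averages treats a stalled second with few transactions the same as a busy one:

```python
# Hypothetical per-second aggregates:
# (transactions completed, average latency in ms).
intervals = [(5000, 2.0), (5000, 2.0), (500, 200.0), (9500, 3.0)]

# Every interval counts equally, regardless of how many transactions it saw.
unweighted = sum(avg for _, avg in intervals) / len(intervals)

# Weighting by transaction count recovers the true per-transaction average.
total_tx = sum(n for n, _ in intervals)
weighted = sum(n * avg for n, avg in intervals) / total_tx

print(f"mean of interval averages: {unweighted:.3f} ms")
print(f"per-transaction average:   {weighted:.3f} ms")
```

With these numbers the two results are 51.75 ms and 7.425 ms, so statements about intervals and statements about transactions have to be kept apart.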

But I do think it's a very useful tool when it comes to measuring the consistency of behavior over time, assuming you're asking questions about the intervals and not the original transactions.

For example, had there been intervals with vastly different transaction rates, we'd see that on the tps charts (i.e. the chart would be much more gradual or wobbly, just like the "unpatched" one). Or if there were intervals with much higher variance of latencies, we'd see that on the STDDEV chart.

I'll consider repeating the benchmark and logging some reasonable sample of transactions - for the 24h run the unthrottled benchmark did ~670M transactions. Assuming ~30B per line, that's ~20GB, so a 5% sample should be ~1GB of data, which I think is enough.
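The size estimate, written out. The ~30 bytes per log line is the assumption stated above, not a measured value:

```python
# Rough size of a per-transaction log for the 24h unthrottled run.
transactions = 670_000_000   # ~670M transactions
bytes_per_line = 30          # assumed average log line length
sample_fraction = 0.05       # log 5% of the transactions

full_log_gb = transactions * bytes_per_line / 1e9
sample_log_gb = full_log_gb * sample_fraction

print(f"full log:  ~{full_log_gb:.1f} GB")
print(f"5% sample: ~{sample_log_gb:.2f} GB")
```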

But of course, that's useful for answering questions about the overall distribution of the individual latencies, not about consistency over time.

So I don't think this would make any measurable difference in practice.

I think that it may show that 25% of the time the system could not
match the target tps, even if it can handle much more on average, so
the tps achieved when discarding late transactions would be under
4000 tps.

You mean the 'throttled-tps' chart? Yes, that one shows that without the patches, there are a lot of intervals where the tps was much lower - presumably due to a lot of slow transactions.


Tomas Vondra                  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
