Hello Tom,

Meh.  A progress-reporting feature has use when the tool is working
towards completion of a clearly defined task.  In the case of pgbench,
if you told it to run for -T 60 seconds rather than -T 10 seconds,
that's probably because you don't trust a 10-second average to be
sufficiently reproducible.

The motivations for the progress option are:

(1) to check (rather than blindly trust) performance stability, especially as warm-up time can be very long. See for instance my blog post:

        http://blog.coelho.net/database/2013/08/14/postgresql-warmup/

A scale-100 read-only pgbench run on a standard HDD requires 18 minutes to reach the performance steady state, and the performance is multiplied by 120 along the way, mostly in the last 2 minutes. In my experience, 10-second and 60-second runs are equally ridiculously short for real benchmarks. When I am running a bench for 30 minutes, I like to have some output before the end of the command to know what is going on.

(2) when reporting performance figures, benchmark rules usually require that the detailed performance during the whole run is also reported, not just the final average, so as to rule out warm-up or other unexpected and transient effects.

(3) another use case of the option is to run with --rate (to target some tps you expect on your system) and then to run other commands in parallel (say pg_dump, pg_basebackup...) to check the impact they have on performance.

I do agree that having a report every second on a 10-second run is not very useful, but that is not the use case.

So I'm not real sure that reporting averages over shorter intervals is all that useful; especially not if it takes cycles out of pgbench, which itself is often a bottleneck.

If you do not ask for it, it does not harm the performance significantly.

I could see some value in a feature that computed shorter-interval TPS
averages and then did some further arithmetic, like measuring the standard
deviation of the shorter-interval averages to assess how much noise there
will be in the full-run average.

I do not understand. "pgbench -P" does report the standard deviation as well as the client-side latency. Without this option pgbench is a black box.

But that's not what this does, and if it did do that, "reporting progress" would not be what I'd see as its main purpose.

This is for benchmarking. It is really reporting progress towards performance steady state, not reporting progress towards task completion.
Maybe a better name could have been thought of.
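
For what it is worth, the further arithmetic on shorter-interval averages mentioned above can also be done outside pgbench, from the captured progress lines. Here is a minimal Python sketch, not part of pgbench. It assumes progress lines shaped like "progress: 10.0 s, 1234.5 tps, ..." (the exact format may differ between versions), and the names interval_tps and summarize, as well as the sample figures, are made up for illustration.

```python
import re
import statistics

# Assumed line shape: "progress: 10.0 s, 1234.5 tps, ..."
# This is an assumption; adjust the pattern to your pgbench version.
PROGRESS_RE = re.compile(r"^progress:\s*([0-9.]+)\s*s,\s*([0-9.]+)\s*tps")


def interval_tps(lines):
    """Collect the per-interval tps values from progress report lines."""
    values = []
    for line in lines:
        match = PROGRESS_RE.match(line.strip())
        if match:
            values.append(float(match.group(2)))
    return values


def summarize(values):
    """Return (mean, sample standard deviation) of the interval tps values."""
    mean = statistics.mean(values)
    # The sample standard deviation needs at least two intervals.
    stdev = statistics.stdev(values) if len(values) > 1 else 0.0
    return mean, stdev


# Hypothetical sample lines, not measurements from a real run.
sample = [
    "progress: 10.0 s, 100.0 tps, lat 9.9 ms stddev 1.2",
    "progress: 20.0 s, 110.0 tps, lat 9.1 ms stddev 1.0",
    "progress: 30.0 s, 90.0 tps, lat 11.0 ms stddev 1.5",
    "some unrelated line that is ignored",
]
mean_tps, stdev_tps = summarize(interval_tps(sample))
print(f"mean {mean_tps:.1f} tps, stddev {stdev_tps:.1f} tps over intervals")
```

A large deviation relative to the mean, or a clear upward trend in the interval values, suggests the run has not yet reached steady state.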

--
Fabien.

