Right now when you run pgbench, the results vary considerably from run to
run even if you completely rebuild the database every time. I've found
that a lot of that variation comes from two things:
The main purpose of pgbench runs is an "apples to apples" comparison of two
source bases: one pristine PostgreSQL source base, and another that is the
same source patched with supposed enhancements.
As long as we use the same postgresql.conf, the same hardware environment and
exactly the same pgbench parameters, the difference in the TPS values
observed between the two sources should be a good enough indicator of the
viability of the new code, don't you think?
E.g. autovacuum will trigger on a given table only once the number of
obsolete rows in it exceeds the threshold, so that is tied to the update
rate. Likewise, "shared_buffers" will become a bottleneck only if the code
and the run are I/O intensive.
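
To make the autovacuum point concrete, here is a rough sketch of the trigger
rule as the PostgreSQL documentation describes it: a table is vacuumed once
its dead tuples exceed autovacuum_vacuum_threshold plus
autovacuum_vacuum_scale_factor times the table's row count. The function
names are made up for illustration, and the defaults of 50 and 0.2 are those
of recent releases (older releases shipped different values), so treat them
as assumptions and check your own settings.

```python
# Sketch only: function names are hypothetical, and the defaults below
# (50 and 0.2) are assumed from recent PostgreSQL releases.

def autovacuum_vacuum_limit(reltuples, base_threshold=50, scale_factor=0.2):
    """Dead-tuple count a table must exceed before autovacuum vacuums it."""
    return base_threshold + scale_factor * reltuples


def needs_vacuum(dead_tuples, reltuples, base_threshold=50, scale_factor=0.2):
    """True once the dead-tuple count is over the limit for this table."""
    limit = autovacuum_vacuum_limit(reltuples, base_threshold, scale_factor)
    return dead_tuples > limit


# Row counts pgbench creates at scale factor 1.
pgbench_rows = {"branches": 1, "tellers": 10, "accounts": 100_000}

for table, rows in pgbench_rows.items():
    print(table, autovacuum_vacuum_limit(rows))
```

Since every pgbench transaction updates one row in each of these tables, the
small tables cross their limit after a few dozen transactions while the
accounts table needs on the order of twenty thousand, which is how the
trigger ends up tied to the update rate.
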
IMHO, as long as the same environment holds for the runs against both source
bases, we should not see unexplained variation in the observed TPS values
from the causes you have mentioned.
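
One way to check whether a TPS difference between the two source bases
stands out from run-to-run variation is to repeat each run a few times and
compare the means against the spread. The sketch below does that; the name
compare_tps and the noise_factor cutoff of 2 standard errors are my own
assumptions, not anything pgbench provides, and the TPS values are
placeholders rather than measurements.

```python
from math import sqrt
from statistics import mean, stdev


def compare_tps(baseline, patched, noise_factor=2.0):
    """Compare TPS samples from repeated runs of two builds.

    noise_factor is an assumed cutoff: the difference in means is flagged
    only if it is larger than noise_factor standard errors.
    """
    if len(baseline) < 2 or len(patched) < 2:
        raise ValueError("need at least two runs per build to estimate noise")

    base_mean = mean(baseline)
    patch_mean = mean(patched)

    # Standard error of the difference between the two sample means.
    std_err = sqrt(stdev(baseline) ** 2 / len(baseline)
                   + stdev(patched) ** 2 / len(patched))

    diff = patch_mean - base_mean
    return {
        "baseline_mean": base_mean,
        "patched_mean": patch_mean,
        "relative_change": diff / base_mean,
        "exceeds_noise": abs(diff) > noise_factor * std_err,
    }


# Placeholder TPS values for illustration only, not real measurements.
result = compare_tps([100, 102, 98], [120, 121, 119])
print(result)
```

If exceeds_noise comes back False, the two source bases are not
distinguishable at that number of runs, even with an identical environment.
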