Hello Andres,

Which is a (somewhat disappointing) 3.3× speedup. The impact on the 3
complex expressions tests is not measurable, though.

I don't know why that could be disappointing. We put in much more work
for much smaller gains in other places.

Probably, but I expected a better payoff from eliminating most of the string handling from variables.

Questions:
 - how likely is such a patch to pass? (IMHO not likely)

I don't see why? I didn't review the patch in any detail, but it didn't
look crazy in a quick skim? Increasing how much load can be simulated
using pgbench is something I personally find much more interesting than
adding capabilities that very few people will ever use.

Yep, but my point is that the bottleneck is mostly libpq/system, as I tried to demonstrate with the few experiments I reported.

FWIW, the areas I find current pgbench "most lacking" during development
work are:

1) Data load speed. The data creation is bottlenecked on fprintf in a
  single process.

It is snprintf, actually; it could be replaced.

I submitted a patch to add more control over initialization, including a server-side loading feature ('G'), where the client sends no data and the server generates its own:

        https://commitfest.postgresql.org/24/2086/

However, on my laptop it is slower than client-side loading over a local socket. The client-side version does around 70 MB/s, with the client at 20-30% load and postgres at 85%; I'm not sure I can hope for much more on my SSD. On my laptop the bottleneck is postgres/disk, not fprintf.

The index builds are done serially. The vacuum could be replaced by COPY FREEZE.

Well, it could be added?

For a lot of meaningful tests one needs 10-1000s of GB of test data - creating that is pretty painful.

Yep.

2) Lack of proper initialization integration for custom
  scripts.

Hmmm…

You can always write a psql script for the schema and possibly simplistic data initialization?

However, generating meaningful pseudo-random data for an arbitrary schema is a pain. I wrote an external tool for that a few years ago:

        http://www.coelho.net/datafiller.html

but it is still a pain.

I.e. have steps in the custom script so that -i, vacuum, etc. can be part of the script, rather than separately executed steps. --init-steps doesn't do anything for that.

Sure. It just gives some control.

3) pgbench overhead, although that's to a significant degree libpq's fault

I'm afraid that is currently the case.

4) Ability to cancel pgbench and get approximate results. That currently
  works if the server kicks out the clients, but not when interrupting
  pgbench - which is just plain weird.  Obviously that doesn't matter
  for "proper" benchmark runs, but often during development, it's
  enough to run pgbench past some events (say the next checkpoint).

Do you mean have a report anyway on "Ctrl-C"?

I usually do a -P 1 to see the progress, but making Ctrl-C work should be reasonably easy.

 - what is its impact on overall performance when actual queries
   are performed (IMHO very small).

Obviously not huge - but I'd not expect it to be unobservably small
either.

Hmmm… Indeed, the 20-\set script runs at 2.6 M scripts/s, i.e. 0.019 µs per \set, and any exchange over the connection costs at least 15 µs (for one client on a local socket).

--
Fabien.
