Hello Tomas,

I do agree that fprintf is not cheap; when profiling pgbench it is often
the #1 item. But the impact on the measurements is quite small. For
example, with a small database (scale 10) and read-only 30-second runs
(single client), I get this:

  no logging: 18672 18792 18667 18518 18613 18547
with logging: 18170 18093 18162 18273 18307 18234

So on average, that's 18634 vs. 18206, i.e. less than 2.5% difference.
And with more expensive transactions (larger scale, writes, ...) the
difference will be much smaller.

I did some testing with a scale-10 database, prepared "SELECT only" transactions, 200-second runs (plenty of them), and local socket connections, on the largest host I have available:

  pgbench -P 10 -T 200 -S -M prepared -j $c -c $c ...

I think that this CPU-bound benchmark is close to a worst possible case for the detailed per-transaction log.

I also implemented a quick-and-dirty version of a merged log based on a shared file handle (append mode + sprintf + fputs).
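For reference, the shared-handle idea can be sketched roughly as below. This is only an illustration under my stated assumptions, not pgbench's actual code: the names (merge_log, open_merge_log, log_transaction) and the line format are made up, and I use snprintf rather than sprintf for safety. The point is that each thread formats a complete line into a private buffer and then emits it with a single fputs; POSIX stdio locks the stream internally, so whole lines from different threads are not interleaved.

```c
#include <stdio.h>
#include <stdlib.h>

/* Illustrative sketch only: a single log file shared by all threads,
 * opened once in append mode.  Function and variable names are
 * hypothetical, not pgbench's. */

static FILE *merge_log;

void open_merge_log(const char *path)
{
    merge_log = fopen(path, "a");   /* append mode, shared handle */
    if (merge_log == NULL)
    {
        perror("fopen");
        exit(1);
    }
}

void log_transaction(int client_id, long latency_us, long epoch_s, long usec)
{
    char line[128];

    /* format the whole line privately, then one fputs on the shared
     * stream, so concurrent threads cannot interleave partial lines */
    snprintf(line, sizeof(line), "%d %ld %ld %ld\n",
             client_id, latency_us, epoch_s, usec);
    fputs(line, merge_log);
}
```

The one-fputs-per-line discipline is what makes the shared handle workable without explicit locking in the caller.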

The results are as follows:

 * 1 thread 33 runs median tps (average is consistent):
 - no logging:        22062
 - separate logging:  19360  (-12.2%)
 - merged logging:    19326  (-12.4%, not significantly different from separate logging)

Note that the impact of logging is much larger than with your tests.
The underlying fprintf comes from gnu libc 2.19.

The worst overhead I could trigger is with 12 threads:

 * 12 threads 35 runs median tps (average is consistent)
 - no logging:       155917
 - separate logging: 124695 (-20.0%)
 - merged logging:   119349 (-23.5%)

My conclusion from these figures is that although the direct merged-logging approach adds some overhead, that overhead is small relative to the cost of detailed logging itself: it adds 3.5% on top of a 20% logging overhead with 12 threads. Other tests, even with more threads, did not yield larger absolute or relative overheads. So while the direct merge approach does add overhead, it is a small addition to an already bad situation, which suggests that using a detailed log on a CPU-bound pgbench run is a bad idea to begin with.

For a more realistic test, I ran "simple update" transactions, which involve actual disk writes. They ran at around 840 tps with 24 threads. The logging overhead appears to be under 1%, and there is no significant difference between separate and merged logging across the 20 runs.

So my overall conclusion is:

(1) The simple thread-shared file approach would spare pgbench some heavy merge-sort post-processing code, at a reasonable cost.

(2) The feature would not be available under thread emulation with this approach, but I do not see that as a particular issue, since I think thread emulation is pretty much dead code and a maintenance burden at this point.

(3) Optimizing doLog away from its current fprintf-based implementation may be worthwhile.
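One plausible direction, sketched below under my own assumptions (the names LogBuffer, log_line, log_flush and the buffer size are illustrative, not an actual doLog patch): accumulate formatted lines in a per-thread buffer and flush them with a single fwrite when the buffer fills, so the stream lock and write path are hit once per buffer rather than once per field.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical sketch of batched logging: lines are formatted with one
 * snprintf each and appended to a per-thread buffer, which is written
 * out in bulk with fwrite().  Sizes and names are illustrative. */

#define LOG_BUF_SIZE 8192

typedef struct
{
    char   buf[LOG_BUF_SIZE];
    size_t used;
} LogBuffer;

static void log_flush(LogBuffer *lb, FILE *out)
{
    if (lb->used > 0)
    {
        fwrite(lb->buf, 1, lb->used, out);
        lb->used = 0;
    }
}

static void log_line(LogBuffer *lb, FILE *out,
                     int client_id, long latency_us)
{
    char line[128];
    int  n = snprintf(line, sizeof(line), "%d %ld\n",
                      client_id, latency_us);

    if (lb->used + (size_t) n > LOG_BUF_SIZE)
        log_flush(lb, out);     /* make room before appending */
    memcpy(lb->buf + lb->used, line, (size_t) n);
    lb->used += (size_t) n;
}
```

Whether this actually beats glibc's internal stdio buffering would of course need measuring; it mainly saves the per-call locking and format parsing of repeated fprintf calls.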

--
Fabien.


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers