Re: [HACKERS] pgbench small bug fix

Fabien COELHO Sun, 07 Feb 2016 00:14:42 -0800


<Ooops, resent, wrong "From" again... sorry :-( >


Hello Robert,

> While testing for something else I encountered two small bugs under very low
> rate (--rate=0.1). The attached patches fixes these.

>> - when a duration (-T) is specified, ensure that pgbench ends at that

>    time (i.e. do not wait for a transaction beyond the end of the run).

Why does this use INSTR_TIME_GET_DOUBLE() and not INSTR_TIME_GET_MICROSEC()?


I choose that because duration is in seconds, but MICROSEC would work fine as
well. See attached version.

Also, why do we really need this change?  Won't the timer expiration
stop the thread at the right time anyway?


Not always. When several threads are used, the time expiration in non-zero
threads is currently triggered after the last schedule transaction, even if this
transaction is long after the expiration, which means it runs overtime:

 sh> time pgbench -T 5 -R 0.1 -P 1 -c 2 -j 2
 starting vacuum...end.
 progress: 1.0 s, 0.0 tps, lat -nan ms stddev -nan, lag -nan ms
 progress: 2.0 s, 1.0 tps, lat 12.129 ms stddev 0.000, lag 1.335 ms
 progress: 3.0 s, 0.0 tps, lat -nan ms stddev -nan, lag -nan ms
 progress: 4.0 s, 0.0 tps, lat -nan ms stddev -nan, lag -nan ms
 progress: 5.0 s, 0.0 tps, lat -nan ms stddev -nan, lag -nan ms
 < 6 seconds wait for a schedule transaction in thread 1
   no report because thread 0 expired...>
 transaction type: TPC-B (sort of)
 scaling factor: 1
 query mode: simple
 number of clients: 2
 number of threads: 2
 duration: 5 s
 number of transactions actually processed: 2
 latency average: 14.648 ms
 latency stddev: 2.518 ms
 rate limit schedule lag: avg 5.598 (max 9.861) ms
 tps = 0.180517 (including connections establishing)
 tps = 0.180566 (excluding connections establishing)

 real    0m11.158s
           ^^ 11 seconds, not 5.
 user    0m0.041s
 sys     0m0.013s

I mean, sure, in theory it's wasteful for the thread to sit around doing
nothing waiting for the timer to expire, but it's not evident to me that hurts
anything, really.


I compute stats on the output, including the progress report, to check for
responsiveness (eg it is not stuck somewhere because of a checkpoint which
triggered some IO storm or whatever), for that purpose it is better for the
trace to be there as expected.

>  - when there is a progress (-P) report, ensure that all progress
>    reports are shown even if no more transactions are schedule.

That's pretty ugly - it would be easy for the test at the top of the
loop to be left out of sync with the similar test inside the loop by
some future patch.


Hmmm. Cannot help it.

A cleaner fix would be to have the main thread do only the progress and periodic
stats aggregation, while other threads would do actual transactions, however
that would break pgbench working without threads, so I think this is a no go.

And again, I wonder why this is really a bug.


Well, if you are fine to set "-T 5 -P 1" and the run not to last 5 seconds nor
print any report every second, then it is not a bug:

 sh> time ./pgbench -T 5 -P 1 -R 0.1 -c 2 -j 2
 starting vacuum...end.
 < UNLUCKY, NO PROGRESS REPORT AT ALL...>
 transaction type: <builtin: TPC-B (sort of)>
 scaling factor: 1
 query mode: simple
 number of clients: 2
 number of threads: 2
 duration: 5 s
 number of transactions actually processed: 1
 latency average = 16.198 ms
 latency stddev = 0.000 ms
 rate limit schedule lag: avg 4.308 (max 4.308) ms
 tps = 0.236361 (including connections establishing)
 tps = 0.236771 (excluding connections establishing)

 real    0m4.256s

--
Fabien.

diff --git a/src/bin/pgbench/pgbench.c b/src/bin/pgbench/pgbench.c
index 7eb6a2d..08d358f 100644
--- a/src/bin/pgbench/pgbench.c
+++ b/src/bin/pgbench/pgbench.c
@@ -1285,6 +1285,12 @@ top:
 		thread->throttle_trigger += wait;
 		st->txn_scheduled = thread->throttle_trigger;
 
+		/* stop client if next transaction is beyond pgbench end of execution */
+		if (duration &&
+			st->txn_scheduled >
+			INSTR_TIME_GET_MICROSEC(thread->start_time) + (int64) 1000000 * duration)
+			return false;
+
 		/*
 		 * If this --latency-limit is used, and this slot is already late so
 		 * that the transaction will miss the latency limit even if it
@@ -3640,7 +3646,10 @@ threadRun(void *arg)
 		}
 	}
 
-	while (remains > 0)
+	while (remains > 0 ||
+		   /* thread zero keeps on printing progress report if any */
+		   (progress && thread->tid == 0 && duration &&
+			next_report <= thread_start + (int64) duration * 1000000))
 	{
 		fd_set		input_mask;
 		int			maxsock;	/* max socket number to be waited */

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] pgbench small bug fix

Reply via email to