Re: [HACKERS] pgbench throttling latency limit

Fabien COELHO Wed, 27 Aug 2014 02:43:05 -0700


Hello Heikki,

I find the definition of the latency limit a bit strange. It's a limit on howlate a transaction can *start* compared to it's scheduled starting time, nothow long a query is allowed to last.

Yes. This is what can be done easily with pgbench under throttling. Notethat if transactions take long it is recorded (average, stddev...) so itappears elsewhere.

How do figure out what it should be set to?

It is really just a simple tool to measure unresponsiveness underthrottling, which I'm testing and complaining about in another thread.

The underlying model I have in mind would some timeout from anapplication, say a web server, or a pooling process which is handling aqueue of requests...


Now, if I describe that as "lag limit" instead if "latency limit", maybe
it is clearer and better?

That model might make some sense if you think e.g. of a web application,
[...]

Yep, that is what I had in mind, but the primary objective is really tocheck whether pg is responsive or not.

So I think a more useful model is that new queries arrive at a givenrate, and each query is expected to finish in X milliseconds from itsarrival time (i.e the time the query is scheduled to begin, not the timeit was sent to the server) or it's counted as failed. If a transactioncannot even be started by that deadline, because the connection is stillbusy with the previous query, it's counted as failed without evensending it to the server. With that definition, it makes sense tospecify the latency limit even without --rate.

Yep. But that is not what I'm doing here. It would be interesting as well.It would be another patch.

In that case, it's simply a limit on how long each query'sexecution is allowed to last until it's considered as failed. IOW, eachquery's scheduled start time is when the previous query ends.

Not under --rate... that is the point of throttling! Under throttling,the latency should really be computed wrt to the schedule start time andnot the actual start time which may be 10 seconds afterwards when thingsare going bad... Also, there is the question of whether the "failed query"is executed or not. Here I'm not executing them, in effect they wereaborted by the application. With your suggestion they would be executedanyway but considered failed.

So what you are suggesting is another (interesting) functionnality, thatcould indeed be named "latency limit" (count slow above a thresholdqueries), what I'm doing here is "lag limit" (scheduled query could notstart on time and are skipped, this is really specific to --rate).

In the updated patch attached, I changed the explanations, documentationand name to "lag limit" instead of "latency limit" to clarify this point.It was really a misnommer.


--
Fabien.

diff --git a/contrib/pgbench/pgbench.c b/contrib/pgbench/pgbench.c
index 2f7d80e..d7ee1c9 100644
--- a/contrib/pgbench/pgbench.c
+++ b/contrib/pgbench/pgbench.c
@@ -141,6 +141,13 @@ double		sample_rate = 0.0;
 int64		throttle_delay = 0;
 
 /*
+ * When under throttling, execution time slots which are more than
+ * this late (in us) are skipped, and the corresponding transaction
+ * will be counted as somehow aborted.
+ */
+int64		throttle_lag_limit = 0;
+
+/*
  * tablespace selection
  */
 char	   *tablespace = NULL;
@@ -238,6 +245,7 @@ typedef struct
 	int64		throttle_trigger;		/* previous/next throttling (us) */
 	int64		throttle_lag;	/* total transaction lag behind throttling */
 	int64		throttle_lag_max;		/* max transaction lag */
+	int64		throttle_lag_skipped; /* lagging transactions skipped */
 } TState;
 
 #define INVALID_THREAD		((pthread_t) 0)
@@ -250,6 +258,7 @@ typedef struct
 	int64		sqlats;
 	int64		throttle_lag;
 	int64		throttle_lag_max;
+	int64		throttle_lag_skipped;
 } TResult;
 
 /*
@@ -367,6 +376,9 @@ usage(void)
 		 "  -f, --file=FILENAME      read transaction script from FILENAME\n"
 		   "  -j, --jobs=NUM           number of threads (default: 1)\n"
 		   "  -l, --log                write transaction times to log file\n"
+		   "  -L, --lag-limit=NUM      under throttling (--rate), skip transactions that\n"
+		   "                           are lagging more than NUM ms behind schedule\n"
+		   "                           (default: do not skip)\n"
 		   "  -M, --protocol=simple|extended|prepared\n"
 		   "                           protocol for submitting queries (default: simple)\n"
 		   "  -n, --no-vacuum          do not run VACUUM before tests\n"
@@ -1016,6 +1028,24 @@ top:
 
 		thread->throttle_trigger += wait;
 
+		if (throttle_lag_limit)
+		{
+			instr_time	now;
+			int64		now_us;
+			INSTR_TIME_SET_CURRENT(now);
+			now_us = INSTR_TIME_GET_MICROSEC(now);
+			while (thread->throttle_trigger < now_us - throttle_lag_limit)
+			{
+				/* if too far behind, this slot is skipped, and we
+				 * iterate till the next nearly on time slot.
+				 */
+				int64 wait = (int64) (throttle_delay *
+					1.00055271703 * -log(getrand(thread, 1, 10000) / 10000.0));
+				thread->throttle_trigger += wait;
+				thread->throttle_lag_skipped ++;
+			}
+		}
+
 		st->until = thread->throttle_trigger;
 		st->sleeping = 1;
 		st->throttling = true;
@@ -2351,7 +2381,8 @@ printResults(int ttype, int64 normal_xacts, int nclients,
 			 TState *threads, int nthreads,
 			 instr_time total_time, instr_time conn_total_time,
 			 int64 total_latencies, int64 total_sqlats,
-			 int64 throttle_lag, int64 throttle_lag_max)
+			 int64 throttle_lag, int64 throttle_lag_max,
+			 int64 throttle_lag_skipped)
 {
 	double		time_include,
 				tps_include,
@@ -2417,6 +2448,10 @@ printResults(int ttype, int64 normal_xacts, int nclients,
 		 */
 		printf("rate limit schedule lag: avg %.3f (max %.3f) ms\n",
 			   0.001 * throttle_lag / normal_xacts, 0.001 * throttle_lag_max);
+		if (throttle_lag_limit)
+			printf("number of skipped lag transactions: " INT64_FORMAT " (%.3f %%)\n",
+				   throttle_lag_skipped,
+				   100.0 * throttle_lag_skipped / (throttle_lag_skipped + normal_xacts));
 	}
 
 	printf("tps = %f (including connections establishing)\n", tps_include);
@@ -2505,6 +2540,7 @@ main(int argc, char **argv)
 		{"sampling-rate", required_argument, NULL, 4},
 		{"aggregate-interval", required_argument, NULL, 5},
 		{"rate", required_argument, NULL, 'R'},
+		{"lag-limit", required_argument, NULL, 'L'},
 		{NULL, 0, NULL, 0}
 	};
 
@@ -2534,6 +2570,7 @@ main(int argc, char **argv)
 	int64		total_sqlats = 0;
 	int64		throttle_lag = 0;
 	int64		throttle_lag_max = 0;
+	int64		throttle_lag_skipped = 0;
 
 	int			i;
 
@@ -2578,7 +2615,7 @@ main(int argc, char **argv)
 	state = (CState *) pg_malloc(sizeof(CState));
 	memset(state, 0, sizeof(CState));
 
-	while ((c = getopt_long(argc, argv, "ih:nvp:dqSNc:j:Crs:t:T:U:lf:D:F:M:P:R:", long_options, &optindex)) != -1)
+	while ((c = getopt_long(argc, argv, "ih:nvp:dqSNc:j:Crs:t:T:U:lf:D:F:M:P:R:L:", long_options, &optindex)) != -1)
 	{
 		switch (c)
 		{
@@ -2775,6 +2812,18 @@ main(int argc, char **argv)
 					throttle_delay = (int64) (1000000.0 / throttle_value);
 				}
 				break;
+			case 'L':
+				{
+					double limit_ms = atof(optarg);
+					if (limit_ms <= 0.0)
+					{
+						fprintf(stderr, "invalid lag limit: %s\n", optarg);
+						exit(1);
+					}
+					benchmarking_option_set = true;
+					throttle_lag_limit = (int64) (limit_ms * 1000);
+				}
+				break;
 			case 0:
 				/* This covers long options which take no argument. */
 				if (foreign_keys || unlogged_tables)
@@ -2895,6 +2944,12 @@ main(int argc, char **argv)
 		exit(1);
 	}
 
+	if (throttle_lag_limit != 0 && throttle_delay == 0)
+	{
+		fprintf(stderr, "lag limit (-L) can only be specified with throttling (-R)\n");
+		exit(1);
+	}
+
 	/*
 	 * is_latencies only works with multiple threads in thread-based
 	 * implementations, not fork-based ones, because it supposes that the
@@ -3070,6 +3125,7 @@ main(int argc, char **argv)
 		thread->random_state[0] = random();
 		thread->random_state[1] = random();
 		thread->random_state[2] = random();
+		thread->throttle_lag_skipped = 0;
 
 		if (is_latencies)
 		{
@@ -3144,6 +3200,7 @@ main(int argc, char **argv)
 			total_latencies += r->latencies;
 			total_sqlats += r->sqlats;
 			throttle_lag += r->throttle_lag;
+			throttle_lag_skipped += r->throttle_lag_skipped;
 			if (r->throttle_lag_max > throttle_lag_max)
 				throttle_lag_max = r->throttle_lag_max;
 			INSTR_TIME_ADD(conn_total_time, r->conn_time);
@@ -3166,7 +3223,7 @@ main(int argc, char **argv)
 	INSTR_TIME_SUBTRACT(total_time, start_time);
 	printResults(ttype, total_xacts, nclients, threads, nthreads,
 				 total_time, conn_total_time, total_latencies, total_sqlats,
-				 throttle_lag, throttle_lag_max);
+				 throttle_lag, throttle_lag_max, throttle_lag_skipped);
 
 	return 0;
 }
@@ -3191,7 +3248,8 @@ threadRun(void *arg)
 	int64		last_count = 0,
 				last_lats = 0,
 				last_sqlats = 0,
-				last_lags = 0;
+				last_lags = 0,
+				last_skipped = 0;
 
 	AggVals		aggs;
 
@@ -3394,7 +3452,8 @@ threadRun(void *arg)
 				/* generate and show report */
 				int64		count = 0,
 							lats = 0,
-							sqlats = 0;
+							sqlats = 0,
+							skipped = 0;
 				int64		lags = thread->throttle_lag;
 				int64		run = now - last_report;
 				double		tps,
@@ -3417,23 +3476,24 @@ threadRun(void *arg)
 				sqlat = 1.0 * (sqlats - last_sqlats) / (count - last_count);
 				stdev = 0.001 * sqrt(sqlat - 1000000.0 * latency * latency);
 				lag = 0.001 * (lags - last_lags) / (count - last_count);
+				skipped = thread->throttle_lag_skipped - last_skipped;
 
+				fprintf(stderr,
+						"progress %d: %.1f s, %.1f tps, "
+						"lat %.3f ms stddev %.3f",
+						thread->tid, total_run, tps, latency, stdev);
 				if (throttle_delay)
-					fprintf(stderr,
-							"progress %d: %.1f s, %.1f tps, "
-							"lat %.3f ms stddev %.3f, lag %.3f ms\n",
-							thread->tid, total_run, tps, latency, stdev, lag);
-				else
-					fprintf(stderr,
-							"progress %d: %.1f s, %.1f tps, "
-							"lat %.3f ms stddev %.3f\n",
-							thread->tid, total_run, tps, latency, stdev);
+					fprintf(stderr, ", lag %.3f ms", lag);
+				if (throttle_lag_limit)
+					fprintf(stderr, ", " INT64_FORMAT " skipped", skipped);
+				fprintf(stderr, "\n");
 
 				last_count = count;
 				last_lats = lats;
 				last_sqlats = sqlats;
 				last_lags = lags;
 				last_report = now;
+				last_skipped = thread->throttle_lag_skipped;
 				next_report += (int64) progress *1000000;
 			}
 		}
@@ -3452,7 +3512,8 @@ threadRun(void *arg)
 				int64		count = 0,
 							lats = 0,
 							sqlats = 0,
-							lags = 0;
+							lags = 0,
+							skipped = 0;
 				int64		run = now - last_report;
 				double		tps,
 							total_run,
@@ -3477,23 +3538,24 @@ threadRun(void *arg)
 				sqlat = 1.0 * (sqlats - last_sqlats) / (count - last_count);
 				stdev = 0.001 * sqrt(sqlat - 1000000.0 * latency * latency);
 				lag = 0.001 * (lags - last_lags) / (count - last_count);
+				skipped = thread->throttle_lag_skipped - last_skipped;
 
+				fprintf(stderr,
+						"progress: %.1f s, %.1f tps, "
+						"lat %.3f ms stddev %.3f",
+						total_run, tps, latency, stdev);
 				if (throttle_delay)
-					fprintf(stderr,
-							"progress: %.1f s, %.1f tps, "
-							"lat %.3f ms stddev %.3f, lag %.3f ms\n",
-							total_run, tps, latency, stdev, lag);
-				else
-					fprintf(stderr,
-							"progress: %.1f s, %.1f tps, "
-							"lat %.3f ms stddev %.3f\n",
-							total_run, tps, latency, stdev);
+					fprintf(stderr, ", lag %.3f ms", lag);
+				if (throttle_lag_limit)
+					fprintf(stderr, ", " INT64_FORMAT " skipped", skipped);
+				fprintf(stderr, "\n");
 
 				last_count = count;
 				last_lats = lats;
 				last_sqlats = sqlats;
 				last_lags = lags;
 				last_report = now;
+				last_skipped = thread->throttle_lag_skipped;
 				next_report += (int64) progress *1000000;
 			}
 		}
@@ -3514,6 +3576,7 @@ done:
 	}
 	result->throttle_lag = thread->throttle_lag;
 	result->throttle_lag_max = thread->throttle_lag_max;
+	result->throttle_lag_skipped = thread->throttle_lag_skipped;
 	INSTR_TIME_SET_CURRENT(end);
 	INSTR_TIME_ACCUM_DIFF(result->conn_time, end, start);
 	if (logfile)
diff --git a/doc/src/sgml/pgbench.sgml b/doc/src/sgml/pgbench.sgml
index 23bfa9e..3c637bc 100644
--- a/doc/src/sgml/pgbench.sgml
+++ b/doc/src/sgml/pgbench.sgml
@@ -345,6 +345,18 @@ pgbench <optional> <replaceable>options</> </optional> <replaceable>dbname</>
      </varlistentry>
 
      <varlistentry>
+      <term><option>-L</option> <replaceable>limit</></term>
+      <term><option>--lag-limit=</option><replaceable>limit</></term>
+      <listitem>
+       <para>
+        Under throttling option <option>--rate=...</>, transactions lagging
+        behind schedule by this number of ms will be skipped and reported as
+        such.
+       </para>
+       </listitem>
+     </varlistentry>
+
+     <varlistentry>
       <term><option>-M</option> <replaceable>querymode</></term>
       <term><option>--protocol=</option><replaceable>querymode</></term>
       <listitem>
@@ -437,7 +449,9 @@ pgbench <optional> <replaceable>options</> </optional> <replaceable>dbname</>
         schedule moves forward based on when the client first started, not
         when the previous transaction ended.  That approach means that when
         transactions go past their original scheduled end time, it is
-        possible for later ones to catch up again.
+        possible for later ones to catch up again. It can be changed with
+        option <option>--lag-limit</> which skips and counts transactions
+        late by more than this delay.
        </para>
        <para>
         When throttling is active, the average and maximum transaction

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] pgbench throttling latency limit

Reply via email to