Hello Heikki,

I find the definition of the latency limit a bit strange. It's a limit on how late a transaction can *start* compared to it's scheduled starting time, not how long a query is allowed to last.

Yes. This is what can be done easily with pgbench under throttling. Note that if transactions take long it is recorded (average, stddev...) so it appears elsewhere.

How do figure out what it should be set to?

It is really just a simple tool to measure unresponsiveness under throttling, which I'm testing and complaining about in another thread.

The underlying model I have in mind would some timeout from an application, say a web server, or a pooling process which is handling a queue of requests...

Now, if I describe that as "lag limit" instead if "latency limit", maybe
it is clearer and better?

That model might make some sense if you think e.g. of a web application,
[...]

Yep, that is what I had in mind, but the primary objective is really to check whether pg is responsive or not.

So I think a more useful model is that new queries arrive at a given rate, and each query is expected to finish in X milliseconds from its arrival time (i.e the time the query is scheduled to begin, not the time it was sent to the server) or it's counted as failed. If a transaction cannot even be started by that deadline, because the connection is still busy with the previous query, it's counted as failed without even sending it to the server. With that definition, it makes sense to specify the latency limit even without --rate.

Yep. But that is not what I'm doing here. It would be interesting as well. It would be another patch.

In that case, it's simply a limit on how long each query's execution is allowed to last until it's considered as failed. IOW, each query's scheduled start time is when the previous query ends.

Not under --rate... that is the point of throttling! Under throttling, the latency should really be computed wrt to the schedule start time and not the actual start time which may be 10 seconds afterwards when things are going bad... Also, there is the question of whether the "failed query" is executed or not. Here I'm not executing them, in effect they were aborted by the application. With your suggestion they would be executed anyway but considered failed.

So what you are suggesting is another (interesting) functionnality, that could indeed be named "latency limit" (count slow above a threshold queries), what I'm doing here is "lag limit" (scheduled query could not start on time and are skipped, this is really specific to --rate).

In the updated patch attached, I changed the explanations, documentation and name to "lag limit" instead of "latency limit" to clarify this point. It was really a misnommer.

--
Fabien.
diff --git a/contrib/pgbench/pgbench.c b/contrib/pgbench/pgbench.c
index 2f7d80e..d7ee1c9 100644
--- a/contrib/pgbench/pgbench.c
+++ b/contrib/pgbench/pgbench.c
@@ -141,6 +141,13 @@ double		sample_rate = 0.0;
 int64		throttle_delay = 0;
 
 /*
+ * When under throttling, execution time slots which are more than
+ * this late (in us) are skipped, and the corresponding transaction
+ * will be counted as somehow aborted.
+ */
+int64		throttle_lag_limit = 0;
+
+/*
  * tablespace selection
  */
 char	   *tablespace = NULL;
@@ -238,6 +245,7 @@ typedef struct
 	int64		throttle_trigger;		/* previous/next throttling (us) */
 	int64		throttle_lag;	/* total transaction lag behind throttling */
 	int64		throttle_lag_max;		/* max transaction lag */
+	int64		throttle_lag_skipped; /* lagging transactions skipped */
 } TState;
 
 #define INVALID_THREAD		((pthread_t) 0)
@@ -250,6 +258,7 @@ typedef struct
 	int64		sqlats;
 	int64		throttle_lag;
 	int64		throttle_lag_max;
+	int64		throttle_lag_skipped;
 } TResult;
 
 /*
@@ -367,6 +376,9 @@ usage(void)
 		 "  -f, --file=FILENAME      read transaction script from FILENAME\n"
 		   "  -j, --jobs=NUM           number of threads (default: 1)\n"
 		   "  -l, --log                write transaction times to log file\n"
+		   "  -L, --lag-limit=NUM      under throttling (--rate), skip transactions that\n"
+		   "                           are lagging more than NUM ms behind schedule\n"
+		   "                           (default: do not skip)\n"
 		   "  -M, --protocol=simple|extended|prepared\n"
 		   "                           protocol for submitting queries (default: simple)\n"
 		   "  -n, --no-vacuum          do not run VACUUM before tests\n"
@@ -1016,6 +1028,24 @@ top:
 
 		thread->throttle_trigger += wait;
 
+		if (throttle_lag_limit)
+		{
+			instr_time	now;
+			int64		now_us;
+			INSTR_TIME_SET_CURRENT(now);
+			now_us = INSTR_TIME_GET_MICROSEC(now);
+			while (thread->throttle_trigger < now_us - throttle_lag_limit)
+			{
+				/* if too far behind, this slot is skipped, and we
+				 * iterate till the next nearly on time slot.
+				 */
+				int64 wait = (int64) (throttle_delay *
+					1.00055271703 * -log(getrand(thread, 1, 10000) / 10000.0));
+				thread->throttle_trigger += wait;
+				thread->throttle_lag_skipped ++;
+			}
+		}
+
 		st->until = thread->throttle_trigger;
 		st->sleeping = 1;
 		st->throttling = true;
@@ -2351,7 +2381,8 @@ printResults(int ttype, int64 normal_xacts, int nclients,
 			 TState *threads, int nthreads,
 			 instr_time total_time, instr_time conn_total_time,
 			 int64 total_latencies, int64 total_sqlats,
-			 int64 throttle_lag, int64 throttle_lag_max)
+			 int64 throttle_lag, int64 throttle_lag_max,
+			 int64 throttle_lag_skipped)
 {
 	double		time_include,
 				tps_include,
@@ -2417,6 +2448,10 @@ printResults(int ttype, int64 normal_xacts, int nclients,
 		 */
 		printf("rate limit schedule lag: avg %.3f (max %.3f) ms\n",
 			   0.001 * throttle_lag / normal_xacts, 0.001 * throttle_lag_max);
+		if (throttle_lag_limit)
+			printf("number of skipped lag transactions: " INT64_FORMAT " (%.3f %%)\n",
+				   throttle_lag_skipped,
+				   100.0 * throttle_lag_skipped / (throttle_lag_skipped + normal_xacts));
 	}
 
 	printf("tps = %f (including connections establishing)\n", tps_include);
@@ -2505,6 +2540,7 @@ main(int argc, char **argv)
 		{"sampling-rate", required_argument, NULL, 4},
 		{"aggregate-interval", required_argument, NULL, 5},
 		{"rate", required_argument, NULL, 'R'},
+		{"lag-limit", required_argument, NULL, 'L'},
 		{NULL, 0, NULL, 0}
 	};
 
@@ -2534,6 +2570,7 @@ main(int argc, char **argv)
 	int64		total_sqlats = 0;
 	int64		throttle_lag = 0;
 	int64		throttle_lag_max = 0;
+	int64		throttle_lag_skipped = 0;
 
 	int			i;
 
@@ -2578,7 +2615,7 @@ main(int argc, char **argv)
 	state = (CState *) pg_malloc(sizeof(CState));
 	memset(state, 0, sizeof(CState));
 
-	while ((c = getopt_long(argc, argv, "ih:nvp:dqSNc:j:Crs:t:T:U:lf:D:F:M:P:R:", long_options, &optindex)) != -1)
+	while ((c = getopt_long(argc, argv, "ih:nvp:dqSNc:j:Crs:t:T:U:lf:D:F:M:P:R:L:", long_options, &optindex)) != -1)
 	{
 		switch (c)
 		{
@@ -2775,6 +2812,18 @@ main(int argc, char **argv)
 					throttle_delay = (int64) (1000000.0 / throttle_value);
 				}
 				break;
+			case 'L':
+				{
+					double limit_ms = atof(optarg);
+					if (limit_ms <= 0.0)
+					{
+						fprintf(stderr, "invalid lag limit: %s\n", optarg);
+						exit(1);
+					}
+					benchmarking_option_set = true;
+					throttle_lag_limit = (int64) (limit_ms * 1000);
+				}
+				break;
 			case 0:
 				/* This covers long options which take no argument. */
 				if (foreign_keys || unlogged_tables)
@@ -2895,6 +2944,12 @@ main(int argc, char **argv)
 		exit(1);
 	}
 
+	if (throttle_lag_limit != 0 && throttle_delay == 0)
+	{
+		fprintf(stderr, "lag limit (-L) can only be specified with throttling (-R)\n");
+		exit(1);
+	}
+
 	/*
 	 * is_latencies only works with multiple threads in thread-based
 	 * implementations, not fork-based ones, because it supposes that the
@@ -3070,6 +3125,7 @@ main(int argc, char **argv)
 		thread->random_state[0] = random();
 		thread->random_state[1] = random();
 		thread->random_state[2] = random();
+		thread->throttle_lag_skipped = 0;
 
 		if (is_latencies)
 		{
@@ -3144,6 +3200,7 @@ main(int argc, char **argv)
 			total_latencies += r->latencies;
 			total_sqlats += r->sqlats;
 			throttle_lag += r->throttle_lag;
+			throttle_lag_skipped += r->throttle_lag_skipped;
 			if (r->throttle_lag_max > throttle_lag_max)
 				throttle_lag_max = r->throttle_lag_max;
 			INSTR_TIME_ADD(conn_total_time, r->conn_time);
@@ -3166,7 +3223,7 @@ main(int argc, char **argv)
 	INSTR_TIME_SUBTRACT(total_time, start_time);
 	printResults(ttype, total_xacts, nclients, threads, nthreads,
 				 total_time, conn_total_time, total_latencies, total_sqlats,
-				 throttle_lag, throttle_lag_max);
+				 throttle_lag, throttle_lag_max, throttle_lag_skipped);
 
 	return 0;
 }
@@ -3191,7 +3248,8 @@ threadRun(void *arg)
 	int64		last_count = 0,
 				last_lats = 0,
 				last_sqlats = 0,
-				last_lags = 0;
+				last_lags = 0,
+				last_skipped = 0;
 
 	AggVals		aggs;
 
@@ -3394,7 +3452,8 @@ threadRun(void *arg)
 				/* generate and show report */
 				int64		count = 0,
 							lats = 0,
-							sqlats = 0;
+							sqlats = 0,
+							skipped = 0;
 				int64		lags = thread->throttle_lag;
 				int64		run = now - last_report;
 				double		tps,
@@ -3417,23 +3476,24 @@ threadRun(void *arg)
 				sqlat = 1.0 * (sqlats - last_sqlats) / (count - last_count);
 				stdev = 0.001 * sqrt(sqlat - 1000000.0 * latency * latency);
 				lag = 0.001 * (lags - last_lags) / (count - last_count);
+				skipped = thread->throttle_lag_skipped - last_skipped;
 
+				fprintf(stderr,
+						"progress %d: %.1f s, %.1f tps, "
+						"lat %.3f ms stddev %.3f",
+						thread->tid, total_run, tps, latency, stdev);
 				if (throttle_delay)
-					fprintf(stderr,
-							"progress %d: %.1f s, %.1f tps, "
-							"lat %.3f ms stddev %.3f, lag %.3f ms\n",
-							thread->tid, total_run, tps, latency, stdev, lag);
-				else
-					fprintf(stderr,
-							"progress %d: %.1f s, %.1f tps, "
-							"lat %.3f ms stddev %.3f\n",
-							thread->tid, total_run, tps, latency, stdev);
+					fprintf(stderr, ", lag %.3f ms", lag);
+				if (throttle_lag_limit)
+					fprintf(stderr, ", " INT64_FORMAT " skipped", skipped);
+				fprintf(stderr, "\n");
 
 				last_count = count;
 				last_lats = lats;
 				last_sqlats = sqlats;
 				last_lags = lags;
 				last_report = now;
+				last_skipped = thread->throttle_lag_skipped;
 				next_report += (int64) progress *1000000;
 			}
 		}
@@ -3452,7 +3512,8 @@ threadRun(void *arg)
 				int64		count = 0,
 							lats = 0,
 							sqlats = 0,
-							lags = 0;
+							lags = 0,
+							skipped = 0;
 				int64		run = now - last_report;
 				double		tps,
 							total_run,
@@ -3477,23 +3538,24 @@ threadRun(void *arg)
 				sqlat = 1.0 * (sqlats - last_sqlats) / (count - last_count);
 				stdev = 0.001 * sqrt(sqlat - 1000000.0 * latency * latency);
 				lag = 0.001 * (lags - last_lags) / (count - last_count);
+				skipped = thread->throttle_lag_skipped - last_skipped;
 
+				fprintf(stderr,
+						"progress: %.1f s, %.1f tps, "
+						"lat %.3f ms stddev %.3f",
+						total_run, tps, latency, stdev);
 				if (throttle_delay)
-					fprintf(stderr,
-							"progress: %.1f s, %.1f tps, "
-							"lat %.3f ms stddev %.3f, lag %.3f ms\n",
-							total_run, tps, latency, stdev, lag);
-				else
-					fprintf(stderr,
-							"progress: %.1f s, %.1f tps, "
-							"lat %.3f ms stddev %.3f\n",
-							total_run, tps, latency, stdev);
+					fprintf(stderr, ", lag %.3f ms", lag);
+				if (throttle_lag_limit)
+					fprintf(stderr, ", " INT64_FORMAT " skipped", skipped);
+				fprintf(stderr, "\n");
 
 				last_count = count;
 				last_lats = lats;
 				last_sqlats = sqlats;
 				last_lags = lags;
 				last_report = now;
+				last_skipped = thread->throttle_lag_skipped;
 				next_report += (int64) progress *1000000;
 			}
 		}
@@ -3514,6 +3576,7 @@ done:
 	}
 	result->throttle_lag = thread->throttle_lag;
 	result->throttle_lag_max = thread->throttle_lag_max;
+	result->throttle_lag_skipped = thread->throttle_lag_skipped;
 	INSTR_TIME_SET_CURRENT(end);
 	INSTR_TIME_ACCUM_DIFF(result->conn_time, end, start);
 	if (logfile)
diff --git a/doc/src/sgml/pgbench.sgml b/doc/src/sgml/pgbench.sgml
index 23bfa9e..3c637bc 100644
--- a/doc/src/sgml/pgbench.sgml
+++ b/doc/src/sgml/pgbench.sgml
@@ -345,6 +345,18 @@ pgbench <optional> <replaceable>options</> </optional> <replaceable>dbname</>
      </varlistentry>
 
      <varlistentry>
+      <term><option>-L</option> <replaceable>limit</></term>
+      <term><option>--lag-limit=</option><replaceable>limit</></term>
+      <listitem>
+       <para>
+        Under throttling option <option>--rate=...</>, transactions lagging
+        behind schedule by this number of ms will be skipped and reported as
+        such.
+       </para>
+       </listitem>
+     </varlistentry>
+
+     <varlistentry>
       <term><option>-M</option> <replaceable>querymode</></term>
       <term><option>--protocol=</option><replaceable>querymode</></term>
       <listitem>
@@ -437,7 +449,9 @@ pgbench <optional> <replaceable>options</> </optional> <replaceable>dbname</>
         schedule moves forward based on when the client first started, not
         when the previous transaction ended.  That approach means that when
         transactions go past their original scheduled end time, it is
-        possible for later ones to catch up again.
+        possible for later ones to catch up again. It can be changed with
+        option <option>--lag-limit</> which skips and counts transactions
+        late by more than this delay.
        </para>
        <para>
         When throttling is active, the average and maximum transaction
-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Reply via email to