Hello Mitsumasa-san,

And I'm also interested in your "decile percents" output like under
followings,
decile percents: 39.6% 24.0% 14.6% 8.8% 5.4% 3.3% 2.0% 1.2% 0.7% 0.4%

Sure, I'm really fine with that.

I think that it is easier than before. Sum of decile percents is just 100%.

That's a good property:-)

However, I don't prefer "highest/lowest percentage" because it will be confused with decile percentage for users, and anyone cannot understand this digits. I cannot understand "4.9%, 0.0%" when I see the first time. Then, I checked the source code, I understood it:( It's not good design... #Why this parameter use 100?

What else? People have ten fingers and like powers of 10, and are used to percents?

So I'd like to remove it if you like. It will be more simple.

I think that for the exponential distribution it helps, especially for high threshold, to have the lowest/highest percent density. For low thresholds, the decile is also definitely useful. So I'm fine with both outputs as you have put them.

I have just updated the wording so that it may be clearer:

 decile percents: 69.9% 21.0% 6.3% 1.9% 0.6% 0.2% 0.1% 0.0% 0.0% 0.0%
 probability of fist/last percent of the range: 11.3% 0.0%

Attached patch is fixed version, please confirm it.

Attached a v15 which just fixes a typo and the above wording update. I'm validating it for committers.

#Of course, World Cup is being held now. I'm not hurry at all.

I'm not a soccer kind of person, so it does not influence my availibility.:-)


Suggested commit message:

Add drawing random integers with a Gaussian or truncated exponentional distributions to pgbench.

Test variants with these distributions are also provided and triggered
with options "--gaussian=..." and "--exponential=...".


Have a nice day/night,

--
Fabien.
diff --git a/contrib/pgbench/pgbench.c b/contrib/pgbench/pgbench.c
index 4aa8a50..3541b7e 100644
--- a/contrib/pgbench/pgbench.c
+++ b/contrib/pgbench/pgbench.c
@@ -41,6 +41,7 @@
 #include <math.h>
 #include <signal.h>
 #include <sys/time.h>
+#include <assert.h>
 #ifdef HAVE_SYS_SELECT_H
 #include <sys/select.h>
 #endif
@@ -98,6 +99,8 @@ static int	pthread_join(pthread_t th, void **thread_return);
 #define LOG_STEP_SECONDS	5	/* seconds between log messages */
 #define DEFAULT_NXACTS	10		/* default nxacts */
 
+#define MIN_GAUSSIAN_THRESHOLD		2.0	/* minimum threshold for gauss */
+
 int			nxacts = 0;			/* number of transactions per client */
 int			duration = 0;		/* duration in seconds */
 
@@ -171,6 +174,14 @@ bool		is_connect;			/* establish connection for each transaction */
 bool		is_latencies;		/* report per-command latencies */
 int			main_pid;			/* main process id used in log filename */
 
+/* gaussian distribution tests: */
+double		stdev_threshold;   /* standard deviation threshold */
+bool        use_gaussian = false;
+
+/* exponential distribution tests: */
+double		exp_threshold;   /* threshold for exponential */
+bool		use_exponential = false;
+
 char	   *pghost = "";
 char	   *pgport = "";
 char	   *login = NULL;
@@ -332,6 +343,88 @@ static char *select_only = {
 	"SELECT abalance FROM pgbench_accounts WHERE aid = :aid;\n"
 };
 
+/* --exponential case */
+static char *exponential_tpc_b = {
+	"\\set nbranches " CppAsString2(nbranches) " * :scale\n"
+	"\\set ntellers " CppAsString2(ntellers) " * :scale\n"
+	"\\set naccounts " CppAsString2(naccounts) " * :scale\n"
+	"\\setrandom aid 1 :naccounts exponential :exp_threshold\n"
+	"\\setrandom bid 1 :nbranches\n"
+	"\\setrandom tid 1 :ntellers\n"
+	"\\setrandom delta -5000 5000\n"
+	"BEGIN;\n"
+	"UPDATE pgbench_accounts SET abalance = abalance + :delta WHERE aid = :aid;\n"
+	"SELECT abalance FROM pgbench_accounts WHERE aid = :aid;\n"
+	"UPDATE pgbench_tellers SET tbalance = tbalance + :delta WHERE tid = :tid;\n"
+	"UPDATE pgbench_branches SET bbalance = bbalance + :delta WHERE bid = :bid;\n"
+	"INSERT INTO pgbench_history (tid, bid, aid, delta, mtime) VALUES (:tid, :bid, :aid, :delta, CURRENT_TIMESTAMP);\n"
+	"END;\n"
+};
+
+/* --exponential with -N case */
+static char *exponential_simple_update = {
+	"\\set nbranches " CppAsString2(nbranches) " * :scale\n"
+	"\\set ntellers " CppAsString2(ntellers) " * :scale\n"
+	"\\set naccounts " CppAsString2(naccounts) " * :scale\n"
+	"\\setrandom aid 1 :naccounts exponential :exp_threshold\n"
+	"\\setrandom bid 1 :nbranches\n"
+	"\\setrandom tid 1 :ntellers\n"
+	"\\setrandom delta -5000 5000\n"
+	"BEGIN;\n"
+	"UPDATE pgbench_accounts SET abalance = abalance + :delta WHERE aid = :aid;\n"
+	"SELECT abalance FROM pgbench_accounts WHERE aid = :aid;\n"
+	"INSERT INTO pgbench_history (tid, bid, aid, delta, mtime) VALUES (:tid, :bid, :aid, :delta, CURRENT_TIMESTAMP);\n"
+	"END;\n"
+};
+
+/* --exponential with -S case */
+static char *exponential_select_only = {
+	"\\set naccounts " CppAsString2(naccounts) " * :scale\n"
+	"\\setrandom aid 1 :naccounts exponential :exp_threshold\n"
+	"SELECT abalance FROM pgbench_accounts WHERE aid = :aid;\n"
+};
+
+/* --gaussian case */
+static char *gaussian_tpc_b = {
+	"\\set nbranches " CppAsString2(nbranches) " * :scale\n"
+	"\\set ntellers " CppAsString2(ntellers) " * :scale\n"
+	"\\set naccounts " CppAsString2(naccounts) " * :scale\n"
+	"\\setrandom aid 1 :naccounts gaussian :stdev_threshold\n"
+	"\\setrandom bid 1 :nbranches\n"
+	"\\setrandom tid 1 :ntellers\n"
+	"\\setrandom delta -5000 5000\n"
+	"BEGIN;\n"
+	"UPDATE pgbench_accounts SET abalance = abalance + :delta WHERE aid = :aid;\n"
+	"SELECT abalance FROM pgbench_accounts WHERE aid = :aid;\n"
+	"UPDATE pgbench_tellers SET tbalance = tbalance + :delta WHERE tid = :tid;\n"
+	"UPDATE pgbench_branches SET bbalance = bbalance + :delta WHERE bid = :bid;\n"
+	"INSERT INTO pgbench_history (tid, bid, aid, delta, mtime) VALUES (:tid, :bid, :aid, :delta, CURRENT_TIMESTAMP);\n"
+	"END;\n"
+};
+
+/* --gaussian with -N case */
+static char *gaussian_simple_update = {
+	"\\set nbranches " CppAsString2(nbranches) " * :scale\n"
+	"\\set ntellers " CppAsString2(ntellers) " * :scale\n"
+	"\\set naccounts " CppAsString2(naccounts) " * :scale\n"
+	"\\setrandom aid 1 :naccounts gaussian :stdev_threshold\n"
+	"\\setrandom bid 1 :nbranches\n"
+	"\\setrandom tid 1 :ntellers\n"
+	"\\setrandom delta -5000 5000\n"
+	"BEGIN;\n"
+	"UPDATE pgbench_accounts SET abalance = abalance + :delta WHERE aid = :aid;\n"
+	"SELECT abalance FROM pgbench_accounts WHERE aid = :aid;\n"
+	"INSERT INTO pgbench_history (tid, bid, aid, delta, mtime) VALUES (:tid, :bid, :aid, :delta, CURRENT_TIMESTAMP);\n"
+	"END;\n"
+};
+
+/* --gaussian with -S case */
+static char *gaussian_select_only = {
+	"\\set naccounts " CppAsString2(naccounts) " * :scale\n"
+	"\\setrandom aid 1 :naccounts gaussian :stdev_threshold\n"
+	"SELECT abalance FROM pgbench_accounts WHERE aid = :aid;\n"
+};
+
 /* Function prototypes */
 static void setalarm(int seconds);
 static void *threadRun(void *arg);
@@ -375,6 +468,8 @@ usage(void)
 		   "  -v, --vacuum-all         vacuum all four standard tables before tests\n"
 		   "  --aggregate-interval=NUM aggregate data over NUM seconds\n"
 		   "  --sampling-rate=NUM      fraction of transactions to log (e.g. 0.01 for 1%%)\n"
+		   "  --exponential=NUM        exponential distribution with NUM threshold parameter\n"
+		   "  --gaussian=NUM           gaussian distribution with NUM threshold parameter\n"
 		   "\nCommon options:\n"
 		   "  -d, --debug              print debugging output\n"
 	  "  -h, --host=HOSTNAME      database server host or socket directory\n"
@@ -471,6 +566,76 @@ getrand(TState *thread, int64 min, int64 max)
 	return min + (int64) ((max - min + 1) * pg_erand48(thread->random_state));
 }
 
+/*
+ * random number generator: exponential distribution from min to max inclusive.
+ * the threshold is so that the density of probability for the last cut-off max
+ * value is exp(-exp_threshold).
+ */
+static int64
+getExponentialrand(TState *thread, int64 min, int64 max, double exp_threshold)
+{
+	double cut, uniform, rand;
+	assert(exp_threshold > 0.0);
+	cut = exp(-exp_threshold);
+	/* erand in [0, 1), uniform in (0, 1] */
+	uniform = 1.0 - pg_erand48(thread->random_state);
+	/*
+	 * inner expresion in (cut, 1] (if exp_threshold > 0),
+	 * rand in [0, 1)
+	 */
+	assert((1.0 - cut) != 0.0);
+	rand = - log(cut + (1.0 - cut) * uniform) / exp_threshold;
+	/* return int64 random number within between min and max */
+	return min + (int64)((max - min + 1) * rand);
+}
+
+/* random number generator: gaussian distribution from min to max inclusive */
+static int64
+getGaussianrand(TState *thread, int64 min, int64 max, double stdev_threshold)
+{
+	double		stdev;
+	double		rand;
+
+	/*
+	 * Get user specified random number from this loop, with
+	 * -stdev_threshold < stdev <= stdev_threshold
+	 *
+	 * This loop is executed until the number is in the expected range.
+	 *
+	 * As the minimum threshold is 2.0, the probability of looping is low:
+	 * sqrt(-2 ln(r)) <= 2 => r >= e^{-2} ~ 0.135, then when taking the average
+	 * sinus multiplier as 2/pi, we have a 8.6% looping probability in the
+	 * worst case. For a 5.0 threshold value, the looping probability
+	 * is about e^{-5} * 2 / pi ~ 0.43%.
+	 */
+	do
+	{
+		/*
+		 * pg_erand48 generates [0,1), but for the basic version of the
+		 * Box-Muller transform the two uniformly distributed random numbers
+		 * are expected in (0, 1] (see http://en.wikipedia.org/wiki/Box_muller)
+		 */
+		double rand1 = 1.0 - pg_erand48(thread->random_state);
+		double rand2 = 1.0 - pg_erand48(thread->random_state);
+
+		/* Box-Muller basic form transform */
+		double var_sqrt = sqrt(-2.0 * log(rand1));
+		stdev = var_sqrt * sin(2.0 * M_PI * rand2);
+
+		/*
+ 		 * we may try with cos, but there may be a bias induced if the previous
+		 * value fails the test? To be on the safe side, let us try over.
+		 */
+	}
+	while (stdev < -stdev_threshold || stdev >= stdev_threshold);
+
+	/* stdev is in [-threshold, threshold), normalization to [0,1) */
+	rand = (stdev + stdev_threshold) / (stdev_threshold * 2.0);
+
+	/* return int64 random number within between min and max */
+	return min + (int64)((max - min + 1) * rand);
+}
+
 /* call PQexec() and exit() on failure */
 static void
 executeStatement(PGconn *con, const char *sql)
@@ -1319,6 +1484,7 @@ top:
 			char	   *var;
 			int64		min,
 						max;
+			double		threshold = 0;
 			char		res[64];
 
 			if (*argv[2] == ':')
@@ -1364,11 +1530,11 @@ top:
 			}
 
 			/*
-			 * getrand() needs to be able to subtract max from min and add one
-			 * to the result without overflowing.  Since we know max > min, we
-			 * can detect overflow just by checking for a negative result. But
-			 * we must check both that the subtraction doesn't overflow, and
-			 * that adding one to the result doesn't overflow either.
+			 * Generate random number functions need to be able to subtract
+			 * max from min and add one to the result without overflowing.
+			 * Since we know max > min, we can detect overflow just by checking
+			 * for a negative result. But we must check both that the subtraction
+			 * doesn't overflow, and that adding one to the result doesn't overflow either.
 			 */
 			if (max - min < 0 || (max - min) + 1 < 0)
 			{
@@ -1377,10 +1543,63 @@ top:
 				return true;
 			}
 
+			if (argc == 4) /* uniform */
+			{
+#ifdef DEBUG
+				printf("min: " INT64_FORMAT " max: " INT64_FORMAT " random: " INT64_FORMAT "\n", min, max, getrand(thread, min, max));
+#endif
+				snprintf(res, sizeof(res), INT64_FORMAT, getrand(thread, min, max));
+			}
+			else if ((pg_strcasecmp(argv[4], "gaussian") == 0) ||
+				 (pg_strcasecmp(argv[4], "exponential") == 0))
+			{
+				if (*argv[5] == ':')
+				{
+					if ((var = getVariable(st, argv[5] + 1)) == NULL)
+					{
+						fprintf(stderr, "%s: invalid threshold number %s\n", argv[0], argv[5]);
+						st->ecnt++;
+						return true;
+					}
+					threshold = strtod(var, NULL);
+				}
+				else
+					threshold = strtod(argv[5], NULL);
+
+				if (pg_strcasecmp(argv[4], "gaussian") == 0)
+				{
+					if (threshold < MIN_GAUSSIAN_THRESHOLD)
+					{
+						fprintf(stderr, "%s: gaussian threshold must be more than %f\n,", argv[5], MIN_GAUSSIAN_THRESHOLD);
+						st->ecnt++;
+						return true;
+					}
+#ifdef DEBUG
+					printf("min: " INT64_FORMAT " max: " INT64_FORMAT " random: " INT64_FORMAT "\n", min, max, getGaussianrand(thread, min, max, threshold));
+#endif
+					snprintf(res, sizeof(res), INT64_FORMAT, getGaussianrand(thread, min, max, threshold));
+				}
+				else if (pg_strcasecmp(argv[4], "exponential") == 0)
+				{
+					if (threshold <= 0.0)
+					{
+						fprintf(stderr, "%s: exponential threshold must be strictly positive\n,", argv[5]);
+						st->ecnt++;
+						return true;
+					}
 #ifdef DEBUG
-			printf("min: " INT64_FORMAT " max: " INT64_FORMAT " random: " INT64_FORMAT "\n", min, max, getrand(thread, min, max));
+					printf("min: " INT64_FORMAT " max: " INT64_FORMAT " random: " INT64_FORMAT "\n", min, max, getExponentialrand(thread, min, max, threshold));
 #endif
-			snprintf(res, sizeof(res), INT64_FORMAT, getrand(thread, min, max));
+					snprintf(res, sizeof(res), INT64_FORMAT, getExponentialrand(thread, min, max, threshold));
+				}
+			}
+			else /* uniform with extra arguments */
+			{
+#ifdef DEBUG
+				printf("min: " INT64_FORMAT " max: " INT64_FORMAT " random: " INT64_FORMAT "\n", min, max, getrand(thread, min, max));
+#endif
+				snprintf(res, sizeof(res), INT64_FORMAT, getrand(thread, min, max));
+			}
 
 			if (!putVariable(st, argv[0], argv[1], res))
 			{
@@ -1920,9 +2139,34 @@ process_commands(char *buf)
 				exit(1);
 			}
 
-			for (j = 4; j < my_commands->argc; j++)
-				fprintf(stderr, "%s: extra argument \"%s\" ignored\n",
-						my_commands->argv[0], my_commands->argv[j]);
+			if (my_commands->argc == 4 ) /* uniform */
+			{
+				/* nothing to do */
+			}
+			else if ((pg_strcasecmp(my_commands->argv[4], "gaussian") == 0) ||
+				 (pg_strcasecmp(my_commands->argv[4], "exponential") == 0))
+			{
+				if (my_commands->argc < 6)
+				{
+					fprintf(stderr, "%s(%s): missing argument\n", my_commands->argv[0], my_commands->argv[4]);
+					exit(1);
+				}
+
+				for (j = 6; j < my_commands->argc; j++)
+					fprintf(stderr, "%s(%s): extra argument \"%s\" ignored\n",
+							my_commands->argv[0], my_commands->argv[4], my_commands->argv[j]);
+			}
+			else /* uniform with extra argument */
+			{
+				int arg_pos = 4;
+
+				if (pg_strcasecmp(my_commands->argv[4], "uniform") == 0)
+					arg_pos++;
+
+				for (j = arg_pos; j < my_commands->argc; j++)
+					fprintf(stderr, "%s(uniform): extra argument \"%s\" ignored\n",
+							my_commands->argv[0], my_commands->argv[j]);
+			}
 		}
 		else if (pg_strcasecmp(my_commands->argv[0], "set") == 0)
 		{
@@ -2178,6 +2422,18 @@ process_builtin(char *tb)
 	return my_commands;
 }
 
+/*
+ * compute the probability of the truncated exponential random generation
+ * to draw values in the i-th slot of the range.
+ */
+static double exponentialProbability(int i, int slots, double threshold)
+{
+	assert(1 <= i && i <= slots);
+	return (exp(- threshold * (i - 1) / slots) - exp(- threshold * i / slots)) /
+		(1.0 - exp(- threshold));
+}
+
+
 /* print out results */
 static void
 printResults(int ttype, int64 normal_xacts, int nclients,
@@ -2197,16 +2453,69 @@ printResults(int ttype, int64 normal_xacts, int nclients,
 						(INSTR_TIME_GET_DOUBLE(conn_total_time) / nthreads));
 
 	if (ttype == 0)
-		s = "TPC-B (sort of)";
+	{
+		if (use_gaussian)
+			s = "Gaussian distribution TPC-B (sort of)";
+		else if (use_exponential)
+			s = "Exponential distribution TPC-B (sort of)";
+		else
+			s = "TPC-B (sort of)";
+	}
 	else if (ttype == 2)
-		s = "Update only pgbench_accounts";
+	{
+		if (use_gaussian)
+			s = "Gaussian distribution update only pgbench_accounts";
+		else if (use_exponential)
+			s = "Exponential distribution update only pgbench_accounts";
+		else
+			s = "Update only pgbench_accounts";
+	}
 	else if (ttype == 1)
-		s = "SELECT only";
+	{
+		if (use_gaussian)
+			s = "Gaussian distribution SELECT only";
+		else if (use_exponential)
+			s = "Exponential distribution SELECT only";
+		else
+			s = "SELECT only";
+	}
 	else
 		s = "Custom query";
 
 	printf("transaction type: %s\n", s);
 	printf("scaling factor: %d\n", scale);
+
+	/* output in gaussian distribution benchmark */
+	if (use_gaussian)
+	{
+		int i;
+		printf("standard deviation threshold: %.5f\n", stdev_threshold);
+		printf("decile percents:");
+		for (i = 2; i <= 20; i = i + 2)
+			printf(" %.1f%%", (double) 50 * (erf (stdev_threshold * (1 - 0.1 * (i - 2)) / sqrt(2.0)) -
+				erf (stdev_threshold * (1 - 0.1 * i) / sqrt(2.0))) /
+				erf (stdev_threshold / sqrt(2.0)));
+		printf("\n");
+//		printf("access probability of top 20%%, 10%% and 5%% records: %.5f %.5f %.5f\n",
+//			(double) ((erf (stdev_threshold * 0.2 / sqrt(2.0))) / (erf (stdev_threshold / sqrt(2.0)))),
+//			(double) ((erf (stdev_threshold * 0.1 / sqrt(2.0))) / (erf (stdev_threshold / sqrt(2.0)))),
+//			(double) ((erf (stdev_threshold * 0.05 / sqrt(2.0))) / (erf (stdev_threshold / sqrt(2.0)))));
+	}
+	/* output in exponential distribution benchmark */
+	else if (use_exponential)
+	{
+		int i;
+		printf("exponential threshold: %.5f\n", exp_threshold);
+		printf("decile percents:");
+		for (i = 1; i <= 10; i++)
+			printf(" %.1f%%",
+				   100.0 * exponentialProbability(i, 10, exp_threshold));
+		printf("\n");
+		printf("highest/lowest percent of the range: %.1f%% %.1f%%\n",
+			   100.0 * exponentialProbability(1, 100, exp_threshold),
+			   100.0 * exponentialProbability(100, 100, exp_threshold));
+	}
+
 	printf("query mode: %s\n", QUERYMODE[querymode]);
 	printf("number of clients: %d\n", nclients);
 	printf("number of threads: %d\n", nthreads);
@@ -2337,6 +2646,8 @@ main(int argc, char **argv)
 		{"unlogged-tables", no_argument, &unlogged_tables, 1},
 		{"sampling-rate", required_argument, NULL, 4},
 		{"aggregate-interval", required_argument, NULL, 5},
+		{"gaussian", required_argument, NULL, 6},
+		{"exponential", required_argument, NULL, 7},
 		{"rate", required_argument, NULL, 'R'},
 		{NULL, 0, NULL, 0}
 	};
@@ -2617,6 +2928,25 @@ main(int argc, char **argv)
 				}
 #endif
 				break;
+			case 6:
+				use_gaussian = true;
+				stdev_threshold = atof(optarg);
+				if(stdev_threshold < MIN_GAUSSIAN_THRESHOLD)
+				{
+					fprintf(stderr, "--gaussian=NUM must be more than %f: %f\n",
+							MIN_GAUSSIAN_THRESHOLD, stdev_threshold);
+					exit(1);
+				}
+				break;
+			case 7:
+				use_exponential = true;
+				exp_threshold = atof(optarg);
+				if(exp_threshold <= 0.0)
+				{
+					fprintf(stderr, "--exponential=NUM must be more 0.0\n");
+					exit(1);
+				}
+				break;
 			default:
 				fprintf(stderr, _("Try \"%s --help\" for more information.\n"), progname);
 				exit(1);
@@ -2814,6 +3144,28 @@ main(int argc, char **argv)
 		}
 	}
 
+	/* set :stdev_threshold variable */
+	if(getVariable(&state[0], "stdev_threshold") == NULL)
+	{
+		snprintf(val, sizeof(val), "%lf", stdev_threshold);
+		for (i = 0; i < nclients; i++)
+		{
+			if (!putVariable(&state[i], "startup", "stdev_threshold", val))
+				exit(1);
+		}
+	}
+
+	/* set :exp_threshold variable */
+	if(getVariable(&state[0], "exp_threshold") == NULL)
+	{
+		snprintf(val, sizeof(val), "%lf", exp_threshold);
+		for (i = 0; i < nclients; i++)
+		{
+			if (!putVariable(&state[i], "startup", "exp_threshold", val))
+				exit(1);
+		}
+	}
+
 	if (!is_no_vacuum)
 	{
 		fprintf(stderr, "starting vacuum...");
@@ -2839,17 +3191,32 @@ main(int argc, char **argv)
 	switch (ttype)
 	{
 		case 0:
-			sql_files[0] = process_builtin(tpc_b);
+			if (use_gaussian)
+				sql_files[0] = process_builtin(gaussian_tpc_b);
+			else if (use_exponential)
+				sql_files[0] = process_builtin(exponential_tpc_b);
+			else
+				sql_files[0] = process_builtin(tpc_b);
 			num_files = 1;
 			break;
 
 		case 1:
-			sql_files[0] = process_builtin(select_only);
+			if (use_gaussian)
+				sql_files[0] = process_builtin(gaussian_select_only);
+			else if (use_exponential)
+				sql_files[0] = process_builtin(exponential_select_only);
+			else
+				sql_files[0] = process_builtin(select_only);
 			num_files = 1;
 			break;
 
 		case 2:
-			sql_files[0] = process_builtin(simple_update);
+			if (use_gaussian)
+				sql_files[0] = process_builtin(gaussian_simple_update);
+			else if (use_exponential)
+				sql_files[0] = process_builtin(exponential_simple_update);
+			else
+				sql_files[0] = process_builtin(simple_update);
 			num_files = 1;
 			break;
 
diff --git a/doc/src/sgml/pgbench.sgml b/doc/src/sgml/pgbench.sgml
index 4367563..3a561a9 100644
--- a/doc/src/sgml/pgbench.sgml
+++ b/doc/src/sgml/pgbench.sgml
@@ -307,6 +307,21 @@ pgbench <optional> <replaceable>options</> </optional> <replaceable>dbname</>
      </varlistentry>
 
      <varlistentry>
+      <term><option>--exponential</option><replaceable>threshold</></term>
+      <listitem>
+       <para>
+         Run exponential distribution pgbench test using this threshold parameter.
+         The threshold controls the distribution of access frequency on the
+         <structname>pgbench_accounts</> table.
+         See the <literal>\setrandom</> documentation below for details about
+         the impact of the threshold value.
+         When set, this option applies to all test variants (<option>-N</> for
+         skipping updates, or <option>-S</> for selects).
+       </para>
+      </listitem>
+     </varlistentry>
+
+     <varlistentry>
       <term><option>-f</option> <replaceable>filename</></term>
       <term><option>--file=</option><replaceable>filename</></term>
       <listitem>
@@ -320,6 +335,21 @@ pgbench <optional> <replaceable>options</> </optional> <replaceable>dbname</>
      </varlistentry>
 
      <varlistentry>
+      <term><option>--gaussian</option><replaceable>threshold</></term>
+      <listitem>
+       <para>
+         Run gaussian distribution pgbench test using this threshold parameter.
+         The threshold controls the distribution of access frequency on the
+         <structname>pgbench_accounts</> table.
+         See the <literal>\setrandom</> documentation below for details about
+         the impact of the threshold value.
+         When set, this option applies to all test variants (<option>-N</> for
+         skipping updates, or <option>-S</> for selects).
+       </para>
+      </listitem>
+     </varlistentry>
+
+     <varlistentry>
       <term><option>-j</option> <replaceable>threads</></term>
       <term><option>--jobs=</option><replaceable>threads</></term>
       <listitem>
@@ -748,8 +778,8 @@ pgbench <optional> <replaceable>options</> </optional> <replaceable>dbname</>
 
    <varlistentry>
     <term>
-     <literal>\setrandom <replaceable>varname</> <replaceable>min</> <replaceable>max</></literal>
-    </term>
+     <literal>\setrandom <replaceable>varname</> <replaceable>min</> <replaceable>max</> [ uniform | [ { gaussian | exponential } <replaceable>threshold</> ] ]</literal>
+     </term>
 
     <listitem>
      <para>
@@ -761,9 +791,44 @@ pgbench <optional> <replaceable>options</> </optional> <replaceable>dbname</>
      </para>
 
      <para>
+      The default random distribution is uniform. The gaussian and exponential
+      options allow to change the distribution. The mandatory
+      <replaceable>threshold</> double value controls the actual distribution.
+     </para>
+
+     <para>
+      With the gaussian option, the larger the <replaceable>threshold</>,
+      the more frequently values close to the middle of the interval are drawn,
+      and the less frequently values close to the <replaceable>min</> and
+      <replaceable>max</> bounds.
+      In other worlds, the larger the <replaceable>threshold</>,
+      the narrower the access range around the middle.
+      the smaller the threshold, the smoother the access pattern
+      distribution. The minimum threshold is 2.0 for performance.
+     </para>
+
+     <para>
+      With the exponential option, the <replaceable>threshold</> parameter
+      controls the distribution by truncating an exponential distribution at
+      a specific value, and then projecting onto integers between the bounds.
+      To be precise, the <replaceable>threshold</> is so that the density of
+      probability of the exponential distribution at the <replaceable>max</>
+      cut-off value is exp(-threshold), the density at the <replaceable>min</>
+      value being 1.
+      Intuitively, the larger the threshold, the more frequently values close to
+      <replaceable>min</> are accessed, and the less frequently values close to
+      <replaceable>max</> are accessed.
+      A crude approximation of the distribution is that the most frequent 1%
+      values are drawn <replaceable>threshold</>% of the time.
+      The closer to 0.0 the threshold, the flatter (more uniform) the access
+      distribution.
+      The threshold value must be strictly positive with the exponential option.
+     </para>
+
+     <para>
       Example:
 <programlisting>
-\setrandom aid 1 :naccounts
+\setrandom aid 1 :naccounts gaussian 5.0
 </programlisting></para>
     </listitem>
    </varlistentry>
-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Reply via email to