Hello Fabien-san,
I have checked your v13 patch and tested the new exponential distribution
generation algorithm. It works fine, with less or no overhead compared to the
previous version.
Great work! I agree with your proposal.
I'm also interested in your "decile percents" output, shown below:
> [nttcom@localhost postgresql]$ contrib/pgbench/pgbench --exponential=20
> ~
> decile percents: 86.5% 11.7% 1.6% 0.2% 0.0% 0.0% 0.0% 0.0% 0.0% 0.0%
> ~
> [nttcom@localhost postgresql]$ contrib/pgbench/pgbench --exponential=10
> ~
> decile percents: 63.2% 23.3% 8.6% 3.1% 1.2% 0.4% 0.2% 0.1% 0.0% 0.0%
> ~
> [nttcom@localhost postgresql]$ contrib/pgbench/pgbench --exponential=5
> ~
> decile percents: 39.6% 24.0% 14.6% 8.8% 5.4% 3.3% 2.0% 1.2% 0.7% 0.4%
> ~
I think this output makes the exponential distribution easy to understand for
a given exponential parameter, so I agree with it. I have therefore added a
decile percents output for the gaussian distribution as well.
Here are some examples.
> [nttcom@localhost postgresql]$ contrib/pgbench/pgbench --gaussian=20
> ~
> decile percents: 0.0% 0.0% 0.0% 0.0% 50.0% 50.0% 0.0% 0.0% 0.0% 0.0%
> ~
> [nttcom@localhost postgresql]$ contrib/pgbench/pgbench --gaussian=10
> ~
> decile percents: 0.0% 0.0% 0.0% 2.3% 47.7% 47.7% 2.3% 0.0% 0.0% 0.0%
> ~
> [nttcom@localhost postgresql]$ contrib/pgbench/pgbench --gaussian=5
> ~
> decile percents: 0.0% 0.1% 2.1% 13.6% 34.1% 34.1% 13.6% 2.1% 0.1% 0.0%
I think this is easier to understand than before. The decile percents sum to 100%.
However, I do not like the "highest/lowest percent" output, because users
will confuse it with the decile percents, and nobody can understand these
digits at first sight.
Here is an example with exponential=5:
> [nttcom@localhost postgresql]$ contrib/pgbench/pgbench --exponential=5
> ~
> decile percents: 39.6% 24.0% 14.6% 8.8% 5.4% 3.3% 2.0% 1.2% 0.7% 0.4%
> highest/lowest percent of the range: 4.9% 0.0%
> ~
I could not understand "4.9% 0.0%" when I first saw it.
I only understood it after checking the source code :( That is not a good design...
#Why does this parameter use 100?
So I'd like to remove it, if you don't mind. The output will be simpler.
The attached patch is the fixed version; please check it.
#Of course, the World Cup is being held now, so I'm in no hurry at all.
Best regards,
--
Mitsumasa KONDO
*** a/contrib/pgbench/pgbench.c
--- b/contrib/pgbench/pgbench.c
***************
*** 41,46 ****
--- 41,47 ----
#include <math.h>
#include <signal.h>
#include <sys/time.h>
+ #include <assert.h>
#ifdef HAVE_SYS_SELECT_H
#include <sys/select.h>
#endif
***************
*** 98,103 **** static int pthread_join(pthread_t th, void **thread_return);
--- 99,106 ----
#define LOG_STEP_SECONDS 5 /* seconds between log messages */
#define DEFAULT_NXACTS 10 /* default nxacts */
+ #define MIN_GAUSSIAN_THRESHOLD 2.0 /* minimum threshold for gauss */
+
int nxacts = 0; /* number of transactions per client */
int duration = 0; /* duration in seconds */
***************
*** 171,176 **** bool is_connect; /* establish connection for each transaction */
--- 174,187 ----
bool is_latencies; /* report per-command latencies */
int main_pid; /* main process id used in log filename */
+ /* gaussian distribution tests: */
+ double stdev_threshold; /* standard deviation threshold */
+ bool use_gaussian = false;
+
+ /* exponential distribution tests: */
+ double exp_threshold; /* threshold for exponential */
+ bool use_exponential = false;
+
char *pghost = "";
char *pgport = "";
char *login = NULL;
***************
*** 332,337 **** static char *select_only = {
--- 343,430 ----
"SELECT abalance FROM pgbench_accounts WHERE aid = :aid;\n"
};
+ /* --exponential case */
+ static char *exponential_tpc_b = {
+ "\\set nbranches " CppAsString2(nbranches) " * :scale\n"
+ "\\set ntellers " CppAsString2(ntellers) " * :scale\n"
+ "\\set naccounts " CppAsString2(naccounts) " * :scale\n"
+ "\\setrandom aid 1 :naccounts exponential :exp_threshold\n"
+ "\\setrandom bid 1 :nbranches\n"
+ "\\setrandom tid 1 :ntellers\n"
+ "\\setrandom delta -5000 5000\n"
+ "BEGIN;\n"
+ "UPDATE pgbench_accounts SET abalance = abalance + :delta WHERE aid = :aid;\n"
+ "SELECT abalance FROM pgbench_accounts WHERE aid = :aid;\n"
+ "UPDATE pgbench_tellers SET tbalance = tbalance + :delta WHERE tid = :tid;\n"
+ "UPDATE pgbench_branches SET bbalance = bbalance + :delta WHERE bid = :bid;\n"
+ "INSERT INTO pgbench_history (tid, bid, aid, delta, mtime) VALUES (:tid, :bid, :aid, :delta, CURRENT_TIMESTAMP);\n"
+ "END;\n"
+ };
+
+ /* --exponential with -N case */
+ static char *exponential_simple_update = {
+ "\\set nbranches " CppAsString2(nbranches) " * :scale\n"
+ "\\set ntellers " CppAsString2(ntellers) " * :scale\n"
+ "\\set naccounts " CppAsString2(naccounts) " * :scale\n"
+ "\\setrandom aid 1 :naccounts exponential :exp_threshold\n"
+ "\\setrandom bid 1 :nbranches\n"
+ "\\setrandom tid 1 :ntellers\n"
+ "\\setrandom delta -5000 5000\n"
+ "BEGIN;\n"
+ "UPDATE pgbench_accounts SET abalance = abalance + :delta WHERE aid = :aid;\n"
+ "SELECT abalance FROM pgbench_accounts WHERE aid = :aid;\n"
+ "INSERT INTO pgbench_history (tid, bid, aid, delta, mtime) VALUES (:tid, :bid, :aid, :delta, CURRENT_TIMESTAMP);\n"
+ "END;\n"
+ };
+
+ /* --exponential with -S case */
+ static char *exponential_select_only = {
+ "\\set naccounts " CppAsString2(naccounts) " * :scale\n"
+ "\\setrandom aid 1 :naccounts exponential :exp_threshold\n"
+ "SELECT abalance FROM pgbench_accounts WHERE aid = :aid;\n"
+ };
+
+ /* --gaussian case */
+ static char *gaussian_tpc_b = {
+ "\\set nbranches " CppAsString2(nbranches) " * :scale\n"
+ "\\set ntellers " CppAsString2(ntellers) " * :scale\n"
+ "\\set naccounts " CppAsString2(naccounts) " * :scale\n"
+ "\\setrandom aid 1 :naccounts gaussian :stdev_threshold\n"
+ "\\setrandom bid 1 :nbranches\n"
+ "\\setrandom tid 1 :ntellers\n"
+ "\\setrandom delta -5000 5000\n"
+ "BEGIN;\n"
+ "UPDATE pgbench_accounts SET abalance = abalance + :delta WHERE aid = :aid;\n"
+ "SELECT abalance FROM pgbench_accounts WHERE aid = :aid;\n"
+ "UPDATE pgbench_tellers SET tbalance = tbalance + :delta WHERE tid = :tid;\n"
+ "UPDATE pgbench_branches SET bbalance = bbalance + :delta WHERE bid = :bid;\n"
+ "INSERT INTO pgbench_history (tid, bid, aid, delta, mtime) VALUES (:tid, :bid, :aid, :delta, CURRENT_TIMESTAMP);\n"
+ "END;\n"
+ };
+
+ /* --gaussian with -N case */
+ static char *gaussian_simple_update = {
+ "\\set nbranches " CppAsString2(nbranches) " * :scale\n"
+ "\\set ntellers " CppAsString2(ntellers) " * :scale\n"
+ "\\set naccounts " CppAsString2(naccounts) " * :scale\n"
+ "\\setrandom aid 1 :naccounts gaussian :stdev_threshold\n"
+ "\\setrandom bid 1 :nbranches\n"
+ "\\setrandom tid 1 :ntellers\n"
+ "\\setrandom delta -5000 5000\n"
+ "BEGIN;\n"
+ "UPDATE pgbench_accounts SET abalance = abalance + :delta WHERE aid = :aid;\n"
+ "SELECT abalance FROM pgbench_accounts WHERE aid = :aid;\n"
+ "INSERT INTO pgbench_history (tid, bid, aid, delta, mtime) VALUES (:tid, :bid, :aid, :delta, CURRENT_TIMESTAMP);\n"
+ "END;\n"
+ };
+
+ /* --gaussian with -S case */
+ static char *gaussian_select_only = {
+ "\\set naccounts " CppAsString2(naccounts) " * :scale\n"
+ "\\setrandom aid 1 :naccounts gaussian :stdev_threshold\n"
+ "SELECT abalance FROM pgbench_accounts WHERE aid = :aid;\n"
+ };
+
/* Function prototypes */
static void setalarm(int seconds);
static void *threadRun(void *arg);
***************
*** 375,380 **** usage(void)
--- 468,475 ----
" -v, --vacuum-all vacuum all four standard tables before tests\n"
" --aggregate-interval=NUM aggregate data over NUM seconds\n"
" --sampling-rate=NUM fraction of transactions to log (e.g. 0.01 for 1%%)\n"
+ " --exponential=NUM exponential distribution with NUM threshold parameter\n"
+ " --gaussian=NUM gaussian distribution with NUM threshold parameter\n"
"\nCommon options:\n"
" -d, --debug print debugging output\n"
" -h, --host=HOSTNAME database server host or socket directory\n"
***************
*** 471,476 **** getrand(TState *thread, int64 min, int64 max)
--- 566,641 ----
return min + (int64) ((max - min + 1) * pg_erand48(thread->random_state));
}
+ /*
+ * random number generator: exponential distribution from min to max inclusive.
+ * the threshold is such that the probability density at the cut-off max
+ * value is exp(-exp_threshold).
+ */
+ static int64
+ getExponentialrand(TState *thread, int64 min, int64 max, double exp_threshold)
+ {
+ double cut, uniform, rand;
+ assert(exp_threshold > 0.0);
+ cut = exp(-exp_threshold);
+ /* erand in [0, 1), uniform in (0, 1] */
+ uniform = 1.0 - pg_erand48(thread->random_state);
+ /*
+ * inner expression in (cut, 1] (if exp_threshold > 0),
+ * rand in [0, 1)
+ */
+ assert((1.0 - cut) != 0.0);
+ rand = - log(cut + (1.0 - cut) * uniform) / exp_threshold;
+ /* return int64 random number between min and max */
+ return min + (int64)((max - min + 1) * rand);
+ }
+
+ /* random number generator: gaussian distribution from min to max inclusive */
+ static int64
+ getGaussianrand(TState *thread, int64 min, int64 max, double stdev_threshold)
+ {
+ double stdev;
+ double rand;
+
+ /*
+ * Get user specified random number from this loop, with
+ * -stdev_threshold <= stdev < stdev_threshold
+ *
+ * This loop is executed until the number is in the expected range.
+ *
+ * As the minimum threshold is 2.0, the probability of looping is low:
+ * sqrt(-2 ln(r)) <= 2 => r >= e^{-2} ~ 0.135, then when taking the average
+ * sine multiplier as 2/pi, we have an 8.6% looping probability in the
+ * worst case. For a 5.0 threshold value, the looping probability
+ * is about e^{-5} * 2 / pi ~ 0.43%.
+ */
+ do
+ {
+ /*
+ * pg_erand48 generates [0,1), but for the basic version of the
+ * Box-Muller transform the two uniformly distributed random numbers
+ * are expected in (0, 1] (see http://en.wikipedia.org/wiki/Box_muller)
+ */
+ double rand1 = 1.0 - pg_erand48(thread->random_state);
+ double rand2 = 1.0 - pg_erand48(thread->random_state);
+
+ /* Box-Muller basic form transform */
+ double var_sqrt = sqrt(-2.0 * log(rand1));
+ stdev = var_sqrt * sin(2.0 * M_PI * rand2);
+
+ /*
+ * we may try with cos, but there may be a bias induced if the previous
+ * value fails the test? To be on the safe side, let us try over.
+ */
+ }
+ while (stdev < -stdev_threshold || stdev >= stdev_threshold);
+
+ /* stdev is in [-threshold, threshold), normalization to [0,1) */
+ rand = (stdev + stdev_threshold) / (stdev_threshold * 2.0);
+
+ /* return int64 random number between min and max */
+ return min + (int64)((max - min + 1) * rand);
+ }
+
/* call PQexec() and exit() on failure */
static void
executeStatement(PGconn *con, const char *sql)
***************
*** 1319,1324 **** top:
--- 1484,1490 ----
char *var;
int64 min,
max;
+ double threshold = 0;
char res[64];
if (*argv[2] == ':')
***************
*** 1364,1374 **** top:
}
/*
! * getrand() needs to be able to subtract max from min and add one
! * to the result without overflowing. Since we know max > min, we
! * can detect overflow just by checking for a negative result. But
! * we must check both that the subtraction doesn't overflow, and
! * that adding one to the result doesn't overflow either.
*/
if (max - min < 0 || (max - min) + 1 < 0)
{
--- 1530,1540 ----
}
/*
! * The random number generation functions need to be able to subtract
! * max from min and add one to the result without overflowing.
! * Since we know max > min, we can detect overflow just by checking
! * for a negative result. But we must check both that the subtraction
! * doesn't overflow, and that adding one to the result doesn't overflow either.
*/
if (max - min < 0 || (max - min) + 1 < 0)
{
***************
*** 1377,1386 **** top:
return true;
}
#ifdef DEBUG
! printf("min: " INT64_FORMAT " max: " INT64_FORMAT " random: " INT64_FORMAT "\n", min, max, getrand(thread, min, max));
#endif
! snprintf(res, sizeof(res), INT64_FORMAT, getrand(thread, min, max));
if (!putVariable(st, argv[0], argv[1], res))
{
--- 1543,1605 ----
return true;
}
+ if (argc == 4) /* uniform */
+ {
#ifdef DEBUG
! printf("min: " INT64_FORMAT " max: " INT64_FORMAT " random: " INT64_FORMAT "\n", min, max, getrand(thread, min, max));
#endif
! snprintf(res, sizeof(res), INT64_FORMAT, getrand(thread, min, max));
! }
! else if ((pg_strcasecmp(argv[4], "gaussian") == 0) ||
! (pg_strcasecmp(argv[4], "exponential") == 0))
! {
! if (*argv[5] == ':')
! {
! if ((var = getVariable(st, argv[5] + 1)) == NULL)
! {
! fprintf(stderr, "%s: invalid threshold number %s\n", argv[0], argv[5]);
! st->ecnt++;
! return true;
! }
! threshold = strtod(var, NULL);
! }
! else
! threshold = strtod(argv[5], NULL);
!
! if (pg_strcasecmp(argv[4], "gaussian") == 0)
! {
! if (threshold < MIN_GAUSSIAN_THRESHOLD)
! {
! fprintf(stderr, "%s: gaussian threshold must be at least %f\n", argv[5], MIN_GAUSSIAN_THRESHOLD);
! st->ecnt++;
! return true;
! }
! #ifdef DEBUG
! printf("min: " INT64_FORMAT " max: " INT64_FORMAT " random: " INT64_FORMAT "\n", min, max, getGaussianrand(thread, min, max, threshold));
! #endif
! snprintf(res, sizeof(res), INT64_FORMAT, getGaussianrand(thread, min, max, threshold));
! }
! else if (pg_strcasecmp(argv[4], "exponential") == 0)
! {
! if (threshold <= 0.0)
! {
! fprintf(stderr, "%s: exponential threshold must be strictly positive\n", argv[5]);
! st->ecnt++;
! return true;
! }
! #ifdef DEBUG
! printf("min: " INT64_FORMAT " max: " INT64_FORMAT " random: " INT64_FORMAT "\n", min, max, getExponentialrand(thread, min, max, threshold));
! #endif
! snprintf(res, sizeof(res), INT64_FORMAT, getExponentialrand(thread, min, max, threshold));
! }
! }
! else /* uniform with extra arguments */
! {
! #ifdef DEBUG
! printf("min: " INT64_FORMAT " max: " INT64_FORMAT " random: " INT64_FORMAT "\n", min, max, getrand(thread, min, max));
! #endif
! snprintf(res, sizeof(res), INT64_FORMAT, getrand(thread, min, max));
! }
if (!putVariable(st, argv[0], argv[1], res))
{
***************
*** 1920,1928 **** process_commands(char *buf)
exit(1);
}
! for (j = 4; j < my_commands->argc; j++)
! fprintf(stderr, "%s: extra argument \"%s\" ignored\n",
! my_commands->argv[0], my_commands->argv[j]);
}
else if (pg_strcasecmp(my_commands->argv[0], "set") == 0)
{
--- 2139,2172 ----
exit(1);
}
! if (my_commands->argc == 4 ) /* uniform */
! {
! /* nothing to do */
! }
! else if ((pg_strcasecmp(my_commands->argv[4], "gaussian") == 0) ||
! (pg_strcasecmp(my_commands->argv[4], "exponential") == 0))
! {
! if (my_commands->argc < 6)
! {
! fprintf(stderr, "%s(%s): missing argument\n", my_commands->argv[0], my_commands->argv[4]);
! exit(1);
! }
!
! for (j = 6; j < my_commands->argc; j++)
! fprintf(stderr, "%s(%s): extra argument \"%s\" ignored\n",
! my_commands->argv[0], my_commands->argv[4], my_commands->argv[j]);
! }
! else /* uniform with extra argument */
! {
! int arg_pos = 4;
!
! if (pg_strcasecmp(my_commands->argv[4], "uniform") == 0)
! arg_pos++;
!
! for (j = arg_pos; j < my_commands->argc; j++)
! fprintf(stderr, "%s(uniform): extra argument \"%s\" ignored\n",
! my_commands->argv[0], my_commands->argv[j]);
! }
}
else if (pg_strcasecmp(my_commands->argv[0], "set") == 0)
{
***************
*** 2178,2183 **** process_builtin(char *tb)
--- 2422,2439 ----
return my_commands;
}
+ /*
+ * compute the probability of the truncated exponential random generation
+ * to draw values in the i-th slot of the range.
+ */
+ static double exponentialProbability(int i, int slots, double threshold)
+ {
+ assert(1 <= i && i <= slots);
+ return (exp(- threshold * (i - 1) / slots) - exp(- threshold * i / slots)) /
+ (1.0 - exp(- threshold));
+ }
+
+
/* print out results */
static void
printResults(int ttype, int64 normal_xacts, int nclients,
***************
*** 2197,2212 **** printResults(int ttype, int64 normal_xacts, int nclients,
(INSTR_TIME_GET_DOUBLE(conn_total_time) / nthreads));
if (ttype == 0)
! s = "TPC-B (sort of)";
else if (ttype == 2)
! s = "Update only pgbench_accounts";
else if (ttype == 1)
! s = "SELECT only";
else
s = "Custom query";
printf("transaction type: %s\n", s);
printf("scaling factor: %d\n", scale);
printf("query mode: %s\n", QUERYMODE[querymode]);
printf("number of clients: %d\n", nclients);
printf("number of threads: %d\n", nthreads);
--- 2453,2521 ----
(INSTR_TIME_GET_DOUBLE(conn_total_time) / nthreads));
if (ttype == 0)
! {
! if (use_gaussian)
! s = "Gaussian distribution TPC-B (sort of)";
! else if (use_exponential)
! s = "Exponential distribution TPC-B (sort of)";
! else
! s = "TPC-B (sort of)";
! }
else if (ttype == 2)
! {
! if (use_gaussian)
! s = "Gaussian distribution update only pgbench_accounts";
! else if (use_exponential)
! s = "Exponential distribution update only pgbench_accounts";
! else
! s = "Update only pgbench_accounts";
! }
else if (ttype == 1)
! {
! if (use_gaussian)
! s = "Gaussian distribution SELECT only";
! else if (use_exponential)
! s = "Exponential distribution SELECT only";
! else
! s = "SELECT only";
! }
else
s = "Custom query";
printf("transaction type: %s\n", s);
printf("scaling factor: %d\n", scale);
+
+ /* output in gaussian distribution benchmark */
+ if (use_gaussian)
+ {
+ int i;
+ printf("standard deviation threshold: %.5f\n", stdev_threshold);
+ printf("decile percents:");
+ for (i = 2; i <= 20; i = i + 2)
+ printf(" %.1f%%", (double) 50 * (erf (stdev_threshold * (1 - 0.1 * (i - 2)) / sqrt(2.0)) -
+ erf (stdev_threshold * (1 - 0.1 * i) / sqrt(2.0))) /
+ erf (stdev_threshold / sqrt(2.0)));
+ printf("\n");
+ // printf("access probability of top 20%%, 10%% and 5%% records: %.5f %.5f %.5f\n",
+ // (double) ((erf (stdev_threshold * 0.2 / sqrt(2.0))) / (erf (stdev_threshold / sqrt(2.0)))),
+ // (double) ((erf (stdev_threshold * 0.1 / sqrt(2.0))) / (erf (stdev_threshold / sqrt(2.0)))),
+ // (double) ((erf (stdev_threshold * 0.05 / sqrt(2.0))) / (erf (stdev_threshold / sqrt(2.0)))));
+ }
+ /* output in exponential distribution benchmark */
+ else if (use_exponential)
+ {
+ int i;
+ printf("exponential threshold: %.5f\n", exp_threshold);
+ printf("decile percents:");
+ for (i = 1; i <= 10; i++)
+ printf(" %.1f%%",
+ 100.0 * exponentialProbability(i, 10, exp_threshold));
+ printf("\n");
+ printf("highest/lowest percent of the range: %.1f%% %.1f%%\n",
+ 100.0 * exponentialProbability(1, 100, exp_threshold),
+ 100.0 * exponentialProbability(100, 100, exp_threshold));
+ }
+
printf("query mode: %s\n", QUERYMODE[querymode]);
printf("number of clients: %d\n", nclients);
printf("number of threads: %d\n", nthreads);
***************
*** 2337,2342 **** main(int argc, char **argv)
--- 2646,2653 ----
{"unlogged-tables", no_argument, &unlogged_tables, 1},
{"sampling-rate", required_argument, NULL, 4},
{"aggregate-interval", required_argument, NULL, 5},
+ {"gaussian", required_argument, NULL, 6},
+ {"exponential", required_argument, NULL, 7},
{"rate", required_argument, NULL, 'R'},
{NULL, 0, NULL, 0}
};
***************
*** 2617,2622 **** main(int argc, char **argv)
--- 2928,2952 ----
}
#endif
break;
+ case 6:
+ use_gaussian = true;
+ stdev_threshold = atof(optarg);
+ if(stdev_threshold < MIN_GAUSSIAN_THRESHOLD)
+ {
+ fprintf(stderr, "--gaussian=NUM must be at least %f: %f\n",
+ MIN_GAUSSIAN_THRESHOLD, stdev_threshold);
+ exit(1);
+ }
+ break;
+ case 7:
+ use_exponential = true;
+ exp_threshold = atof(optarg);
+ if(exp_threshold <= 0.0)
+ {
+ fprintf(stderr, "--exponential=NUM must be more than 0.0\n");
+ exit(1);
+ }
+ break;
default:
fprintf(stderr, _("Try \"%s --help\" for more information.\n"), progname);
exit(1);
***************
*** 2814,2819 **** main(int argc, char **argv)
--- 3144,3171 ----
}
}
+ /* set :stdev_threshold variable */
+ if(getVariable(&state[0], "stdev_threshold") == NULL)
+ {
+ snprintf(val, sizeof(val), "%lf", stdev_threshold);
+ for (i = 0; i < nclients; i++)
+ {
+ if (!putVariable(&state[i], "startup", "stdev_threshold", val))
+ exit(1);
+ }
+ }
+
+ /* set :exp_threshold variable */
+ if(getVariable(&state[0], "exp_threshold") == NULL)
+ {
+ snprintf(val, sizeof(val), "%lf", exp_threshold);
+ for (i = 0; i < nclients; i++)
+ {
+ if (!putVariable(&state[i], "startup", "exp_threshold", val))
+ exit(1);
+ }
+ }
+
if (!is_no_vacuum)
{
fprintf(stderr, "starting vacuum...");
***************
*** 2839,2855 **** main(int argc, char **argv)
switch (ttype)
{
case 0:
! sql_files[0] = process_builtin(tpc_b);
num_files = 1;
break;
case 1:
! sql_files[0] = process_builtin(select_only);
num_files = 1;
break;
case 2:
! sql_files[0] = process_builtin(simple_update);
num_files = 1;
break;
--- 3191,3222 ----
switch (ttype)
{
case 0:
! if (use_gaussian)
! sql_files[0] = process_builtin(gaussian_tpc_b);
! else if (use_exponential)
! sql_files[0] = process_builtin(exponential_tpc_b);
! else
! sql_files[0] = process_builtin(tpc_b);
num_files = 1;
break;
case 1:
! if (use_gaussian)
! sql_files[0] = process_builtin(gaussian_select_only);
! else if (use_exponential)
! sql_files[0] = process_builtin(exponential_select_only);
! else
! sql_files[0] = process_builtin(select_only);
num_files = 1;
break;
case 2:
! if (use_gaussian)
! sql_files[0] = process_builtin(gaussian_simple_update);
! else if (use_exponential)
! sql_files[0] = process_builtin(exponential_simple_update);
! else
! sql_files[0] = process_builtin(simple_update);
num_files = 1;
break;
*** a/doc/src/sgml/pgbench.sgml
--- b/doc/src/sgml/pgbench.sgml
***************
*** 307,312 **** pgbench <optional> <replaceable>options</> </optional> <replaceable>dbname</>
--- 307,327 ----
</varlistentry>
<varlistentry>
+ <term><option>--exponential</option><replaceable>threshold</></term>
+ <listitem>
+ <para>
+ Run exponential distribution pgbench test using this threshold parameter.
+ The threshold controls the distribution of access frequency on the
+ <structname>pgbench_accounts</> table.
+ See the <literal>\setrandom</> documentation below for details about
+ the impact of the threshold value.
+ When set, this option applies to all test variants (<option>-N</> for
+ skipping updates, or <option>-S</> for selects).
+ </para>
+ </listitem>
+ </varlistentry>
+
+ <varlistentry>
<term><option>-f</option> <replaceable>filename</></term>
<term><option>--file=</option><replaceable>filename</></term>
<listitem>
***************
*** 320,325 **** pgbench <optional> <replaceable>options</> </optional> <replaceable>dbname</>
--- 335,355 ----
</varlistentry>
<varlistentry>
+ <term><option>--gaussian</option><replaceable>threshold</></term>
+ <listitem>
+ <para>
+ Run gaussian distribution pgbench test using this threshold parameter.
+ The threshold controls the distribution of access frequency on the
+ <structname>pgbench_accounts</> table.
+ See the <literal>\setrandom</> documentation below for details about
+ the impact of the threshold value.
+ When set, this option applies to all test variants (<option>-N</> for
+ skipping updates, or <option>-S</> for selects).
+ </para>
+ </listitem>
+ </varlistentry>
+
+ <varlistentry>
<term><option>-j</option> <replaceable>threads</></term>
<term><option>--jobs=</option><replaceable>threads</></term>
<listitem>
***************
*** 748,755 **** pgbench <optional> <replaceable>options</> </optional> <replaceable>dbname</>
<varlistentry>
<term>
! <literal>\setrandom <replaceable>varname</> <replaceable>min</> <replaceable>max</></literal>
! </term>
<listitem>
<para>
--- 778,785 ----
<varlistentry>
<term>
! <literal>\setrandom <replaceable>varname</> <replaceable>min</> <replaceable>max</> [ uniform | [ { gaussian | exponential } <replaceable>threshold</> ] ]</literal>
! </term>
<listitem>
<para>
***************
*** 761,769 **** pgbench <optional> <replaceable>options</> </optional> <replaceable>dbname</>
</para>
<para>
Example:
<programlisting>
! \setrandom aid 1 :naccounts
</programlisting></para>
</listitem>
</varlistentry>
--- 791,834 ----
</para>
<para>
+ The default random distribution is uniform. The gaussian and exponential
+ options allow changing the distribution. The mandatory
+ <replaceable>threshold</> double value controls the actual distribution.
+ </para>
+
+ <para>
+ With the gaussian option, the larger the <replaceable>threshold</>,
+ the more frequently values close to the middle of the interval are drawn,
+ and the less frequently values close to the <replaceable>min</> and
+ <replaceable>max</> bounds.
+ In other words, the larger the <replaceable>threshold</>,
+ the narrower the access range around the middle.
+ The smaller the threshold, the smoother the access pattern
+ distribution. The minimum threshold is 2.0 for performance.
+ </para>
+
+ <para>
+ With the exponential option, the <replaceable>threshold</> parameter
+ controls the distribution by truncating an exponential distribution at
+ a specific value, and then projecting onto integers between the bounds.
+ To be precise, the <replaceable>threshold</> is such that the density of
+ probability of the exponential distribution at the <replaceable>max</>
+ cut-off value is exp(-threshold), the density at the <replaceable>min</>
+ value being 1.
+ Intuitively, the larger the threshold, the more frequently values close to
+ <replaceable>min</> are accessed, and the less frequently values close to
+ <replaceable>max</> are accessed.
+ A crude approximation of the distribution is that the most frequent 1% of
+ values are drawn <replaceable>threshold</>% of the time.
+ The closer to 0.0 the threshold, the flatter (more uniform) the access
+ distribution.
+ The threshold value must be strictly positive with the exponential option.
+ </para>
+
+ <para>
Example:
<programlisting>
! \setrandom aid 1 :naccounts gaussian 5.0
</programlisting></para>
</listitem>
</varlistentry>