Hello Fabien-san,

I have checked your v13 patch and tested the new exponential distribution generating algorithm. It works fine, with less or no overhead compared with the previous version. Great work! And I agree with your proposal.
I'm also interested in your "decile percents" output, like the following:

> [nttcom@localhost postgresql]$ contrib/pgbench/pgbench --exponential=20
> ~
> decile percents: 86.5% 11.7% 1.6% 0.2% 0.0% 0.0% 0.0% 0.0% 0.0% 0.0%
> ~
> [nttcom@localhost postgresql]$ contrib/pgbench/pgbench --exponential=10
> ~
> decile percents: 63.2% 23.3% 8.6% 3.1% 1.2% 0.4% 0.2% 0.1% 0.0% 0.0%
> ~
> [nttcom@localhost postgresql]$ contrib/pgbench/pgbench --exponential=5
> ~
> decile percents: 39.6% 24.0% 14.6% 8.8% 5.4% 3.3% 2.0% 1.2% 0.7% 0.4%
> ~

I think this output makes the exponential distribution easy to understand when I check the exponential parameter, so I agree with it too. I have therefore added a decile percents output for the gaussian distribution as well. Here are some examples:

> [nttcom@localhost postgresql]$ contrib/pgbench/pgbench --gaussian=20
> ~
> decile percents: 0.0% 0.0% 0.0% 0.0% 50.0% 50.0% 0.0% 0.0% 0.0% 0.0%
> ~
> [nttcom@localhost postgresql]$ contrib/pgbench/pgbench --gaussian=10
> ~
> decile percents: 0.0% 0.0% 0.0% 2.3% 47.7% 47.7% 2.3% 0.0% 0.0% 0.0%
> ~
> [nttcom@localhost postgresql]$ contrib/pgbench/pgbench --gaussian=5
> ~
> decile percents: 0.0% 0.1% 2.1% 13.6% 34.1% 34.1% 13.6% 2.1% 0.1% 0.0%

I think this is easier to understand than before. The decile percents sum to exactly 100%.

However, I don't like the "highest/lowest percent of the range" output, because users will confuse it with the decile percentages, and nobody can tell what these numbers mean. Here is an example with exponential=5:

> [nttcom@localhost postgresql]$ contrib/pgbench/pgbench --exponential=5
> ~
> decile percents: 39.6% 24.0% 14.6% 8.8% 5.4% 3.3% 2.0% 1.2% 0.7% 0.4%
> highest/lowest percent of the range: 4.9% 0.0%
> ~

I could not understand "4.9% 0.0%" the first time I saw it. Only after checking the source code did I understand it :( That is not good design...
#Why does this parameter use 100?

So I'd like to remove it, if you agree. The output will be simpler.

The attached patch is the fixed version; please confirm it.
#Of course, the World Cup is being held now. I'm not in a hurry at all.

Best regards,
--
Mitsumasa KONDO
*** a/contrib/pgbench/pgbench.c --- b/contrib/pgbench/pgbench.c *************** *** 41,46 **** --- 41,47 ---- #include <math.h> #include <signal.h> #include <sys/time.h> + #include <assert.h> #ifdef HAVE_SYS_SELECT_H #include <sys/select.h> #endif *************** *** 98,103 **** static int pthread_join(pthread_t th, void **thread_return); --- 99,106 ---- #define LOG_STEP_SECONDS 5 /* seconds between log messages */ #define DEFAULT_NXACTS 10 /* default nxacts */ + #define MIN_GAUSSIAN_THRESHOLD 2.0 /* minimum threshold for gauss */ + int nxacts = 0; /* number of transactions per client */ int duration = 0; /* duration in seconds */ *************** *** 171,176 **** bool is_connect; /* establish connection for each transaction */ --- 174,187 ---- bool is_latencies; /* report per-command latencies */ int main_pid; /* main process id used in log filename */ + /* gaussian distribution tests: */ + double stdev_threshold; /* standard deviation threshold */ + bool use_gaussian = false; + + /* exponential distribution tests: */ + double exp_threshold; /* threshold for exponential */ + bool use_exponential = false; + char *pghost = ""; char *pgport = ""; char *login = NULL; *************** *** 332,337 **** static char *select_only = { --- 343,430 ---- "SELECT abalance FROM pgbench_accounts WHERE aid = :aid;\n" }; + /* --exponential case */ + static char *exponential_tpc_b = { + "\\set nbranches " CppAsString2(nbranches) " * :scale\n" + "\\set ntellers " CppAsString2(ntellers) " * :scale\n" + "\\set naccounts " CppAsString2(naccounts) " * :scale\n" + "\\setrandom aid 1 :naccounts exponential :exp_threshold\n" + "\\setrandom bid 1 :nbranches\n" + "\\setrandom tid 1 :ntellers\n" + "\\setrandom delta -5000 5000\n" + "BEGIN;\n" + "UPDATE pgbench_accounts SET abalance = abalance + :delta WHERE aid = :aid;\n" + "SELECT abalance FROM pgbench_accounts WHERE aid = :aid;\n" + "UPDATE pgbench_tellers SET tbalance = tbalance + :delta WHERE tid = :tid;\n" + "UPDATE pgbench_branches SET 
bbalance = bbalance + :delta WHERE bid = :bid;\n" + "INSERT INTO pgbench_history (tid, bid, aid, delta, mtime) VALUES (:tid, :bid, :aid, :delta, CURRENT_TIMESTAMP);\n" + "END;\n" + }; + + /* --exponential with -N case */ + static char *exponential_simple_update = { + "\\set nbranches " CppAsString2(nbranches) " * :scale\n" + "\\set ntellers " CppAsString2(ntellers) " * :scale\n" + "\\set naccounts " CppAsString2(naccounts) " * :scale\n" + "\\setrandom aid 1 :naccounts exponential :exp_threshold\n" + "\\setrandom bid 1 :nbranches\n" + "\\setrandom tid 1 :ntellers\n" + "\\setrandom delta -5000 5000\n" + "BEGIN;\n" + "UPDATE pgbench_accounts SET abalance = abalance + :delta WHERE aid = :aid;\n" + "SELECT abalance FROM pgbench_accounts WHERE aid = :aid;\n" + "INSERT INTO pgbench_history (tid, bid, aid, delta, mtime) VALUES (:tid, :bid, :aid, :delta, CURRENT_TIMESTAMP);\n" + "END;\n" + }; + + /* --exponential with -S case */ + static char *exponential_select_only = { + "\\set naccounts " CppAsString2(naccounts) " * :scale\n" + "\\setrandom aid 1 :naccounts exponential :exp_threshold\n" + "SELECT abalance FROM pgbench_accounts WHERE aid = :aid;\n" + }; + + /* --gaussian case */ + static char *gaussian_tpc_b = { + "\\set nbranches " CppAsString2(nbranches) " * :scale\n" + "\\set ntellers " CppAsString2(ntellers) " * :scale\n" + "\\set naccounts " CppAsString2(naccounts) " * :scale\n" + "\\setrandom aid 1 :naccounts gaussian :stdev_threshold\n" + "\\setrandom bid 1 :nbranches\n" + "\\setrandom tid 1 :ntellers\n" + "\\setrandom delta -5000 5000\n" + "BEGIN;\n" + "UPDATE pgbench_accounts SET abalance = abalance + :delta WHERE aid = :aid;\n" + "SELECT abalance FROM pgbench_accounts WHERE aid = :aid;\n" + "UPDATE pgbench_tellers SET tbalance = tbalance + :delta WHERE tid = :tid;\n" + "UPDATE pgbench_branches SET bbalance = bbalance + :delta WHERE bid = :bid;\n" + "INSERT INTO pgbench_history (tid, bid, aid, delta, mtime) VALUES (:tid, :bid, :aid, :delta, CURRENT_TIMESTAMP);\n" 
+ "END;\n" + }; + + /* --gaussian with -N case */ + static char *gaussian_simple_update = { + "\\set nbranches " CppAsString2(nbranches) " * :scale\n" + "\\set ntellers " CppAsString2(ntellers) " * :scale\n" + "\\set naccounts " CppAsString2(naccounts) " * :scale\n" + "\\setrandom aid 1 :naccounts gaussian :stdev_threshold\n" + "\\setrandom bid 1 :nbranches\n" + "\\setrandom tid 1 :ntellers\n" + "\\setrandom delta -5000 5000\n" + "BEGIN;\n" + "UPDATE pgbench_accounts SET abalance = abalance + :delta WHERE aid = :aid;\n" + "SELECT abalance FROM pgbench_accounts WHERE aid = :aid;\n" + "INSERT INTO pgbench_history (tid, bid, aid, delta, mtime) VALUES (:tid, :bid, :aid, :delta, CURRENT_TIMESTAMP);\n" + "END;\n" + }; + + /* --gaussian with -S case */ + static char *gaussian_select_only = { + "\\set naccounts " CppAsString2(naccounts) " * :scale\n" + "\\setrandom aid 1 :naccounts gaussian :stdev_threshold\n" + "SELECT abalance FROM pgbench_accounts WHERE aid = :aid;\n" + }; + /* Function prototypes */ static void setalarm(int seconds); static void *threadRun(void *arg); *************** *** 375,380 **** usage(void) --- 468,475 ---- " -v, --vacuum-all vacuum all four standard tables before tests\n" " --aggregate-interval=NUM aggregate data over NUM seconds\n" " --sampling-rate=NUM fraction of transactions to log (e.g. 0.01 for 1%%)\n" + " --exponential=NUM exponential distribution with NUM threshold parameter\n" + " --gaussian=NUM gaussian distribution with NUM threshold parameter\n" "\nCommon options:\n" " -d, --debug print debugging output\n" " -h, --host=HOSTNAME database server host or socket directory\n" *************** *** 471,476 **** getrand(TState *thread, int64 min, int64 max) --- 566,641 ---- return min + (int64) ((max - min + 1) * pg_erand48(thread->random_state)); } + /* + * random number generator: exponential distribution from min to max inclusive. 
+ * the threshold is so that the density of probability for the last cut-off max + * value is exp(-exp_threshold). + */ + static int64 + getExponentialrand(TState *thread, int64 min, int64 max, double exp_threshold) + { + double cut, uniform, rand; + assert(exp_threshold > 0.0); + cut = exp(-exp_threshold); + /* erand in [0, 1), uniform in (0, 1] */ + uniform = 1.0 - pg_erand48(thread->random_state); + /* + * inner expresion in (cut, 1] (if exp_threshold > 0), + * rand in [0, 1) + */ + assert((1.0 - cut) != 0.0); + rand = - log(cut + (1.0 - cut) * uniform) / exp_threshold; + /* return int64 random number within between min and max */ + return min + (int64)((max - min + 1) * rand); + } + + /* random number generator: gaussian distribution from min to max inclusive */ + static int64 + getGaussianrand(TState *thread, int64 min, int64 max, double stdev_threshold) + { + double stdev; + double rand; + + /* + * Get user specified random number from this loop, with + * -stdev_threshold < stdev <= stdev_threshold + * + * This loop is executed until the number is in the expected range. + * + * As the minimum threshold is 2.0, the probability of looping is low: + * sqrt(-2 ln(r)) <= 2 => r >= e^{-2} ~ 0.135, then when taking the average + * sinus multiplier as 2/pi, we have a 8.6% looping probability in the + * worst case. For a 5.0 threshold value, the looping proability + * is about e^{-5} * 2 / pi ~ 0.43%. 
+ */ + do + { + /* + * pg_erand48 generates [0,1), but for the basic version of the + * Box-Muller transform the two uniformly distributed random numbers + * are expected in (0, 1] (see http://en.wikipedia.org/wiki/Box_muller) + */ + double rand1 = 1.0 - pg_erand48(thread->random_state); + double rand2 = 1.0 - pg_erand48(thread->random_state); + + /* Box-Muller basic form transform */ + double var_sqrt = sqrt(-2.0 * log(rand1)); + stdev = var_sqrt * sin(2.0 * M_PI * rand2); + + /* + * we may try with cos, but there may be a bias induced if the previous + * value fails the test? To be on the safe side, let us try over. + */ + } + while (stdev < -stdev_threshold || stdev >= stdev_threshold); + + /* stdev is in [-threshold, threshold), normalization to [0,1) */ + rand = (stdev + stdev_threshold) / (stdev_threshold * 2.0); + + /* return int64 random number within between min and max */ + return min + (int64)((max - min + 1) * rand); + } + /* call PQexec() and exit() on failure */ static void executeStatement(PGconn *con, const char *sql) *************** *** 1319,1324 **** top: --- 1484,1490 ---- char *var; int64 min, max; + double threshold = 0; char res[64]; if (*argv[2] == ':') *************** *** 1364,1374 **** top: } /* ! * getrand() needs to be able to subtract max from min and add one ! * to the result without overflowing. Since we know max > min, we ! * can detect overflow just by checking for a negative result. But ! * we must check both that the subtraction doesn't overflow, and ! * that adding one to the result doesn't overflow either. */ if (max - min < 0 || (max - min) + 1 < 0) { --- 1530,1540 ---- } /* ! * Generate random number functions need to be able to subtract ! * max from min and add one to the result without overflowing. ! * Since we know max > min, we can detect overflow just by checking ! * for a negative result. But we must check both that the subtraction ! * doesn't overflow, and that adding one to the result doesn't overflow either. 
*/ if (max - min < 0 || (max - min) + 1 < 0) { *************** *** 1377,1386 **** top: return true; } #ifdef DEBUG ! printf("min: " INT64_FORMAT " max: " INT64_FORMAT " random: " INT64_FORMAT "\n", min, max, getrand(thread, min, max)); #endif ! snprintf(res, sizeof(res), INT64_FORMAT, getrand(thread, min, max)); if (!putVariable(st, argv[0], argv[1], res)) { --- 1543,1605 ---- return true; } + if (argc == 4) /* uniform */ + { #ifdef DEBUG ! printf("min: " INT64_FORMAT " max: " INT64_FORMAT " random: " INT64_FORMAT "\n", min, max, getrand(thread, min, max)); #endif ! snprintf(res, sizeof(res), INT64_FORMAT, getrand(thread, min, max)); ! } ! else if ((pg_strcasecmp(argv[4], "gaussian") == 0) || ! (pg_strcasecmp(argv[4], "exponential") == 0)) ! { ! if (*argv[5] == ':') ! { ! if ((var = getVariable(st, argv[5] + 1)) == NULL) ! { ! fprintf(stderr, "%s: invalid threshold number %s\n", argv[0], argv[5]); ! st->ecnt++; ! return true; ! } ! threshold = strtod(var, NULL); ! } ! else ! threshold = strtod(argv[5], NULL); ! ! if (pg_strcasecmp(argv[4], "gaussian") == 0) ! { ! if (threshold < MIN_GAUSSIAN_THRESHOLD) ! { ! fprintf(stderr, "%s: gaussian threshold must be more than %f\n,", argv[5], MIN_GAUSSIAN_THRESHOLD); ! st->ecnt++; ! return true; ! } ! #ifdef DEBUG ! printf("min: " INT64_FORMAT " max: " INT64_FORMAT " random: " INT64_FORMAT "\n", min, max, getGaussianrand(thread, min, max, threshold)); ! #endif ! snprintf(res, sizeof(res), INT64_FORMAT, getGaussianrand(thread, min, max, threshold)); ! } ! else if (pg_strcasecmp(argv[4], "exponential") == 0) ! { ! if (threshold <= 0.0) ! { ! fprintf(stderr, "%s: exponential threshold must be strictly positive\n,", argv[5]); ! st->ecnt++; ! return true; ! } ! #ifdef DEBUG ! printf("min: " INT64_FORMAT " max: " INT64_FORMAT " random: " INT64_FORMAT "\n", min, max, getExponentialrand(thread, min, max, threshold)); ! #endif ! snprintf(res, sizeof(res), INT64_FORMAT, getExponentialrand(thread, min, max, threshold)); ! } ! } ! 
else /* uniform with extra arguments */ ! { ! #ifdef DEBUG ! printf("min: " INT64_FORMAT " max: " INT64_FORMAT " random: " INT64_FORMAT "\n", min, max, getrand(thread, min, max)); ! #endif ! snprintf(res, sizeof(res), INT64_FORMAT, getrand(thread, min, max)); ! } if (!putVariable(st, argv[0], argv[1], res)) { *************** *** 1920,1928 **** process_commands(char *buf) exit(1); } ! for (j = 4; j < my_commands->argc; j++) ! fprintf(stderr, "%s: extra argument \"%s\" ignored\n", ! my_commands->argv[0], my_commands->argv[j]); } else if (pg_strcasecmp(my_commands->argv[0], "set") == 0) { --- 2139,2172 ---- exit(1); } ! if (my_commands->argc == 4 ) /* uniform */ ! { ! /* nothing to do */ ! } ! else if ((pg_strcasecmp(my_commands->argv[4], "gaussian") == 0) || ! (pg_strcasecmp(my_commands->argv[4], "exponential") == 0)) ! { ! if (my_commands->argc < 6) ! { ! fprintf(stderr, "%s(%s): missing argument\n", my_commands->argv[0], my_commands->argv[4]); ! exit(1); ! } ! ! for (j = 6; j < my_commands->argc; j++) ! fprintf(stderr, "%s(%s): extra argument \"%s\" ignored\n", ! my_commands->argv[0], my_commands->argv[4], my_commands->argv[j]); ! } ! else /* uniform with extra argument */ ! { ! int arg_pos = 4; ! ! if (pg_strcasecmp(my_commands->argv[4], "uniform") == 0) ! arg_pos++; ! ! for (j = arg_pos; j < my_commands->argc; j++) ! fprintf(stderr, "%s(uniform): extra argument \"%s\" ignored\n", ! my_commands->argv[0], my_commands->argv[j]); ! } } else if (pg_strcasecmp(my_commands->argv[0], "set") == 0) { *************** *** 2178,2183 **** process_builtin(char *tb) --- 2422,2439 ---- return my_commands; } + /* + * compute the probability of the truncated exponential random generation + * to draw values in the i-th slot of the range. 
+ */ + static double exponentialProbability(int i, int slots, double threshold) + { + assert(1 <= i && i <= slots); + return (exp(- threshold * (i - 1) / slots) - exp(- threshold * i / slots)) / + (1.0 - exp(- threshold)); + } + + /* print out results */ static void printResults(int ttype, int64 normal_xacts, int nclients, *************** *** 2197,2212 **** printResults(int ttype, int64 normal_xacts, int nclients, (INSTR_TIME_GET_DOUBLE(conn_total_time) / nthreads)); if (ttype == 0) ! s = "TPC-B (sort of)"; else if (ttype == 2) ! s = "Update only pgbench_accounts"; else if (ttype == 1) ! s = "SELECT only"; else s = "Custom query"; printf("transaction type: %s\n", s); printf("scaling factor: %d\n", scale); printf("query mode: %s\n", QUERYMODE[querymode]); printf("number of clients: %d\n", nclients); printf("number of threads: %d\n", nthreads); --- 2453,2521 ---- (INSTR_TIME_GET_DOUBLE(conn_total_time) / nthreads)); if (ttype == 0) ! { ! if (use_gaussian) ! s = "Gaussian distribution TPC-B (sort of)"; ! else if (use_exponential) ! s = "Exponential distribution TPC-B (sort of)"; ! else ! s = "TPC-B (sort of)"; ! } else if (ttype == 2) ! { ! if (use_gaussian) ! s = "Gaussian distribution update only pgbench_accounts"; ! else if (use_exponential) ! s = "Exponential distribution update only pgbench_accounts"; ! else ! s = "Update only pgbench_accounts"; ! } else if (ttype == 1) ! { ! if (use_gaussian) ! s = "Gaussian distribution SELECT only"; ! else if (use_exponential) ! s = "Exponential distribution SELECT only"; ! else ! s = "SELECT only"; ! 
} else s = "Custom query"; printf("transaction type: %s\n", s); printf("scaling factor: %d\n", scale); + + /* output in gaussian distribution benchmark */ + if (use_gaussian) + { + int i; + printf("standard deviation threshold: %.5f\n", stdev_threshold); + printf("decile percents:"); + for (i = 2; i <= 20; i = i + 2) + printf(" %.1f%%", (double) 50 * (erf (stdev_threshold * (1 - 0.1 * (i - 2)) / sqrt(2.0)) - + erf (stdev_threshold * (1 - 0.1 * i) / sqrt(2.0))) / + erf (stdev_threshold / sqrt(2.0))); + printf("\n"); + // printf("access probability of top 20%%, 10%% and 5%% records: %.5f %.5f %.5f\n", + // (double) ((erf (stdev_threshold * 0.2 / sqrt(2.0))) / (erf (stdev_threshold / sqrt(2.0)))), + // (double) ((erf (stdev_threshold * 0.1 / sqrt(2.0))) / (erf (stdev_threshold / sqrt(2.0)))), + // (double) ((erf (stdev_threshold * 0.05 / sqrt(2.0))) / (erf (stdev_threshold / sqrt(2.0))))); + } + /* output in exponential distribution benchmark */ + else if (use_exponential) + { + int i; + printf("exponential threshold: %.5f\n", exp_threshold); + printf("decile percents:"); + for (i = 1; i <= 10; i++) + printf(" %.1f%%", + 100.0 * exponentialProbability(i, 10, exp_threshold)); + printf("\n"); + printf("highest/lowest percent of the range: %.1f%% %.1f%%\n", + 100.0 * exponentialProbability(1, 100, exp_threshold), + 100.0 * exponentialProbability(100, 100, exp_threshold)); + } + printf("query mode: %s\n", QUERYMODE[querymode]); printf("number of clients: %d\n", nclients); printf("number of threads: %d\n", nthreads); *************** *** 2337,2342 **** main(int argc, char **argv) --- 2646,2653 ---- {"unlogged-tables", no_argument, &unlogged_tables, 1}, {"sampling-rate", required_argument, NULL, 4}, {"aggregate-interval", required_argument, NULL, 5}, + {"gaussian", required_argument, NULL, 6}, + {"exponential", required_argument, NULL, 7}, {"rate", required_argument, NULL, 'R'}, {NULL, 0, NULL, 0} }; *************** *** 2617,2622 **** main(int argc, char **argv) --- 
2928,2952 ---- } #endif break; + case 6: + use_gaussian = true; + stdev_threshold = atof(optarg); + if(stdev_threshold < MIN_GAUSSIAN_THRESHOLD) + { + fprintf(stderr, "--gaussian=NUM must be more than %f: %f\n", + MIN_GAUSSIAN_THRESHOLD, stdev_threshold); + exit(1); + } + break; + case 7: + use_exponential = true; + exp_threshold = atof(optarg); + if(exp_threshold <= 0.0) + { + fprintf(stderr, "--exponential=NUM must be more 0.0\n"); + exit(1); + } + break; default: fprintf(stderr, _("Try \"%s --help\" for more information.\n"), progname); exit(1); *************** *** 2814,2819 **** main(int argc, char **argv) --- 3144,3171 ---- } } + /* set :stdev_threshold variable */ + if(getVariable(&state[0], "stdev_threshold") == NULL) + { + snprintf(val, sizeof(val), "%lf", stdev_threshold); + for (i = 0; i < nclients; i++) + { + if (!putVariable(&state[i], "startup", "stdev_threshold", val)) + exit(1); + } + } + + /* set :exp_threshold variable */ + if(getVariable(&state[0], "exp_threshold") == NULL) + { + snprintf(val, sizeof(val), "%lf", exp_threshold); + for (i = 0; i < nclients; i++) + { + if (!putVariable(&state[i], "startup", "exp_threshold", val)) + exit(1); + } + } + if (!is_no_vacuum) { fprintf(stderr, "starting vacuum..."); *************** *** 2839,2855 **** main(int argc, char **argv) switch (ttype) { case 0: ! sql_files[0] = process_builtin(tpc_b); num_files = 1; break; case 1: ! sql_files[0] = process_builtin(select_only); num_files = 1; break; case 2: ! sql_files[0] = process_builtin(simple_update); num_files = 1; break; --- 3191,3222 ---- switch (ttype) { case 0: ! if (use_gaussian) ! sql_files[0] = process_builtin(gaussian_tpc_b); ! else if (use_exponential) ! sql_files[0] = process_builtin(exponential_tpc_b); ! else ! sql_files[0] = process_builtin(tpc_b); num_files = 1; break; case 1: ! if (use_gaussian) ! sql_files[0] = process_builtin(gaussian_select_only); ! else if (use_exponential) ! sql_files[0] = process_builtin(exponential_select_only); ! else ! 
sql_files[0] = process_builtin(select_only); num_files = 1; break; case 2: ! if (use_gaussian) ! sql_files[0] = process_builtin(gaussian_simple_update); ! else if (use_exponential) ! sql_files[0] = process_builtin(exponential_simple_update); ! else ! sql_files[0] = process_builtin(simple_update); num_files = 1; break; *** a/doc/src/sgml/pgbench.sgml --- b/doc/src/sgml/pgbench.sgml *************** *** 307,312 **** pgbench <optional> <replaceable>options</> </optional> <replaceable>dbname</> --- 307,327 ---- </varlistentry> <varlistentry> + <term><option>--exponential</option><replaceable>threshold</></term> + <listitem> + <para> + Run exponential distribution pgbench test using this threshold parameter. + The threshold controls the distribution of access frequency on the + <structname>pgbench_accounts</> table. + See the <literal>\setrandom</> documentation below for details about + the impact of the threshold value. + When set, this option applies to all test variants (<option>-N</> for + skipping updates, or <option>-S</> for selects). + </para> + </listitem> + </varlistentry> + + <varlistentry> <term><option>-f</option> <replaceable>filename</></term> <term><option>--file=</option><replaceable>filename</></term> <listitem> *************** *** 320,325 **** pgbench <optional> <replaceable>options</> </optional> <replaceable>dbname</> --- 335,355 ---- </varlistentry> <varlistentry> + <term><option>--gaussian</option><replaceable>threshold</></term> + <listitem> + <para> + Run gaussian distribution pgbench test using this threshold parameter. + The threshold controls the distribution of access frequency on the + <structname>pgbench_accounts</> table. + See the <literal>\setrandom</> documentation below for details about + the impact of the threshold value. + When set, this option applies to all test variants (<option>-N</> for + skipping updates, or <option>-S</> for selects). 
+ </para> + </listitem> + </varlistentry> + + <varlistentry> <term><option>-j</option> <replaceable>threads</></term> <term><option>--jobs=</option><replaceable>threads</></term> <listitem> *************** *** 748,755 **** pgbench <optional> <replaceable>options</> </optional> <replaceable>dbname</> <varlistentry> <term> ! <literal>\setrandom <replaceable>varname</> <replaceable>min</> <replaceable>max</></literal> ! </term> <listitem> <para> --- 778,785 ---- <varlistentry> <term> ! <literal>\setrandom <replaceable>varname</> <replaceable>min</> <replaceable>max</> [ uniform | [ { gaussian | exponential } <replaceable>threshold</> ] ]</literal> ! </term> <listitem> <para> *************** *** 761,769 **** pgbench <optional> <replaceable>options</> </optional> <replaceable>dbname</> </para> <para> Example: <programlisting> ! \setrandom aid 1 :naccounts </programlisting></para> </listitem> </varlistentry> --- 791,834 ---- </para> <para> + The default random distribution is uniform. The gaussian and exponential + options allow to change the distribution. The mandatory + <replaceable>threshold</> double value controls the actual distribution. + </para> + + <para> + With the gaussian option, the larger the <replaceable>threshold</>, + the more frequently values close to the middle of the interval are drawn, + and the less frequently values close to the <replaceable>min</> and + <replaceable>max</> bounds. + In other worlds, the larger the <replaceable>threshold</>, + the narrower the access range around the middle. + the smaller the threshold, the smoother the access pattern + distribution. The minimum threshold is 2.0 for performance. + </para> + + <para> + With the exponential option, the <replaceable>threshold</> parameter + controls the distribution by truncating an exponential distribution at + a specific value, and then projecting onto integers between the bounds. 
+ To be precise, the <replaceable>threshold</> is so that the density of + probability of the exponential distribution at the <replaceable>max</> + cut-off value is exp(-threshold), the density at the <replaceable>min</> + value being 1. + Intuitively, the larger the threshold, the more frequently values close to + <replaceable>min</> are accessed, and the less frequently values close to + <replaceable>max</> are accessed. + A crude approximation of the distribution is that the most frequent 1% + values are drawn <replaceable>threshold</>% of the time. + The closer to 0.0 the threshold, the flatter (more uniform) the access + distribution. + The threshold value must be strictly positive with the exponential option. + </para> + + <para> Example: <programlisting> ! \setrandom aid 1 :naccounts gaussian 5.0 </programlisting></para> </listitem> </varlistentry>
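As a reading aid for the exponential part of the patch, getExponentialrand() and exponentialProbability() can be seen as inverse-transform sampling of a truncated exponential. This is a sketch under the assumption that x in [0, 1) is the normalized position between min and max, t is the threshold and u is uniform in (0, 1]; these symbols are introduced here and do not appear in the patch.

```latex
% density and CDF of the exponential truncated to [0, 1)
f(x) = \frac{t\,e^{-t x}}{1 - e^{-t}}, \qquad
F(x) = \frac{1 - e^{-t x}}{1 - e^{-t}}

% inverse transform; matches  rand = -log(cut + (1 - cut) * uniform) / t
% with cut = e^{-t}
x = -\frac{1}{t}\,\ln\!\bigl(e^{-t} + (1 - e^{-t})\,u\bigr)

% probability of the i-th of n equal-width slots
P_i = F\!\left(\tfrac{i}{n}\right) - F\!\left(\tfrac{i-1}{n}\right)
    = \frac{e^{-t (i-1)/n} - e^{-t\,i/n}}{1 - e^{-t}}

% first 1% slot (n = 100, i = 1)
P_1 = \frac{1 - e^{-t/100}}{1 - e^{-t}} \approx \frac{t}{100}
```

The last approximation holds when t/100 is small and e^{-t} is negligible. It corresponds to the remark in the documentation that the most frequent 1% of values are drawn about threshold% of the time, and for t = 5 it gives the 4.9% shown in the "highest/lowest percent of the range" line.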
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)