Re: [HACKERS] pgbench stats per script & other stuff
Indeed. The documentation is manually edited when submitting changes so as to minimize diffs, but then it does not correspond anymore to any actual output, so it is easy to do it wrong. Well, you fixed the "latency stddev" line to the sample output too, but in my trial run that line was not displayed, only the latency average. What are the command line args that supposedly produced this output? Maybe we should add it as a SGML comment, or even display it to the doc's reader. Good point. The test above shows the stats if there was -P , -L & --rate, because under these conditions the necessary data was collected, so they can be computed. Thus the output in the documentation assumes that one of these was used. I nearly always use "-P 1". Note that the documentation is not really precise, "will look similar to", so there is no commitment. If you feel like removing the stddev line from the doc because it is not there with usual options, fine with me. -- Fabien. -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] pgbench stats per script & other stuff
Fabien COELHO wrote: > > Hello, > > >>In doing this, I noticed that the latency output is wrong if you use -T > >>instead of -t; it always says the latency is zero because "duration" is > >>zero. I suppose it should be like in the attached instead. > > Indeed, I clearly overlooked option -t (transactions) which I never use. Makes sense. > >Patch actually attached here. > > Tested. There is a small issue because the \n is missing. > > Here is another version which just replaces duration by time_include, > as they should be pretty close, and fixes the style so that it is the same > whether the detailed stats are collected or not, as you pointed out. Thanks, that makes sense. > >>At the same time, it says "latency average: XYZ" instead of "latency > >>average = XYZ" as in printSimpleStats, which doesn't look terribly > >>important. But the line appears in the SGML docs. > > Indeed. The documentation is manually edited when submitting changes so as > to minimize diffs, but then it does not correspond anymore to any actual > output, so it is easy to do it wrong. Well, you fixed the "latency stddev" line to the sample output too, but in my trial run that line was not displayed, only the latency average. What are the command line args that supposedly produced this output? Maybe we should add it as a SGML comment, or even display it to the doc's reader. -- Álvaro Herrerahttp://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] pgbench stats per script & other stuff
Hello, In doing this, I noticed that the latency output is wrong if you use -T instead of -t; it always says the latency is zero because "duration" is zero. I suppose it should be like in the attached instead. Indeed, I clearly overlooked option -t (transactions) which I never use. Patch actually attached here. Tested. There is a small issue because the \n is missing. Here is another version which just replaces duration by time_include, as they should be pretty close, and fixes the style so that it is the same whether the detailed stats are collected or not, as you pointed out. At the same time, it says "latency average: XYZ" instead of "latency average = XYZ" as in printSimpleStats, which doesn't look terribly important. But the line appears in the SGML docs. Indeed. The documentation is manually edited when submitting changes so as to minimize diffs, but then it does not correspond anymore to any actual output, so it is easy to do it wrong. -- Fabien.diff --git a/doc/src/sgml/ref/pgbench.sgml b/doc/src/sgml/ref/pgbench.sgml index 06cd5db..297af4f 100644 --- a/doc/src/sgml/ref/pgbench.sgml +++ b/doc/src/sgml/ref/pgbench.sgml @@ -1253,8 +1253,8 @@ number of clients: 10 number of threads: 1 number of transactions per client: 1000 number of transactions actually processed: 1/1 -latency average = 15.844 ms -latency stddev = 2.715 ms +latency average: 15.844 ms +latency stddev: 2.715 ms tps = 618.764555 (including connections establishing) tps = 622.977698 (excluding connections establishing) script statistics: diff --git a/src/bin/pgbench/pgbench.c b/src/bin/pgbench/pgbench.c index 52d1223..d5380d2 100644 --- a/src/bin/pgbench/pgbench.c +++ b/src/bin/pgbench/pgbench.c @@ -3099,8 +3099,8 @@ printSimpleStats(char *prefix, SimpleStats *ss) double latency = ss->sum / ss->count; double stddev = sqrt(ss->sum2 / ss->count - latency * latency); - printf("%s average = %.3f ms\n", prefix, 0.001 * latency); - printf("%s stddev = %.3f ms\n", prefix, 0.001 * stddev); + printf("%s average: %.3f ms\n", prefix, 0.001 * latency); + printf("%s stddev: %.3f ms\n", prefix, 0.001 * stddev); } /* print out results */ @@ -3155,7 +3155,7 @@ printResults(TState *threads, StatsData *total, instr_time total_time, else /* only an average latency computed from the duration is available */ printf("latency average: %.3f ms\n", - 1000.0 * duration * nclients / total->cnt); + 1000.0 * time_include * nclients / total->cnt); if (throttle_delay) { -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] pgbench stats per script & other stuff
Alvaro Herrera wrote: > In doing this, I noticed that the latency output is wrong if you use -T > instead of -t; it always says the latency is zero because "duration" is > zero. I suppose it should be like in the attached instead. At the same > time, it says "latency average: XYZ" instead of "latency average = XYZ" > as in printSimpleStats, which doesn't look terribly important. But the > line appears in the SGML docs. Patch actually attached here. -- Álvaro Herrerahttp://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services diff --git a/src/bin/pgbench/pgbench.c b/src/bin/pgbench/pgbench.c index 52d1223..5cb5906 100644 --- a/src/bin/pgbench/pgbench.c +++ b/src/bin/pgbench/pgbench.c @@ -3152,10 +3152,13 @@ printResults(TState *threads, StatsData *total, instr_time total_time, if (throttle_delay || progress || latency_limit) printSimpleStats("latency", >latency); - else + else if (duration > 0) /* only an average latency computed from the duration is available */ - printf("latency average: %.3f ms\n", + printf("latency average: %.3f ms", 1000.0 * duration * nclients / total->cnt); + else + printf("latency average: %.3f ms", + 1000.0 * time_include * nclients / total->cnt); if (throttle_delay) { -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] pgbench stats per script & other stuff
Fabien COELHO wrote: > > >- that it does work:-) I'm not sure what happens by the script selection > > process, it should be checked carefully because it was not designed > > with allowing a zero weight, and it may depend on its/their positions. > > It may already work, but it really needs checking. > > Hmmm, it seems ok. It's not -- if you used -i, it died saying weight is zero. > >- I would suggest that a warning is shown when a weight is zero, > > something like "warning, script #%d weight is zero, will be ignored". > > includes such a warning. I didn't include this part. Pushed. In doing this, I noticed that the latency output is wrong if you use -T instead of -t; it always says the latency is zero because "duration" is zero. I suppose it should be like in the attached instead. At the same time, it says "latency average: XYZ" instead of "latency average = XYZ" as in printSimpleStats, which doesn't look terribly important. But the line appears in the SGML docs. -- Álvaro Herrerahttp://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] pgbench stats per script & other stuff
- that it does work:-) I'm not sure what happens by the script selection process, it should be checked carefully because it was not designed with allowing a zero weight, and it may depend on its/their positions. It may already work, but it really needs checking. Hmmm, it seems ok. Attached is an updated patch, which: - I would suggest that a warning is shown when a weight is zero, something like "warning, script #%d weight is zero, will be ignored". includes such a warning. - the documentation should be updated:-) adds a line about 0 weight in the documentation. -- Fabien.diff --git a/doc/src/sgml/ref/pgbench.sgml b/doc/src/sgml/ref/pgbench.sgml index c6d1454..e31d5ee 100644 --- a/doc/src/sgml/ref/pgbench.sgml +++ b/doc/src/sgml/ref/pgbench.sgml @@ -698,6 +698,7 @@ pgbench options dbname Each script may be given a relative weight specified after a @ so as to change its drawing probability. The default weight is 1. + Weight 0 scripts are ignored. diff --git a/src/bin/pgbench/pgbench.c b/src/bin/pgbench/pgbench.c index 4196b0e..0c1a0ee 100644 --- a/src/bin/pgbench/pgbench.c +++ b/src/bin/pgbench/pgbench.c @@ -2953,10 +2953,10 @@ parseScriptWeight(const char *option, char **script) fprintf(stderr, "invalid weight specification: %s\n", sep); exit(1); } - if (wtmp > INT_MAX || wtmp <= 0) + if (wtmp > INT_MAX || wtmp < 0) { fprintf(stderr, - "weight specification out of range (1 .. %u): " INT64_FORMAT "\n", + "weight specification out of range (0 .. %u): " INT64_FORMAT "\n", INT_MAX, (int64) wtmp); exit(1); } @@ -2987,6 +2987,13 @@ addScript(ParsedScript script) exit(1); } + if (script.weight == 0) + { + fprintf(stderr, +"warning, script \"%s\" (%d) weight is zero, will be ignored\n", +script.desc, num_scripts); + } + sql_script[num_scripts] = script; num_scripts++; } @@ -3527,6 +3534,12 @@ main(int argc, char **argv) /* cannot overflow: weight is 32b, total_weight 64b */ total_weight += sql_script[i].weight; + if (total_weight == 0) + { + fprintf(stderr, "total weight for scripts (-b and -f options) cannot be zero\n"); + exit(1); + } + /* show per script stats if several scripts are used */ if (num_scripts > 1) per_script_stats = true; -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] pgbench stats per script & other stuff
Hello Jeff, So I wanted to do something like: for f in `seq 0 5 100`; do pgbench -T 180 -c8 -j8 -b tpcb-like@$f -b select-only@100 done; But, I'm not allowed to specify a weight of zero. Indeed. I did not envision such a use case, but it is quite legitimate and interesting! I would hope that the behavior would be a linear combination of the raw performance of each script, but whether it is indeed the case is not that sure. Would this be a welcome change? Speaking for myself, I would be fine with such a change, provided: - that it does work:-) I'm not sure what happens by the script selection process, it should be checked carefully because it was not designed with allowing a zero weight, and it may depend on its/their positions. It may already work, but it really needs checking. - I would suggest that a warning is shown when a weight is zero, something like "warning, script #%d weight is zero, will be ignored". - the documentation should be updated:-) -- Fabien -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] pgbench stats per script & other stuff
On Fri, Jul 17, 2015 at 6:50 AM, Fabienwrote: > > This patch adds per-script statistics & other improvements to pgbench > > Rationale: Josh asked for the per-script stats:-) > > Some restructuring is done so that all stats (-l --aggregate-interval > --progress --per-script-stats, latency & lag...) share the same structures > and functions to accumulate data. This limits a lot the growth of pgbench > from this patch (+17 lines). > > In passing, remove the distinction between internal and external scripts. > Pgbench just execute scripts, some of them may be internal... > > As a side effect, all scripts can be accumulated "pgbench -B -N -S -f ..." > would execute 4 scripts, 3 of which internal (tpc-b, simple-update, > select-only and another externally supplied one). > > Also add a weight option to change the probability of choosing some scripts > when several are available. I was eager to use this to do some performance testing on a series of workloads gradually transitioning from write-heavy to read-only. So I wanted to do something like: for f in `seq 0 5 100`; do pgbench -T 180 -c8 -j8 -b tpcb-like@$f -b select-only@100 done; But, I'm not allowed to specify a weight of zero. That means I have to special-case the first iteration of the "for" loop where $f is zero. I think it would be more convenient if I was allowed to specify a zero weight, and the script would just ignore that script. All I had to do to make this work is remove the check that prevents from setting the weight to zero. But then I would need to add in a check that the sum of all weights is not zero, which I have done here. We could get more complex by not adding a zero-weight script into the array of scripts at all, rather than adding it in a way where it can never be selected. But then that would complicate the parsing of the per-script stats report, when one of the scripts was no longer reported. I like this way better. Would this be a welcome change? Cheers, Jeff pgbench_zero_weight.patch Description: Binary data -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] pgbench stats per script & other stuff
On Sat, Mar 19, 2016 at 11:34 AM, Alvaro Herrerawrote: > Jeff Janes wrote: >> On Sat, Mar 19, 2016 at 8:41 AM, Alvaro Herrera >> wrote: >> > I pushed your 25, with some additional minor tweaks. I hope I didn't >> > break anything; please test. >> >> I'm now getting compiler warnings: >> >> gcc version 4.4.7 20120313 (Red Hat 4.4.7-16) (GCC) >> >> >> pgbench.c: In function 'process_builtin': >> pgbench.c:2765: warning: 'ps.stats.lag.sum2' is used uninitialized in >> this function > > Fair complaints. I suppose the following should fix them? Yes, that fixes them. Thanks, Jeff -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] pgbench stats per script & other stuff
Jeff Janes wrote: > On Sat, Mar 19, 2016 at 8:41 AM, Alvaro Herrera >wrote: > > I pushed your 25, with some additional minor tweaks. I hope I didn't > > break anything; please test. > > I'm now getting compiler warnings: > > gcc version 4.4.7 20120313 (Red Hat 4.4.7-16) (GCC) > > > pgbench.c: In function 'process_builtin': > pgbench.c:2765: warning: 'ps.stats.lag.sum2' is used uninitialized in > this function Fair complaints. I suppose the following should fix them? -- Álvaro Herrerahttp://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services diff --git a/src/bin/pgbench/pgbench.c b/src/bin/pgbench/pgbench.c index ecabff0..4606fb0 100644 --- a/src/bin/pgbench/pgbench.c +++ b/src/bin/pgbench/pgbench.c @@ -2649,11 +2649,23 @@ read_line_from_file(FILE *fd) } /* + * Initialize a ParsedScript + */ +static void +initParsedScript(ParsedScript *ps, const char *desc, int alloc_num, int weight) +{ + ps->commands = (Command **) pg_malloc(sizeof(Command *) * alloc_num); + ps->desc = desc; + ps->weight = weight; + initStats(>stats, 0.0); +} + +/* * Given a file name, read it and return its ParsedScript representation. "-" * means to read stdin. */ static ParsedScript -process_file(char *filename) +process_file(char *filename, int weight) { #define COMMANDS_ALLOC_NUM 128 ParsedScript ps; @@ -2673,8 +2685,7 @@ process_file(char *filename) } alloc_num = COMMANDS_ALLOC_NUM; - ps.commands = (Command **) pg_malloc(sizeof(Command *) * alloc_num); - ps.desc = filename; + initParsedScript(, filename, alloc_num, weight); lineno = 0; index = 0; @@ -2710,7 +2721,7 @@ process_file(char *filename) /* Parse the given builtin script and return the parsed representation */ static ParsedScript -process_builtin(BuiltinScript *bi) +process_builtin(BuiltinScript *bi, int weight) { int lineno, index; @@ -2720,8 +2731,7 @@ process_builtin(BuiltinScript *bi) ParsedScript ps; alloc_num = COMMANDS_ALLOC_NUM; - ps.desc = bi->desc; - ps.commands = (Command **) pg_malloc(sizeof(Command *) * alloc_num); + initParsedScript(, bi->desc, alloc_num, weight); lineno = 0; index = 0; @@ -2860,7 +2870,7 @@ parseScriptWeight(const char *option, char **script) /* append a script to the list of scripts to process */ static void -addScript(ParsedScript script, int weight) +addScript(ParsedScript script) { if (script.commands == NULL || script.commands[0] == NULL) { @@ -2875,8 +2885,6 @@ addScript(ParsedScript script, int weight) } sql_script[num_scripts] = script; - sql_script[num_scripts].weight = weight; - initStats(_script[num_scripts].stats, 0.0); num_scripts++; } @@ -3251,24 +3259,24 @@ main(int argc, char **argv) } weight = parseScriptWeight(optarg, ); -addScript(process_builtin(findBuiltin(script)), weight); +addScript(process_builtin(findBuiltin(script), weight)); benchmarking_option_set = true; internal_script_used = true; break; case 'S': -addScript(process_builtin(findBuiltin("select-only")), 1); +addScript(process_builtin(findBuiltin("select-only"), 1)); benchmarking_option_set = true; internal_script_used = true; break; case 'N': -addScript(process_builtin(findBuiltin("simple-update")), 1); +addScript(process_builtin(findBuiltin("simple-update"), 1)); benchmarking_option_set = true; internal_script_used = true; break; case 'f': weight = parseScriptWeight(optarg, ); -addScript(process_file(script), weight); +addScript(process_file(script, weight)); benchmarking_option_set = true; break; case 'D': @@ -3406,7 +3414,7 @@ main(int argc, char **argv) /* set default script if none */ if (num_scripts == 0 && !is_init_mode) { - addScript(process_builtin(findBuiltin("tpcb-like")), 1); + addScript(process_builtin(findBuiltin("tpcb-like"), 1)); benchmarking_option_set = true; internal_script_used = true; } -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] pgbench stats per script & other stuff
Hello Álvaro, I pushed your 25, with some additional minor tweaks. I hope I didn't break anything; please test. I've made a few tests and all looks well. I guess the build farm will say if it does not like it. Thanks, -- Fabien. -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] pgbench stats per script & other stuff
I pushed your 25, with some additional minor tweaks. I hope I didn't break anything; please test. Fabien COELHO wrote: > >I don't "prefer" memory leaks -- I prefer interfaces that make sense. > > C is not designed to return two things, and if it is what is needed it looks > awkward whatever is done. The static variable trick is dirty, but it is the > minimal fuss solution, IMO. So we are only trading awkward code against > awkward code. That's true. > I have very little time available, so I'm trying to minimize the effort. > I've tried "argue my point with committers", but it has proven very > ineffective. I've switched to "do whatever is asked if it still works", but > it is not very effective either. I understand. Sometimes arguing is better, if you can convince the other person, but sometimes the other person disagrees with you or they are just not listening. I don't have any useful advice on what to do, but frequently resigning to do a stupid thing because somebody suggested it leads to bad decisions. > >In fact, with ParsedScript I don't think we need to give a name to the > >anon struct used for builtin scripts. > > It is useful that it has a name so that find_builtin can return it. So it is. I have kept it, but I used the name BuiltinScript rather than script_t. > Version v25 results a script which is then passed as an argument, so it > avoid the dynamic allocation & later free. Maybe it is better. I had to cut > short the error handling if a file does not exists, though, and it passes a > struct by value. Passing structs by value should work fine, and I don't care much about the case that a file doesn't exist. > >No need for N_BUILTIN; we can use lengthof(builtin_script) instead. > > Indeed. "lengthof" does not seem to be standard... ok, it is a macro in some > header file. I really wanted to avoid an ugly sizeof divide hack, but as it > is hidden elsewhere this is fine. Right. -- Álvaro Herrerahttp://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] pgbench stats per script & other stuff
Hello Alvaro, If somebody specifies thousands of -f switches, they will waste a few bytes with each, but I'm hardly concerned about a few dozen kilobytes there ... Ok, so you prefer a memory leak. I hate it on principle. I don't "prefer" memory leaks -- I prefer interfaces that make sense. C is not designed to return two things, and if it is what is needed it looks awkward whatever is done. The static variable trick is dirty, but it is the minimal fuss solution, IMO. So we are only trading awkward code against awkward code. Speaking of which, I don't think the arrangement in your patch really does. I know I suggested it, Yep:-) but now that I look again, it turns out I chose badly and you implemented a bad idea, so can we go back and fix it, please? Yep. I have very little time available, so I'm trying to minimize the effort. I've tried "argue my point with committers", but it has proven very ineffective. I've switched to "do whatever is asked if it still works", but it is not very effective either. What I now think should really happen is that the current sql_scripts array, currently under an anonymous struct, should be a typedef, say ParsedScript, Why not. and get a new member for the weight; Hm... it already contains "weight". process_file and process_builtin return a ParsedScript. The weight and Command ** should not be part of script_t at all. Sure. In fact, with ParsedScript I don't think we need to give a name to the anon struct used for builtin scripts. It is useful that it has a name so that find_builtin can return it. Rename the current sql_scripts.name to "desc", to mirror what is actually put in there from the builtin array struct. Make addScript receive a ParsedScript and weight, fill in the weight into the struct, and put it to the array after sanity-checking. (I'm OK with keeping "name" instead of renaming to "desc", if that change becomes too invasive.) See attached a v24 & v25. The awkwardness in v24 is that functions allocate a struct which is freed afterwards, really just to return two data. Whether it is better or worst than a static is really a matter of taste. Version v25 results a script which is then passed as an argument, so it avoid the dynamic allocation & later free. Maybe it is better. I had to cut short the error handling if a file does not exists, though, and it passes a struct by value. Feel free to pick whichever you like most. No need for N_BUILTIN; we can use lengthof(builtin_script) instead. Indeed. "lengthof" does not seem to be standard... ok, it is a macro in some header file. I really wanted to avoid an ugly sizeof divide hack, but as it is hidden elsewhere this is fine. -- Fabien.diff --git a/doc/src/sgml/ref/pgbench.sgml b/doc/src/sgml/ref/pgbench.sgml index cc80b3f..dd3fb1d 100644 --- a/doc/src/sgml/ref/pgbench.sgml +++ b/doc/src/sgml/ref/pgbench.sgml @@ -262,11 +262,13 @@ pgbench options dbname - -b scriptname - --builtin scriptname + -b scriptname[@weight] + --builtin=scriptname[@weight] Add the specified builtin script to the list of executed scripts. +An optional integer weight after @ allows to adjust the +probability of drawing the script. If not specified, it is set to 1. Available builtin scripts are: tpcb-like, simple-update and select-only. Unambiguous prefixes of builtin names are accepted. @@ -322,12 +324,14 @@ pgbench options dbname - -f filename - --file=filename + -f filename[@weight] + --file=filename[@weight] Add a transaction script read from filename to the list of executed scripts. +An optional integer weight after @ allows to adjust the +probability of drawing the test. See below for details. @@ -687,9 +691,13 @@ pgbench options dbname What is the Transaction Actually Performed in pgbench? - Pgbench executes test scripts chosen randomly from a specified list. + pgbench executes test scripts chosen randomly + from a specified list. They include built-in scripts with -b and user-provided custom scripts with -f. + Each script may be given a relative weight specified after a + @ so as to change its drawing probability. + The default weight is 1. @@ -1194,12 +1202,11 @@ number of clients: 10 number of threads: 1 number of transactions per client: 1000 number of transactions actually processed: 1/1 +latency average = 15.844 ms +latency stddev = 2.715 ms tps = 618.764555 (including connections establishing) tps = 622.977698 (excluding connections establishing) -SQL script 1: builtin: TPC-B (sort of) - - 1 transactions (100.0% of total, tps = 618.764555) - - latency average = 15.844 ms - - latency stddev = 2.715 ms +script statistics: - statement latencies in milliseconds: 0.004386\set
Re: [HACKERS] pgbench stats per script & other stuff
Fabien COELHO wrote: > >If somebody specifies thousands of -f switches, they will waste a few > >bytes with each, but I'm hardly concerned about a few dozen kilobytes > >there ... > > Ok, so you prefer a memory leak. I hate it on principle. I don't "prefer" memory leaks -- I prefer interfaces that make sense. Speaking of which, I don't think the arrangement in your patch really does. I know I suggested it, but now that I look again, it turns out I chose badly and you implemented a bad idea, so can we go back and fix it, please? What I now think should really happen is that the current sql_scripts array, currently under an anonymous struct, should be a typedef, say ParsedScript, and get a new member for the weight; process_file and process_builtin return a ParsedScript. The weight and Command ** should not be part of script_t at all. In fact, with ParsedScript I don't think we need to give a name to the anon struct used for builtin scripts. Rename the current sql_scripts.name to "desc", to mirror what is actually put in there from the builtin array struct. Make addScript receive a ParsedScript and weight, fill in the weight into the struct, and put it to the array after sanity-checking. (I'm OK with keeping "name" instead of renaming to "desc", if that change becomes too invasive.) No need for N_BUILTIN; we can use lengthof(builtin_script) instead. -- Álvaro Herrerahttp://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] pgbench stats per script & other stuff
On 3/4/16 1:53 PM, Fabien COELHO wrote: >>> That is why the "fs" variable in process_file is declared "static", >>> and why >>> I wrote "some hidden awkwarness". >>> >>> I did want to avoid a malloc because then who would free the struct? >>> addScript cannot to it systematically because builtins are static. Or it >>> would have to create an on purpose struct, but I then that would be more >>> awkwarness, and malloc/free to pass arguments between functions is not >>> efficient nor very elegant. >>> >>> So the "static" option looked like the simplest & most elegant version. >> >> Surely that trick breaks if you have more than one -f switch, no? Oh, I >> see what you're doing: you only use the command list, which is >> allocated, so it doesn't matter that the rest of the struct changes >> later. > > The two fields that matter (desc and commands) are really copied into > sql_scripts, so what stays in the is overriden if used another time. > >> I'm not concerned about freeing the struct; what's the problem with it >> surviving until the program terminates? > > It is not referenced anywhere so it is a memory leak. > >> If somebody specifies thousands of -f switches, they will waste a few >> bytes with each, but I'm hardly concerned about a few dozen kilobytes >> there ... > > Ok, so you prefer a memory leak. I hate it on principle. > > Here is a v23 with a memory leak anyway. Álvaro, it looks like you've been both reviewer and committer on this work for some time. The latest patch seems to address you final concern. Can I mark it "ready for committer"? -- -David da...@pgmasters.net -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] pgbench stats per script & other stuff
That is why the "fs" variable in process_file is declared "static", and why I wrote "some hidden awkwarness". I did want to avoid a malloc because then who would free the struct? addScript cannot to it systematically because builtins are static. Or it would have to create an on purpose struct, but I then that would be more awkwarness, and malloc/free to pass arguments between functions is not efficient nor very elegant. So the "static" option looked like the simplest & most elegant version. Surely that trick breaks if you have more than one -f switch, no? Oh, I see what you're doing: you only use the command list, which is allocated, so it doesn't matter that the rest of the struct changes later. The two fields that matter (desc and commands) are really copied into sql_scripts, so what stays in the is overriden if used another time. I'm not concerned about freeing the struct; what's the problem with it surviving until the program terminates? It is not referenced anywhere so it is a memory leak. If somebody specifies thousands of -f switches, they will waste a few bytes with each, but I'm hardly concerned about a few dozen kilobytes there ... Ok, so you prefer a memory leak. I hate it on principle. Here is a v23 with a memory leak anyway. -- Fabien.diff --git a/doc/src/sgml/ref/pgbench.sgml b/doc/src/sgml/ref/pgbench.sgml index cc80b3f..dd3fb1d 100644 --- a/doc/src/sgml/ref/pgbench.sgml +++ b/doc/src/sgml/ref/pgbench.sgml @@ -262,11 +262,13 @@ pgbench options dbname - -b scriptname - --builtin scriptname + -b scriptname[@weight] + --builtin=scriptname[@weight] Add the specified builtin script to the list of executed scripts. +An optional integer weight after @ allows to adjust the +probability of drawing the script. If not specified, it is set to 1. Available builtin scripts are: tpcb-like, simple-update and select-only. Unambiguous prefixes of builtin names are accepted. @@ -322,12 +324,14 @@ pgbench options dbname - -f filename - --file=filename + -f filename[@weight] + --file=filename[@weight] Add a transaction script read from filename to the list of executed scripts. +An optional integer weight after @ allows to adjust the +probability of drawing the test. See below for details. @@ -687,9 +691,13 @@ pgbench options dbname What is the Transaction Actually Performed in pgbench? - Pgbench executes test scripts chosen randomly from a specified list. + pgbench executes test scripts chosen randomly + from a specified list. They include built-in scripts with -b and user-provided custom scripts with -f. + Each script may be given a relative weight specified after a + @ so as to change its drawing probability. + The default weight is 1. @@ -1194,12 +1202,11 @@ number of clients: 10 number of threads: 1 number of transactions per client: 1000 number of transactions actually processed: 1/1 +latency average = 15.844 ms +latency stddev = 2.715 ms tps = 618.764555 (including connections establishing) tps = 622.977698 (excluding connections establishing) -SQL script 1: builtin: TPC-B (sort of) - - 1 transactions (100.0% of total, tps = 618.764555) - - latency average = 15.844 ms - - latency stddev = 2.715 ms +script statistics: - statement latencies in milliseconds: 0.004386\set nbranches 1 * :scale 0.001343\set ntellers 10 * :scale diff --git a/src/bin/pgbench/pgbench.c b/src/bin/pgbench/pgbench.c index 8b0b17a..5363648 100644 --- a/src/bin/pgbench/pgbench.c +++ b/src/bin/pgbench/pgbench.c @@ -38,6 +38,7 @@ #include "portability/instr_time.h" #include +#include #include #include #include @@ -179,6 +180,8 @@ char *login = NULL; char *dbName; const char *progname; +#define WSEP '@'/* weight separator */ + volatile bool timer_exceeded = false; /* flag from signal handler */ /* variable definitions */ @@ -300,23 +303,27 @@ typedef struct static struct { const char *name; + int weight; Command **commands; StatsData stats; } sql_script[MAX_SCRIPTS]; /* SQL script files */ static int num_scripts; /* number of scripts in sql_script[] */ static int num_commands = 0; /* total number of Command structs */ +static int64 total_weight = 0; + static int debug = 0; /* debug flag */ /* Define builtin test scripts */ -#define N_BUILTIN 3 -static struct +typedef struct script_t { char *name; /* very short name for -b ... */ char *desc; /* short description */ - char *commands; /* actual pgbench script */ -} + char *script; /* actual pgbench script */ + Command **commands; /* temporary intermediate holder */ +} script_t; - builtin_script[] = +#define N_BUILTIN 3 +static script_t builtin_script[] = { { "tpcb-like",
Re: [HACKERS] pgbench stats per script & other stuff
Fabien COELHO wrote: > >However, this is still a bit broken -- you cannot return a stack > >variable from process_file, because the stack goes away once the > >function returns. You need to malloc it. > > That is why the "fs" variable in process_file is declared "static", and why > I wrote "some hidden awkwarness". > > I did want to avoid a malloc because then who would free the struct? > addScript cannot to it systematically because builtins are static. Or it > would have to create an on purpose struct, but I then that would be more > awkwarness, and malloc/free to pass arguments between functions is not > efficient nor very elegant. > > So the "static" option looked like the simplest & most elegant version. Surely that trick breaks if you have more than one -f switch, no? Oh, I see what you're doing: you only use the command list, which is allocated, so it doesn't matter that the rest of the struct changes later. That seems rather nasty to me -- I'd avoid that. I'm not concerned about freeing the struct; what's the problem with it surviving until the program terminates? If somebody specifies thousands of -f switches, they will waste a few bytes with each, but I'm hardly concerned about a few dozen kilobytes there ... -- Álvaro Herrerahttp://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] pgbench stats per script & other stuff
*-21.patch does what you suggested above, some hidden awkwardness but much less that the previous one. Yeah, I think this is much nicer, don't you agree? Yep, I said "less awkwarness than previous", a pretty contrived way to say better:-) However, this is still a bit broken -- you cannot return a stack variable from process_file, because the stack goes away once the function returns. You need to malloc it. That is why the "fs" variable in process_file is declared "static", and why I wrote "some hidden awkwarness". I did want to avoid a malloc because then who would free the struct? addScript cannot to it systematically because builtins are static. Or it would have to create an on purpose struct, but I then that would be more awkwarness, and malloc/free to pass arguments between functions is not efficient nor very elegant. So the "static" option looked like the simplest & most elegant version. Also, you forgot to update the comments in process_file, process_builtin, etc. Indeed. v22 attached with better comments. -- Fabien.diff --git a/doc/src/sgml/ref/pgbench.sgml b/doc/src/sgml/ref/pgbench.sgml index cc80b3f..dd3fb1d 100644 --- a/doc/src/sgml/ref/pgbench.sgml +++ b/doc/src/sgml/ref/pgbench.sgml @@ -262,11 +262,13 @@ pgbench options dbname - -b scriptname - --builtin scriptname + -b scriptname[@weight] + --builtin=scriptname[@weight] Add the specified builtin script to the list of executed scripts. +An optional integer weight after @ allows to adjust the +probability of drawing the script. If not specified, it is set to 1. Available builtin scripts are: tpcb-like, simple-update and select-only. Unambiguous prefixes of builtin names are accepted. @@ -322,12 +324,14 @@ pgbench options dbname - -f filename - --file=filename + -f filename[@weight] + --file=filename[@weight] Add a transaction script read from filename to the list of executed scripts. +An optional integer weight after @ allows to adjust the +probability of drawing the test. See below for details. @@ -687,9 +691,13 @@ pgbench options dbname What is the Transaction Actually Performed in pgbench? - Pgbench executes test scripts chosen randomly from a specified list. + pgbench executes test scripts chosen randomly + from a specified list. They include built-in scripts with -b and user-provided custom scripts with -f. + Each script may be given a relative weight specified after a + @ so as to change its drawing probability. + The default weight is 1. @@ -1194,12 +1202,11 @@ number of clients: 10 number of threads: 1 number of transactions per client: 1000 number of transactions actually processed: 1/1 +latency average = 15.844 ms +latency stddev = 2.715 ms tps = 618.764555 (including connections establishing) tps = 622.977698 (excluding connections establishing) -SQL script 1: builtin: TPC-B (sort of) - - 1 transactions (100.0% of total, tps = 618.764555) - - latency average = 15.844 ms - - latency stddev = 2.715 ms +script statistics: - statement latencies in milliseconds: 0.004386\set nbranches 1 * :scale 0.001343\set ntellers 10 * :scale diff --git a/src/bin/pgbench/pgbench.c b/src/bin/pgbench/pgbench.c index 8b0b17a..d7af86e 100644 --- a/src/bin/pgbench/pgbench.c +++ b/src/bin/pgbench/pgbench.c @@ -38,6 +38,7 @@ #include "portability/instr_time.h" #include +#include #include #include #include @@ -179,6 +180,8 @@ char *login = NULL; char *dbName; const char *progname; +#define WSEP '@'/* weight separator */ + volatile bool timer_exceeded = false; /* flag from signal handler */ /* variable definitions */ @@ -300,23 +303,27 @@ typedef struct static struct { const char *name; + int weight; Command **commands; StatsData stats; } sql_script[MAX_SCRIPTS]; /* SQL script files */ static int num_scripts; /* number of scripts in sql_script[] */ static int num_commands = 0; /* total number of Command structs */ +static int64 total_weight = 0; + static int debug = 0; /* debug flag */ /* Define builtin test scripts */ -#define N_BUILTIN 3 -static struct +typedef struct script_t { char *name; /* very short name for -b ... */ char *desc; /* short description */ - char *commands; /* actual pgbench script */ -} + char *script; /* actual pgbench script */ + Command **commands; /* temporary intermediate holder */ +} script_t; - builtin_script[] = +#define N_BUILTIN 3 +static script_t builtin_script[] = { { "tpcb-like", @@ -334,7 +341,8 @@ static struct "UPDATE pgbench_tellers SET tbalance = tbalance + :delta WHERE tid = :tid;\n" "UPDATE pgbench_branches SET bbalance = bbalance + :delta WHERE bid = :bid;\n" "INSERT INTO
Re: [HACKERS] pgbench stats per script & other stuff
Fabien COELHO wrote: Hi, > *-21.patch does what you suggested above, some hidden awkwardness > but much less that the previous one. Yeah, I think this is much nicer, don't you agree? However, this is still a bit broken -- you cannot return a stack variable from process_file, because the stack goes away once the function returns. You need to malloc it. Also, you forgot to update the comments in process_file, process_builtin, etc. -- Álvaro Herrerahttp://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] pgbench stats per script & other stuff
Hello Alvaro, I looked at 19.d and I think the design has gotten pretty convoluted. I think we could simplify with the following changes: struct script_t gets a new member, of type Command **, which is initially null. function process_builtin receives the complete script_t (not individual memebers of it) constructs the Command ** array and puts it in script_t's new member; return value is the same script_t struct it got (except it's now augmented with the Command **array). function process_file constructs a new script_t from the string list, creates its Command **array just like process_builtin and returns the constructed struct. function addScript receives script_t instead of individual members of it, and does the appropriate thing. Why not. Here are two versions: *-20.patch is the initial rebased version *-21.patch does what you suggested above, some hidden awkwardness but much less that the previous one. -- Fabiendiff --git a/doc/src/sgml/ref/pgbench.sgml b/doc/src/sgml/ref/pgbench.sgml index cc80b3f..dd3fb1d 100644 --- a/doc/src/sgml/ref/pgbench.sgml +++ b/doc/src/sgml/ref/pgbench.sgml @@ -262,11 +262,13 @@ pgbench options dbname - -b scriptname - --builtin scriptname + -b scriptname[@weight] + --builtin=scriptname[@weight] Add the specified builtin script to the list of executed scripts. +An optional integer weight after @ allows to adjust the +probability of drawing the script. If not specified, it is set to 1. Available builtin scripts are: tpcb-like, simple-update and select-only. Unambiguous prefixes of builtin names are accepted. @@ -322,12 +324,14 @@ pgbench options dbname - -f filename - --file=filename + -f filename[@weight] + --file=filename[@weight] Add a transaction script read from filename to the list of executed scripts. +An optional integer weight after @ allows to adjust the +probability of drawing the test. See below for details. @@ -687,9 +691,13 @@ pgbench options dbname What is the Transaction Actually Performed in pgbench? - Pgbench executes test scripts chosen randomly from a specified list. + pgbench executes test scripts chosen randomly + from a specified list. They include built-in scripts with -b and user-provided custom scripts with -f. + Each script may be given a relative weight specified after a + @ so as to change its drawing probability. + The default weight is 1. @@ -1194,12 +1202,11 @@ number of clients: 10 number of threads: 1 number of transactions per client: 1000 number of transactions actually processed: 1/1 +latency average = 15.844 ms +latency stddev = 2.715 ms tps = 618.764555 (including connections establishing) tps = 622.977698 (excluding connections establishing) -SQL script 1: builtin: TPC-B (sort of) - - 1 transactions (100.0% of total, tps = 618.764555) - - latency average = 15.844 ms - - latency stddev = 2.715 ms +script statistics: - statement latencies in milliseconds: 0.004386\set nbranches 1 * :scale 0.001343\set ntellers 10 * :scale diff --git a/src/bin/pgbench/pgbench.c b/src/bin/pgbench/pgbench.c index 8b0b17a..c131681 100644 --- a/src/bin/pgbench/pgbench.c +++ b/src/bin/pgbench/pgbench.c @@ -38,6 +38,7 @@ #include "portability/instr_time.h" #include +#include #include #include #include @@ -179,6 +180,8 @@ char *login = NULL; char *dbName; const char *progname; +#define WSEP '@'/* weight separator */ + volatile bool timer_exceeded = false; /* flag from signal handler */ /* variable definitions */ @@ -300,23 +303,26 @@ typedef struct static struct { const char *name; + int weight; Command **commands; StatsData stats; } sql_script[MAX_SCRIPTS]; /* SQL script files */ static int num_scripts; /* number of scripts in sql_script[] */ static int num_commands = 0; /* total number of Command structs */ +static int64 total_weight = 0; + static int debug = 0; /* debug flag */ /* Define builtin test scripts */ -#define N_BUILTIN 3 -static struct +typedef struct script_t { char *name; /* very short name for -b ... */ char *desc; /* short description */ char *commands; /* actual pgbench script */ -} +} script_t; - builtin_script[] = +#define N_BUILTIN 3 +static script_t builtin_script[] = { { "tpcb-like", @@ -392,9 +398,9 @@ usage(void) " --tablespace=TABLESPACE create tables in the specified tablespace\n" " --unlogged-tablescreate tables as unlogged tables\n" "\nOptions to select what to run:\n" - " -b, --builtin=NAME add buitin script (use \"-b list\" to display\n" - " available scripts)\n" - " -f, --file=FILENAME add transaction script from FILENAME\n" + " -b,
Re: [HACKERS] pgbench stats per script & other stuff
Michael Paquier wrote: > On Tue, Feb 9, 2016 at 4:22 AM, Fabien COELHOwrote: > > Hmmm, I type them and I'm not so good with a keyboard, so "se" is better > > than: > > > > "selct-onlyect-only". > > I can understand that feeling. Pushed 19-e, thanks. -- Álvaro Herrerahttp://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] pgbench stats per script & other stuff
I looked at 19.d and I think the design has gotten pretty convoluted. I think we could simplify with the following changes: struct script_t gets a new member, of type Command **, which is initially null. function process_builtin receives the complete script_t (not individual memebers of it) constructs the Command ** array and puts it in script_t's new member; return value is the same script_t struct it got (except it's now augmented with the Command **array). function process_file constructs a new script_t from the string list, creates its Command **array just like process_builtin and returns the constructed struct. function addScript receives script_t instead of individual members of it, and does the appropriate thing. Alternatively, we could have a different struct that's defined to carry only the Command ** array (not the command string array) and is returned by both process_builtin and process_file. Perhaps we could also put the script weight in there. With this arrangement we don't need to typedef script_t at all and we can just keep it as an anonymous struct as today. This is what I tried to describe earlier, but obviously I wasn't clear enough. Thanks, -- Álvaro Herrerahttp://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] pgbench stats per script & other stuff
On Tue, Feb 9, 2016 at 4:22 AM, Fabien COELHOwrote: >> + /* compute total_weight */ >> + for (i = 0; i < num_scripts; i++) >> + { >> + total_weight += sql_script[i].weight; >> + >> + /* detect overflow... */ >> If let as int64, you may want to remove this overflow check, or keep >> it as int32. > > > I'd rather keep int64, and remove the check. OK, and you did so. Thanks. >>> [JSON/YAML] >>> If you want something else, it would help to provide a sample of what you >>> expect. >> >> You could do that with an additional option here as well: >> --output-format=normal|yamljson. The tastes of each user is different. > > I think that json/yaml-ifying pgbench output is beyond the object of this > patch, so should be kept out? Yeah, that's just a free idea that this set of patches does not need to address. If someone thinks that's worth it, feel free to submit a patch, perhaps we could add a TODO item on the wiki. Regarding the output generated by your patch, I think that's fine. Perhaps Alvaro has other thoughts on the matter. I don't know this part. >>> Find attached a 18-d which addresses these concerns, and a actualized >>> 18-e >>> for the prefix. >> >> >> (I could live without that personally) > > Hmmm, I type them and I'm not so good with a keyboard, so "se" is better > than: > > "selct-onlyect-only". I can understand that feeling. >> -/* return builtin script "name", or fail if not found */ >> +/* return commands for selected builtin script, if unambiguous */ >> static script_t * >> findBuiltin(const char *name) >> This comment needs a refresh. This does not return a set of commands, >> but the script itself. > > Indeed. > > Attached 19-d and 19-e. +/* return builtin script "name", or fail if not found */ builtin does not sound like correct English to me, but built-in is. -- Michael -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] pgbench stats per script & other stuff
Hi Michaël, Attached 19-d and 19-e. +/* return builtin script "name", or fail if not found */ builtin does not sound like correct English to me, but built-in is. I notice that "man bash" uses "builtin" extensively, so I think it is okay like that, but I would be fine as well with "built-in". I suggest to let it as is unless some native speaker really requires "built-in", in which case there would be many places to update, so that would be another orthographic-oriented patch:-) -- Fabien. -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] pgbench stats per script & other stuff
On Fri, Feb 5, 2016 at 12:53 AM, Fabien COELHOwrote: >> Something is wrong with patch d. I noticed two things, >> 1. the total_weight stuff can overflow, > > It can generate an error on overflow by checking the total_weight while it > is being computed. I've switched total_weight to int64 so it is now really > impossible to overflow with the 32 bit int_max limit on weight. + /* compute total_weight */ + for (i = 0; i < num_scripts; i++) + { + total_weight += sql_script[i].weight; + + /* detect overflow... */ + if (total_weight < 0) + { + fprintf(stderr, "script weight overflow at script %d\n", i+1); + exit(1); + } + } If let as int64, you may want to remove this overflow check, or keep it as int32. >> 2. the chooseScript stuff is broken, or something. > > Sorry, probably a <=/< error. I think I've fixed it and I've simplified the > code a little bit. + w = getrand(thread, 0, total_weight - 1); + do + { + w -= sql_script[i++].weight; + } while (w >= 0); + + return i - 1; This portion looks fine to me. >> Another thing is that the "transaction type" output really deserves some >> more work. I think "multiple scripts" really doesn't cut it; we should >> have some YAML-like as in the latency reports, which lists all scripts >> in use and their weights. > > For me the current output is clear for the reader, which does not mean that > it cannot be improve, but how is rather a matter of taste. > > I've tried to improve it further, see attached patch. > > If you want something else, it would help to provide a sample of what you > expect. You could do that with an additional option here as well: --output-format=normal|yamljson. The tastes of each user is different. >> Attached is my version of the patch. While you're messing with it, it'd >> be nice if you added comments on top of your recently added functions >> such as findBuiltin, process_builtin, chooseScript. > Why not. const char *name; + int weight; Command **commands; - StatsData stats; + StatsData stats; Noise here? > Find attached a 18-d which addresses these concerns, and a actualized 18-e > for the prefix. (I could live without that personally) -/* return builtin script "name", or fail if not found */ +/* return commands for selected builtin script, if unambiguous */ static script_t * findBuiltin(const char *name) This comment needs a refresh. This does not return a set of commands, but the script itself. -- Michael -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] pgbench stats per script & other stuff
Hello Michaël, + /* compute total_weight */ + for (i = 0; i < num_scripts; i++) + { + total_weight += sql_script[i].weight; + + /* detect overflow... */ If let as int64, you may want to remove this overflow check, or keep it as int32. I'd rather keep int64, and remove the check. [JSON/YAML] If you want something else, it would help to provide a sample of what you expect. You could do that with an additional option here as well: --output-format=normal|yamljson. The tastes of each user is different. I think that json/yaml-ifying pgbench output is beyond the object of this patch, so should be kept out? const char *name; + int weight; Command **commands; - StatsData stats; + StatsData stats; Noise here? Indeed. Find attached a 18-d which addresses these concerns, and a actualized 18-e for the prefix. (I could live without that personally) Hmmm, I type them and I'm not so good with a keyboard, so "se" is better than: "selct-onlyect-only". -/* return builtin script "name", or fail if not found */ +/* return commands for selected builtin script, if unambiguous */ static script_t * findBuiltin(const char *name) This comment needs a refresh. This does not return a set of commands, but the script itself. Indeed. Attached 19-d and 19-e. -- Fabien.diff --git a/doc/src/sgml/ref/pgbench.sgml b/doc/src/sgml/ref/pgbench.sgml index ade1b53..780a520 100644 --- a/doc/src/sgml/ref/pgbench.sgml +++ b/doc/src/sgml/ref/pgbench.sgml @@ -262,11 +262,13 @@ pgbench options dbname - -b scriptname - --builtin scriptname + -b scriptname[@weight] + --builtin=scriptname[@weight] Add the specified builtin script to the list of executed scripts. +An optional integer weight after @ allows to adjust the +probability of drawing the script. If not specified, it is set to 1. Available builtin scripts are: tpcb-like, simple-update and select-only. With special name list, show the list of builtin scripts @@ -321,12 +323,14 @@ pgbench options dbname - -f filename - --file=filename + -f filename[@weight] + --file=filename[@weight] Add a transaction script read from filename to the list of executed scripts. +An optional integer weight after @ allows to adjust the +probability of drawing the test. See below for details. @@ -686,9 +690,13 @@ pgbench options dbname What is the Transaction Actually Performed in pgbench? - Pgbench executes test scripts chosen randomly from a specified list. + pgbench executes test scripts chosen randomly + from a specified list. They include built-in scripts with -b and user-provided custom scripts with -f. + Each script may be given a relative weight specified after a + @ so as to change its drawing probability. + The default weight is 1. @@ -1135,12 +1143,11 @@ number of clients: 10 number of threads: 1 number of transactions per client: 1000 number of transactions actually processed: 1/1 +latency average = 15.844 ms +latency stddev = 2.715 ms tps = 618.764555 (including connections establishing) tps = 622.977698 (excluding connections establishing) -SQL script 1: builtin: TPC-B (sort of) - - 1 transactions (100.0% of total, tps = 618.764555) - - latency average = 15.844 ms - - latency stddev = 2.715 ms +script statistics: - statement latencies in milliseconds: 0.004386\set nbranches 1 * :scale 0.001343\set ntellers 10 * :scale diff --git a/src/bin/pgbench/pgbench.c b/src/bin/pgbench/pgbench.c index 7eb6a2d..a73e289 100644 --- a/src/bin/pgbench/pgbench.c +++ b/src/bin/pgbench/pgbench.c @@ -38,6 +38,7 @@ #include "portability/instr_time.h" #include +#include #include #include #include @@ -179,6 +180,8 @@ char *login = NULL; char *dbName; const char *progname; +#define WSEP '@'/* weight separator */ + volatile bool timer_exceeded = false; /* flag from signal handler */ /* variable definitions */ @@ -299,23 +302,26 @@ typedef struct static struct { const char *name; + int weight; Command **commands; StatsData stats; } sql_script[MAX_SCRIPTS]; /* SQL script files */ static int num_scripts; /* number of scripts in sql_script[] */ static int num_commands = 0; /* total number of Command structs */ +static int64 total_weight = 0; + static int debug = 0; /* debug flag */ /* Define builtin test scripts */ -#define N_BUILTIN 3 -static struct +typedef struct script_t { char *name; /* very short name for -b ... */ char *desc; /* short description */ char *commands; /* actual pgbench script */ -} +} script_t; - builtin_script[] = +#define N_BUILTIN 3 +static script_t builtin_script[] = { { "tpcb-like", @@
Re: [HACKERS] pgbench stats per script & other stuff
I closed this one as "committed", since we pushed a bunch of parts. Please submit the two remaining ones to the next commitfest. -- Álvaro Herrerahttp://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] pgbench stats per script & other stuff
Hello Alvaro, Something is wrong with patch d. I noticed two things, 1. the total_weight stuff can overflow, It can generate an error on overflow by checking the total_weight while it is being computed. I've switched total_weight to int64 so it is now really impossible to overflow with the 32 bit int_max limit on weight. 2. the chooseScript stuff is broken, or something. Sorry, probably a <=/< error. I think I've fixed it and I've simplified the code a little bit. Another thing is that the "transaction type" output really deserves some more work. I think "multiple scripts" really doesn't cut it; we should have some YAML-like as in the latency reports, which lists all scripts in use and their weights. For me the current output is clear for the reader, which does not mean that it cannot be improve, but how is rather a matter of taste. I've tried to improve it further, see attached patch. If you want something else, it would help to provide a sample of what you expect. Also, while I have your attention regarding accumulated "technical debt", please have a look at the "desc" argument used in addScript etc. It's pretty ridiculous currently. Maybe findBuiltin / process_builtin / process_file should return a struct containing Command ** and the "desc" string, rather than passing desc as a separate argument. Ok, it can return a pointer to the builtin script. Attached is my version of the patch. While you're messing with it, it'd be nice if you added comments on top of your recently added functions such as findBuiltin, process_builtin, chooseScript. Why not. Find attached a 18-d which addresses these concerns, and a actualized 18-e for the prefix. -- Fabien.diff --git a/doc/src/sgml/ref/pgbench.sgml b/doc/src/sgml/ref/pgbench.sgml index ade1b53..780a520 100644 --- a/doc/src/sgml/ref/pgbench.sgml +++ b/doc/src/sgml/ref/pgbench.sgml @@ -262,11 +262,13 @@ pgbench options dbname - -b scriptname - --builtin scriptname + -b scriptname[@weight] + --builtin=scriptname[@weight] Add the specified builtin script to the list of executed scripts. +An optional integer weight after @ allows to adjust the +probability of drawing the script. If not specified, it is set to 1. Available builtin scripts are: tpcb-like, simple-update and select-only. With special name list, show the list of builtin scripts @@ -321,12 +323,14 @@ pgbench options dbname - -f filename - --file=filename + -f filename[@weight] + --file=filename[@weight] Add a transaction script read from filename to the list of executed scripts. +An optional integer weight after @ allows to adjust the +probability of drawing the test. See below for details. @@ -686,9 +690,13 @@ pgbench options dbname What is the Transaction Actually Performed in pgbench? - Pgbench executes test scripts chosen randomly from a specified list. + pgbench executes test scripts chosen randomly + from a specified list. They include built-in scripts with -b and user-provided custom scripts with -f. + Each script may be given a relative weight specified after a + @ so as to change its drawing probability. + The default weight is 1. @@ -1135,12 +1143,11 @@ number of clients: 10 number of threads: 1 number of transactions per client: 1000 number of transactions actually processed: 1/1 +latency average = 15.844 ms +latency stddev = 2.715 ms tps = 618.764555 (including connections establishing) tps = 622.977698 (excluding connections establishing) -SQL script 1: builtin: TPC-B (sort of) - - 1 transactions (100.0% of total, tps = 618.764555) - - latency average = 15.844 ms - - latency stddev = 2.715 ms +script statistics: - statement latencies in milliseconds: 0.004386\set nbranches 1 * :scale 0.001343\set ntellers 10 * :scale diff --git a/src/bin/pgbench/pgbench.c b/src/bin/pgbench/pgbench.c index 7eb6a2d..da4f05c 100644 --- a/src/bin/pgbench/pgbench.c +++ b/src/bin/pgbench/pgbench.c @@ -38,6 +38,7 @@ #include "portability/instr_time.h" #include +#include #include #include #include @@ -179,6 +180,8 @@ char *login = NULL; char *dbName; const char *progname; +#define WSEP '@'/* weight separator */ + volatile bool timer_exceeded = false; /* flag from signal handler */ /* variable definitions */ @@ -299,23 +302,26 @@ typedef struct static struct { const char *name; + int weight; Command **commands; - StatsData stats; + StatsData stats; } sql_script[MAX_SCRIPTS]; /* SQL script files */ static int num_scripts; /* number of scripts in sql_script[] */ static int num_commands = 0; /* total number of Command structs */ +static int64 total_weight = 0; + static int debug = 0; /* debug flag */ /* Define builtin test
Re: [HACKERS] pgbench stats per script & other stuff
Hello Michaël, Two rebase attached. - 15-e: prefix selection for -b - if (strncmp(builtin_script[i].name, name, - strlen(builtin_script[i].name)) == 0) + if (strncmp(builtin_script[i].name, name, len) == 0) I agree with Alvaro here: this should remain unchanged. It seems to be a rebase mistake. I do not understand. I tested it and it works as expected. If I put the above strlen instead the suffix detection does not work: fabien@sto:bin/pgbench> git diff diff --git a/src/bin/pgbench/pgbench.c b/src/bin/pgbench/pgbench.c index 783cbf9..0c33012 100644 --- a/src/bin/pgbench/pgbench.c +++ b/src/bin/pgbench/pgbench.c @@ -2684,7 +2684,7 @@ findBuiltin(const char *name, char **desc) for (i = 0; i < N_BUILTIN; i++) { - if (strncmp(builtin_script[i].name, name, len) == 0) + if (strncmp(builtin_script[i].name, name, strlen(builtin_script[i].name)) == 0) { *desc = builtin_script[i].desc; commands = builtin_script[i].commands; ./pgbench -b t no builtin script found for name "t" Available builtin scripts: tpcb-like simple-update select-only Indeed, then it can only match if the provided "name" is as long as the recorded len. The point of the "suffix" selection is to align to the short supplied string. -- Fabien.diff --git a/doc/src/sgml/ref/pgbench.sgml b/doc/src/sgml/ref/pgbench.sgml index ade1b53..ca3e158 100644 --- a/doc/src/sgml/ref/pgbench.sgml +++ b/doc/src/sgml/ref/pgbench.sgml @@ -262,11 +262,13 @@ pgbench options dbname - -b scriptname - --builtin scriptname + -b scriptname[@weight] + --builtin scriptname[@weight] Add the specified builtin script to the list of executed scripts. +An optional integer weight after @ allows to adjust the +probability of drawing the test. Available builtin scripts are: tpcb-like, simple-update and select-only. With special name list, show the list of builtin scripts @@ -321,12 +323,14 @@ pgbench options dbname - -f filename - --file=filename + -f filename[@weight] + --file=filename[@weight] Add a transaction script read from filename to the list of executed scripts. +An optional integer weight after @ allows to adjust the +probability of drawing the test. See below for details. @@ -689,6 +693,9 @@ pgbench options dbname Pgbench executes test scripts chosen randomly from a specified list. They include built-in scripts with -b and user-provided custom scripts with -f. + Each script may be given a relative weight specified after a + @ so as to change its drawing probability. + The default weight is 1. @@ -1137,7 +1144,7 @@ number of transactions per client: 1000 number of transactions actually processed: 1/1 tps = 618.764555 (including connections establishing) tps = 622.977698 (excluding connections establishing) -SQL script 1: builtin: TPC-B (sort of) +SQL script 1, weight 1: builtin: TPC-B (sort of) - 1 transactions (100.0% of total, tps = 618.764555) - latency average = 15.844 ms - latency stddev = 2.715 ms diff --git a/src/bin/pgbench/pgbench.c b/src/bin/pgbench/pgbench.c index 7eb6a2d..5d24768 100644 --- a/src/bin/pgbench/pgbench.c +++ b/src/bin/pgbench/pgbench.c @@ -179,6 +179,8 @@ char *login = NULL; char *dbName; const char *progname; +#define WSEP '@' /* weight separator */ + volatile bool timer_exceeded = false; /* flag from signal handler */ /* variable definitions */ @@ -299,11 +301,14 @@ typedef struct static struct { const char *name; + int weight; Command **commands; StatsData stats; } sql_script[MAX_SCRIPTS]; /* SQL script files */ static int num_scripts; /* number of scripts in sql_script[] */ static int num_commands = 0; /* total number of Command structs */ +static int total_weight = 0; + static int debug = 0; /* debug flag */ /* Define builtin test scripts */ @@ -389,9 +394,9 @@ usage(void) " --tablespace=TABLESPACE create tables in the specified tablespace\n" " --unlogged-tablescreate tables as unlogged tables\n" "\nOptions to select what to run:\n" - " -b, --builtin=NAME add buitin script (use \"-b list\" to display\n" - " available scripts)\n" - " -f, --file=FILENAME add transaction script from FILENAME\n" + " -b, --builtin=NAME[@W] add weighted buitin script (use \"-b list\"\n" + " to display available scripts)\n" + " -f, --file=FILENAME[@W] add weighted transaction script from FILENAME\n" " -N, --skip-some-updates skip updates of pgbench_tellers and pgbench_branches\n" "
Re: [HACKERS] pgbench stats per script & other stuff
On Fri, Jan 29, 2016 at 11:28 PM, Fabien COELHOwrote: > Here is a rebase of the 3 remaining parts: > - 15-c: per script stats > - 15-d: weighted scripts > - 15-e: prefix selection for -b Regarding patch d. + /* compute total_weight */ + for (i = 0; i < num_scripts; i++) + total_weight += sql_script[i].weight; total_weight can overflow :) I don't think that's worth worrying, I am just noticing that. +The provided scriptname needs only be an unambiguous +prefix of the builtin name, hence si would be enough to +select simple-update. [...] - if (strncmp(builtin_script[i].name, name, - strlen(builtin_script[i].name)) == 0) + if (strncmp(builtin_script[i].name, name, len) == 0) I agree with Alvaro here: this should remain unchanged. It seems to be a rebase mistake. -- Michael -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] pgbench stats per script & other stuff
Hello Alvaro, Thanks for the progress! I pushed this, along with a few more tweaks, mostly adding comments and moving functions so that related things are together. I hope I didn't break anything. Looks ok. Here is a rebase of the 3 remaining parts: - 15-c: per script stats - 15-d: weighted scripts - 15-e: prefix selection for -b -- Fabien.diff --git a/doc/src/sgml/ref/pgbench.sgml b/doc/src/sgml/ref/pgbench.sgml index 42d0667..ade1b53 100644 --- a/doc/src/sgml/ref/pgbench.sgml +++ b/doc/src/sgml/ref/pgbench.sgml @@ -1138,6 +1138,9 @@ number of transactions actually processed: 1/1 tps = 618.764555 (including connections establishing) tps = 622.977698 (excluding connections establishing) SQL script 1: builtin: TPC-B (sort of) + - 1 transactions (100.0% of total, tps = 618.764555) + - latency average = 15.844 ms + - latency stddev = 2.715 ms - statement latencies in milliseconds: 0.004386\set nbranches 1 * :scale 0.001343\set ntellers 10 * :scale diff --git a/src/bin/pgbench/pgbench.c b/src/bin/pgbench/pgbench.c index 44da3d1..64c7a6c 100644 --- a/src/bin/pgbench/pgbench.c +++ b/src/bin/pgbench/pgbench.c @@ -164,6 +164,7 @@ bool use_log; /* log transaction latencies to a file */ bool use_quiet; /* quiet logging onto stderr */ int agg_interval; /* log aggregates instead of individual * transactions */ +boolper_script_stats = false; /* whether to collect stats per script */ int progress = 0; /* thread progress report every this seconds */ bool progress_timestamp = false; /* progress report with Unix time */ int nclients = 1; /* number of clients */ @@ -299,6 +300,7 @@ static struct { const char *name; Command **commands; + StatsData stats; } sql_script[MAX_SCRIPTS]; /* SQL script files */ static int num_scripts; /* number of scripts in sql_script[] */ static int num_commands = 0; /* total number of Command structs */ @@ -1326,7 +1328,7 @@ top: /* transaction finished: calculate latency and log the transaction */ if (commands[st->state + 1] == NULL) { - if (progress || throttle_delay || latency_limit || logfile) + if (progress || throttle_delay || latency_limit || per_script_stats || logfile) processXactStats(thread, st, , false, logfile, agg); else thread->stats.cnt++; @@ -1419,7 +1421,7 @@ top: } /* Record transaction start time under logging, progress or throttling */ - if ((logfile || progress || throttle_delay || latency_limit) && st->state == 0) + if ((logfile || progress || throttle_delay || latency_limit || per_script_stats) && st->state == 0) { INSTR_TIME_SET_CURRENT(st->txn_begin); @@ -1872,6 +1874,9 @@ processXactStats(TState *thread, CState *st, instr_time *now, if (use_log) doLog(thread, st, logfile, now, agg, skipped, latency, lag); + + if (per_script_stats) /* mutex? hmmm... these are only statistics */ + accumStats(& sql_script[st->use_file].stats, skipped, latency, lag); } @@ -2661,6 +2666,7 @@ addScript(const char *name, Command **commands) sql_script[num_scripts].name = name; sql_script[num_scripts].commands = commands; + initStats(& sql_script[num_scripts].stats, 0.0); num_scripts++; } @@ -2744,22 +2750,40 @@ printResults(TState *threads, StatsData *total, instr_time total_time, printf("tps = %f (including connections establishing)\n", tps_include); printf("tps = %f (excluding connections establishing)\n", tps_exclude); - /* Report per-command latencies */ - if (is_latencies) + /* Report per-script stats */ + if (per_script_stats) { int i; for (i = 0; i < num_scripts; i++) { - Command **commands; + printf("SQL script %d: %s\n" + " - "INT64_FORMAT" transactions (%.1f%% of total, tps = %f)\n", + i + 1, sql_script[i].name, + sql_script[i].stats.cnt, + 100.0 * sql_script[i].stats.cnt / total->cnt, + sql_script[i].stats.cnt / time_include); - printf("SQL script %d: %s\n", i + 1, sql_script[i].name); - printf(" - statement latencies in milliseconds:\n"); + if (latency_limit) +printf(" - number of transactions skipped: "INT64_FORMAT" (%.3f%%)\n", + sql_script[i].stats.skipped, + 100.0 * sql_script[i].stats.skipped / + (sql_script[i].stats.skipped + sql_script[i].stats.cnt)); - for (commands = sql_script[i].commands; *commands != NULL; commands++) -printf(" %11.3f %s\n", - 1000.0 * (*commands)->stats.sum / (*commands)->stats.count, - (*commands)->line); + printSimpleStats(" - latency", & sql_script[i].stats.latency); + + /* Report per-command latencies */ + if (is_latencies) + { +Command ** com; + +printf(" - statement latencies in milliseconds:\n"); + +for (com = sql_script[i].commands; *com != NULL; com++) + printf(" %11.3f %s\n", + 1000.0 * (*com)->stats.sum / (*com)->stats.count, + (*com)->line); + } } } } @@ -2945,6 +2969,7 @@ main(int argc, char **argv)
Re: [HACKERS] pgbench stats per script & other stuff
Fabien COELHO wrote: > The answer is essentially yes, the field is needed for the "aggregated" mode > where this specific behavior is used. OK, thanks, that looks better to me. Can you now appreciate why I asked for split patches? If I had to go over the big patch I probably wouldn't have been able to read through each to make sense of it. I pushed this, along with a few more tweaks, mostly adding comments and moving functions so that related things are together. I hope I didn't break anything. -- Álvaro Herrerahttp://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] pgbench stats per script & other stuff
Hello again, Obviously this would work. I did not think the special case was worth the extra argument. This one has some oddity too, because the second argument is ignored depending on the third. Do as you feel. Actually my question was whether keeping the original start_time was the intended design. Sorry I misunderstood the question. The answer is essentially yes, the field is needed for the "aggregated" mode where this specific behavior is used. However, after some look at the code I think that it is possible to do without. I also spotted an small issue under low tps where the last aggregation was not shown. With the attached version these problems have been removed, no conditional initialization. There is also a small diff with the version you sent. -- Fabien.diff --git a/src/bin/pgbench/pgbench.c b/src/bin/pgbench/pgbench.c index d5f242c..b3fe994 100644 --- a/src/bin/pgbench/pgbench.c +++ b/src/bin/pgbench/pgbench.c @@ -166,10 +166,8 @@ int agg_interval; /* log aggregates instead of individual * transactions */ int progress = 0; /* thread progress report every this seconds */ bool progress_timestamp = false; /* progress report with Unix time */ -int progress_nclients = 0; /* number of clients for progress - * report */ -int progress_nthreads = 0; /* number of threads for progress - * report */ +int nclients = 1; /* number of clients */ +int nthreads = 1; /* number of threads */ bool is_connect; /* establish connection for each transaction */ bool is_latencies; /* report per-command latencies */ int main_pid; /* main process id used in log filename */ @@ -193,6 +191,35 @@ typedef struct #define SHELL_COMMAND_SIZE 256 /* maximum size allowed for shell command */ /* + * Simple data structure to keep stats about something. + * + * XXX probably the first value should be kept and used as an offset for + * better numerical stability... + */ +typedef struct +{ + int64 count; /* how many values were encountered */ + double min; /* the minimum seen */ + double max; /* the maximum seen */ + double sum; /* sum of values */ + double sum2; /* sum of squared values */ +} SimpleStats; + +/* + * Data structure to hold various statistics, used for interval statistics as + * well as file statistics. + */ +typedef struct +{ + long start_time; /* interval start time, for aggregates */ + int64 cnt; /* number of transactions */ + int64 skipped; /* number of transactions skipped under --rate + * and --latency-limit */ + SimpleStats latency; + SimpleStats lag; +} StatsData; + +/* * Connection state */ typedef struct @@ -213,10 +240,8 @@ typedef struct bool prepared[MAX_SCRIPTS]; /* whether client prepared the script */ /* per client collected stats */ - int cnt; /* xacts count */ + int64 cnt; /* transaction count */ int ecnt; /* error count */ - int64 txn_latencies; /* cumulated latencies */ - int64 txn_sqlats; /* cumulated square latencies */ } CState; /* @@ -228,19 +253,14 @@ typedef struct pthread_t thread; /* thread handle */ CState *state; /* array of CState */ int nstate; /* length of state[] */ - instr_time start_time; /* thread start time */ - instr_time *exec_elapsed; /* time spent executing cmds (per Command) */ - int *exec_count; /* number of cmd executions (per Command) */ unsigned short random_state[3]; /* separate randomness for each thread */ int64 throttle_trigger; /* previous/next throttling (us) */ /* per thread collected stats */ + instr_time start_time; /* thread start time */ instr_time conn_time; - int64 throttle_lag; /* total transaction lag behind throttling */ - int64 throttle_lag_max; /* max transaction lag */ - int64 throttle_latency_skipped; /* lagging transactions - * skipped */ - int64 latency_late; /* late transactions */ + StatsData stats; + int64 latency_late; /* executed but late transactions */ } TState; #define INVALID_THREAD ((pthread_t) 0) @@ -272,33 +292,14 @@ typedef struct char *argv[MAX_ARGS]; /* command word list */ int cols[MAX_ARGS]; /* corresponding column starting from 1 */ PgBenchExpr *expr; /* parsed expression */ + SimpleStats stats; /* time spent in this command */ } Command; -typedef struct -{ - - long start_time; /* when does the interval start */ - int cnt; /* number of transactions */ - int skipped; /* number of transactions skipped under --rate - * and --latency-limit */ - - double min_latency; /* min/max latencies */ - double max_latency; - double sum_latency; /* sum(latency), sum(latency^2) - for - * estimates */ - double sum2_latency; - - double min_lag; - double max_lag; - double sum_lag; /* sum(lag) */ - double sum2_lag; /* sum(lag*lag) */ -} AggVals; - static struct { const char *name; - Command **commands; -} sql_script[MAX_SCRIPTS]; /* SQL script files */ + Command **commands; +}
Re: [HACKERS] pgbench stats per script & other stuff
Fabien COELHO wrote: > >It seems a bit funny to have the start_time not be reset when 0.0 is > >passed, which is almost all the callers. Using a float as a boolean > >looks pretty odd; is that kosher? Maybe it'd be a good idea to have a > >separate boolean flag instead? > Obviously this would work. I did not think the special case was worth the > extra argument. This one has some oddity too, because the second argument is > ignored depending on the third. Do as you feel. Actually my question was whether keeping the original start_time was the intended design. I think some places are okay with keeping the original value, but the ones in addScript, the per-thread loop in main(), and the global one also in main() should all be getting a 0.0 instead of leaving the value uninitialized. (I did turn the arguments around so that the bool is second and the float is third. Thanks for the suggestion.) -- Álvaro Herrerahttp://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] pgbench stats per script & other stuff
Hello again, Here's part b rebased, pgindented and with some minor additional tweaks (mostly function commands and the function renames I mentioned). Patch looks ok to me, various tests where ok as well. Still concerned about the unlocked stat accums. See my arguments in other mail. I can add a lock if this is a blocker, but I think that it is actually better without, because of quantum: the measuring process should avoid affecting the measured data, and locking is not cheap. I haven't tried to rebase the other ones yet, they need manual conflict fixes. Find attached 14-c/d/e rebased patches. About e, for some obscure reason I failed in my initial attempt at inserting the misplaced options in their rightfull position in the option list. Sorry for the noise. -- Fabien.diff --git a/doc/src/sgml/ref/pgbench.sgml b/doc/src/sgml/ref/pgbench.sgml index 42d0667..ade1b53 100644 --- a/doc/src/sgml/ref/pgbench.sgml +++ b/doc/src/sgml/ref/pgbench.sgml @@ -1138,6 +1138,9 @@ number of transactions actually processed: 1/1 tps = 618.764555 (including connections establishing) tps = 622.977698 (excluding connections establishing) SQL script 1: builtin: TPC-B (sort of) + - 1 transactions (100.0% of total, tps = 618.764555) + - latency average = 15.844 ms + - latency stddev = 2.715 ms - statement latencies in milliseconds: 0.004386\set nbranches 1 * :scale 0.001343\set ntellers 10 * :scale diff --git a/src/bin/pgbench/pgbench.c b/src/bin/pgbench/pgbench.c index 305c319..5594d1c 100644 --- a/src/bin/pgbench/pgbench.c +++ b/src/bin/pgbench/pgbench.c @@ -164,6 +164,7 @@ bool use_log; /* log transaction latencies to a file */ bool use_quiet; /* quiet logging onto stderr */ int agg_interval; /* log aggregates instead of individual * transactions */ +boolper_script_stats = false; /* whether to collect stats per script */ int progress = 0; /* thread progress report every this seconds */ bool progress_timestamp = false; /* progress report with Unix time */ int nclients = 1; /* number of clients */ @@ -299,6 +300,7 @@ static struct { const char *name; Command **commands; + StatsData stats; } sql_script[MAX_SCRIPTS]; /* SQL script files */ static int num_scripts; /* number of scripts in sql_script[] */ static int num_commands = 0; /* total number of Command structs */ @@ -1308,7 +1310,7 @@ top: /* transaction finished: calculate latency and log the transaction */ if (commands[st->state + 1] == NULL) { - if (progress || throttle_delay || latency_limit || logfile) + if (progress || throttle_delay || latency_limit || per_script_stats || logfile) doTxStats(thread, st, , false, logfile, agg); else thread->stats.cnt++; @@ -1401,7 +1403,7 @@ top: } /* Record transaction start time under logging, progress or throttling */ - if ((logfile || progress || throttle_delay || latency_limit) && st->state == 0) + if ((logfile || progress || throttle_delay || latency_limit || per_script_stats) && st->state == 0) { INSTR_TIME_SET_CURRENT(st->txn_begin); @@ -1889,6 +1891,9 @@ doTxStats(TState *thread, CState *st, instr_time *now, if (use_log) doLog(thread, st, logfile, now, agg, skipped, latency, lag); + + if (per_script_stats) /* mutex? hmmm... these are only statistics */ + doStats(& sql_script[st->use_file].stats, skipped, latency, lag); } @@ -2678,6 +2683,7 @@ addScript(const char *name, Command **commands) sql_script[num_scripts].name = name; sql_script[num_scripts].commands = commands; + initStats(& sql_script[num_scripts].stats, 0.0); num_scripts++; } @@ -2761,22 +2767,40 @@ printResults(TState *threads, StatsData *total, instr_time total_time, printf("tps = %f (including connections establishing)\n", tps_include); printf("tps = %f (excluding connections establishing)\n", tps_exclude); - /* Report per-command latencies */ - if (is_latencies) + /* Report per-script stats */ + if (per_script_stats) { int i; for (i = 0; i < num_scripts; i++) { - Command **commands; + printf("SQL script %d: %s\n" + " - "INT64_FORMAT" transactions (%.1f%% of total, tps = %f)\n", + i+1, sql_script[i].name, + sql_script[i].stats.cnt, + 100.0 * sql_script[i].stats.cnt / total->cnt, + sql_script[i].stats.cnt / time_include); - printf("SQL script %d: %s\n", i + 1, sql_script[i].name); - printf(" - statement latencies in milliseconds:\n"); + if (latency_limit) +printf(" - number of transactions skipped: "INT64_FORMAT" (%.3f%%)\n", + sql_script[i].stats.skipped, + 100.0 * sql_script[i].stats.skipped / + (sql_script[i].stats.skipped + sql_script[i].stats.cnt)); - for (commands = sql_script[i].commands; *commands != NULL; commands++) -printf(" %11.3f %s\n", - 1000.0 * (*commands)->stats.sum / (*commands)->stats.count, - (*commands)->line); + printSimpleStats(" - latency", &
Re: [HACKERS] pgbench stats per script & other stuff
Hello again, If you want to implement real non-ambiguous-prefix code (i.e. have "se" for "select-only", but reject "s" as ambiguous) be my guest. I'm fine with filtering out ambiguous cases (i.e. just the "s" case). Attached a small patch for that. -- Fabien.diff --git a/doc/src/sgml/ref/pgbench.sgml b/doc/src/sgml/ref/pgbench.sgml index 42d0667..124e70d 100644 --- a/doc/src/sgml/ref/pgbench.sgml +++ b/doc/src/sgml/ref/pgbench.sgml @@ -269,6 +269,9 @@ pgbench options dbname Add the specified builtin script to the list of executed scripts. Available builtin scripts are: tpcb-like, simple-update and select-only. +The provided scriptname needs only be an unambiguous +prefix of the builtin name, hence si would be enough to +select simple-update. With special name list, show the list of builtin scripts and exit immediately. diff --git a/src/bin/pgbench/pgbench.c b/src/bin/pgbench/pgbench.c index d5f242c..6350948 100644 --- a/src/bin/pgbench/pgbench.c +++ b/src/bin/pgbench/pgbench.c @@ -2649,22 +2649,32 @@ listAvailableScripts(void) fprintf(stderr, "\n"); } +/* return commands for selected builtin script, if unambiguous */ static char * findBuiltin(const char *name, char **desc) { - int i; + int i, found = 0, len = strlen(name); + char *commands = NULL; for (i = 0; i < N_BUILTIN; i++) { - if (strncmp(builtin_script[i].name, name, - strlen(builtin_script[i].name)) == 0) + if (strncmp(builtin_script[i].name, name, len) == 0) { *desc = builtin_script[i].desc; - return builtin_script[i].commands; + commands = builtin_script[i].commands; + found++; } } - fprintf(stderr, "no builtin script found for name \"%s\"\n", name); + if (found == 1) + return commands; + + /* error cases */ + if (found == 0) + fprintf(stderr, "no builtin script found for name \"%s\"\n", name); + else /* found > 1 */ + fprintf(stderr, +"%d builtin scripts found for prefix \"%s\"\n", found, name); listAvailableScripts(); exit(1); } -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] pgbench stats per script & other stuff
Hello Alvaro, I'm not really sure about the fact that we operate on those Stats structs without locking. I see upthread you convinced Michael that it was okay, but is it really? How severe is the damage if two threads happen to collide? For stats shared among threads, when it occurs one data about one transaction is not counted. On the risk side: the collision probability is pretty low because the time to update a value is a "few" cycles, and the time to execute a transaction is typically in ms: I think under 1/10,000,000 data could be lost. On the advantageous side: locking costs significant time thus would impact performance, I think that the measured performance loss because the occasional transaction data is not counted is lower that the performance loss due to systematically locking. So for me this is really a low risk trade-off. [...] It seems a bit funny to have the start_time not be reset when 0.0 is passed, which is almost all the callers. Using a float as a boolean looks pretty odd; is that kosher? Maybe it'd be a good idea to have a separate boolean flag instead? Something like this /* * Initialize a StatsData struct to all zeroes. Use the given * start_time only if reset_start_time, otherwise keep the original * value. */ static void initStats(StatsData *sd, double start_time, bool reset_start_time) { sd->cnt = 0; sd->skipped = 0; initSimpleStats(>latency); initSimpleStats(>lag); /* not necessarily overriden? */ if (reset_start_time) sd->start_time = start_time; } Obviously this would work. I did not think the special case was worth the extra argument. This one has some oddity too, because the second argument is ignored depending on the third. Do as you feel. I renamed a couple of your functionettes, for instance doSimpleStats to addToSimpleStats and appendSimpleStats to mergeSimpleStats. Fine with me. -- Fabien. -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] pgbench stats per script & other stuff
Fabien COELHO wrote: > You know how delighted I am to split patches... Yes, of course, it's the most interesting task in the world. I'm fully aware of that. FWIW I'm going to apply a preliminary commit to pgindent-clean the file before your patches, then apply each patch as pgindent-clean. Otherwise your whitespace style was getting too much on my nerves. > b) refactor statistics collections (per thread, per command, per whatever) >so as to use the same structure everywhere, reducing the CLOC by 115. >this enables the next small patch which can reuse the new functions. I'm not really sure about the fact that we operate on those Stats structs without locking. I see upthread you convinced Michael that it was okay, but is it really? How severe is the damage if two threads happen to collide? Why is this function defined like this? /* * Initialize a StatsData struct to all zeroes, but the given * start_time, except that if it's exactly zero don't change it. */ static void initStats(StatsData *sd, double start_time) { sd->cnt = 0; sd->skipped = 0; initSimpleStats(>latency); initSimpleStats(>lag); /* not necessarily overriden? */ if (start_time) sd->start_time = start_time; } It seems a bit funny to have the start_time not be reset when 0.0 is passed, which is almost all the callers. Using a float as a boolean looks pretty odd; is that kosher? Maybe it'd be a good idea to have a separate boolean flag instead? Something like this /* * Initialize a StatsData struct to all zeroes. Use the given * start_time only if reset_start_time, otherwise keep the original * value. */ static void initStats(StatsData *sd, double start_time, bool reset_start_time) { sd->cnt = 0; sd->skipped = 0; initSimpleStats(>latency); initSimpleStats(>lag); /* not necessarily overriden? */ if (reset_start_time) sd->start_time = start_time; } I renamed a couple of your functionettes, for instance doSimpleStats to addToSimpleStats and appendSimpleStats to mergeSimpleStats. Haven't looked at patches c or d yet. I'm tempted to thrown patch e in with the initial pgindent run. -- Álvaro Herrerahttp://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] pgbench stats per script & other stuff
Fabien COELHO wrote: > a) add -b option for cumulating builtins and rework internal script >management so that builtin and external scripts are managed the >same way. I tweaked this a bit. I found a bug in threadRun: it was reading the commands first, and setting st->use_file later. This led to the wrong commands being read. Some other less interesting changes: * made chooseScript have the logic to react to single existing script; no need to inject ternary operators in each caller to check for that condition. * Added a debug line every time a script is chosen, + if (debug) + fprintf(stderr, "client %d executing script \"%s\"\n", st->id, + sql_script[st->use_file].name); (I'd have liked to have chooseScript itself do it, but it doesn't have the script name handy. Maybe this indicates that the data structures are slightly wrong.) * Added a separate routine to list available scripts; originally that was duplicated in "-b list" and when -b got an invalid script name. * In usage(), I split out the options to select a script instead of mixing them within "Benchmarking options"; also changed wording of parenthical comment, no longer carrying the full list of scripts (a choice which also omitted "-b list" itself): + "\nOptions to select what to run:\n" + " -b, --builtin=NAME add buitin script (use \"-b list\" to display\n" + " available scripts)\n" + " -f, --file=FILENAME add transaction script from FILENAME\n" + " -N, --skip-some-updates skip updates of pgbench_tellers and pgbench_branches\n" + " (same as \"-b simple-update\")\n" + " -S, --select-onlyperform SELECT-only transactions\n" + " (same as \"-b select-only\")\n" I couldn't find a better heading to use there, so that'll have to do unless someone has a better idea. Some other trivial changes. Patch attached. I plan to push this as soon as I'm able. -- Álvaro Herrerahttp://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services diff --git a/doc/src/sgml/ref/pgbench.sgml b/doc/src/sgml/ref/pgbench.sgml index 541d17b..fdd3331 100644 --- a/doc/src/sgml/ref/pgbench.sgml +++ b/doc/src/sgml/ref/pgbench.sgml @@ -261,6 +261,23 @@ pgbench options dbname benchmarking arguments: + + -b scriptname + --builtin scriptname + + +Add the specified builtin script to the list of executed scripts. +Available builtin scripts are: tpcb-like, +simple-update and select-only. +The provided scriptname needs only to be a prefix +of the builtin name, hence simp would be enough to select +simple-update. +With special name list, show the list of builtin scripts +and exit immediately. + + + + -c clients @@ -307,14 +324,13 @@ pgbench options dbname - -f filename - --file=filename + -f filename + --file=filename -Read transaction script from filename. +Add a transaction script read from filename to +the list of executed scripts. See below for details. --N, -S, and -f -are mutually exclusive. @@ -404,10 +420,8 @@ pgbench options dbname --skip-some-updates -Do not update pgbench_tellers and -pgbench_branches. -This will avoid update contention on these tables, but -it makes the test case even less like TPC-B. +Run builtin simple-update script. +Shorthand for -b simple-update. @@ -512,9 +526,9 @@ pgbench options dbname Report the specified scale factor in pgbench's output. With the built-in tests, this is not necessary; the correct scale factor will be detected by counting the number of -rows in the pgbench_branches table. However, when testing -custom benchmarks (-f option), the scale factor -will be reported as 1 unless this option is used. +rows in the pgbench_branches table. +However, when testing only custom benchmarks (-f option), +the scale factor will be reported as 1 unless this option is used. @@ -524,7 +538,8 @@ pgbench options dbname --select-only -Perform select-only transactions instead of TPC-B-like test. +Run built-in select-only script. +Shorthand for -b select-only. @@ -674,7 +689,17 @@ pgbench options dbname What is the Transaction Actually Performed in pgbench? - The default transaction script issues seven commands per transaction: + Pgbench executes test scripts chosen randomly from a specified list. + They include built-in scripts
Re: [HACKERS] pgbench stats per script & other stuff
Fabien COELHO wrote: > a) add -b option for cumulating builtins and rework internal script >management so that builtin and external scripts are managed the >same way. I'm uncomfortable with the prefix-matching aspect of -b. It makes "-b s" ambiguous -- whether it stands for select-only or simple-update is merely a matter of what comes earlier in the table, which doesn't seem reasonable to me. I'd rather have a real way to reject ambiguous cases, or simply accept only complete spellings. This is the guilty party: > +static char * > +find_builtin(const char *name, char **desc) > +{ > + int len = strlen(name), i; > + > + for (i = 0; i < N_BUILTIN; i++) > + { > + if (strncmp(builtin_script[i].name, name, len) == 0) > + { > + *desc = builtin_script[i].desc; > + return builtin_script[i].script; > + } > + } I'm going to change this to use strlen(builtin_script[i].name) instead of "len" here. If you want to implement real non-ambiguous-prefix code (i.e. have "se" for "select-only", but reject "s" as ambiguous) be my guest. -- Álvaro Herrerahttp://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] pgbench stats per script & other stuff
Hello Alvaro, I'm looking at this part of your patch and I think it's far too big to be a simple refactoring. Would you split it up please? You know how delighted I am to split patches... Here is a 5 part ordered patch serie: a) add -b option for cumulating builtins and rework internal script management so that builtin and external scripts are managed the same way. b) refactor statistics collections (per thread, per command, per whatever) so as to use the same structure everywhere, reducing the CLOC by 115. this enables the next small patch which can reuse the new functions. c) add per-script statistics... because Josh asked:-) d) add optional weight to control the relative frequency of scripts. e) minor code cleanup : use bool instead of int where appropriate put together struct fields when they belong together move 2 options at their right position in the list This patch serie conflicts slightly with the "add functions to pgbench" patch which is marked as ready in the CF. The first to make it will mean some conflict resolution for the other. Maybe I would prefer this one serie to go first, if I had any say... -- Fabien.diff --git a/doc/src/sgml/ref/pgbench.sgml b/doc/src/sgml/ref/pgbench.sgml index 541d17b..fdd3331 100644 --- a/doc/src/sgml/ref/pgbench.sgml +++ b/doc/src/sgml/ref/pgbench.sgml @@ -261,6 +261,23 @@ pgbench options dbname benchmarking arguments: + + -b scriptname + --builtin scriptname + + +Add the specified builtin script to the list of executed scripts. +Available builtin scripts are: tpcb-like, +simple-update and select-only. +The provided scriptname needs only to be a prefix +of the builtin name, hence simp would be enough to select +simple-update. +With special name list, show the list of builtin scripts +and exit immediately. + + + + -c clients @@ -307,14 +324,13 @@ pgbench options dbname - -f filename - --file=filename + -f filename + --file=filename -Read transaction script from filename. +Add a transaction script read from filename to +the list of executed scripts. See below for details. --N, -S, and -f -are mutually exclusive. @@ -404,10 +420,8 @@ pgbench options dbname --skip-some-updates -Do not update pgbench_tellers and -pgbench_branches. -This will avoid update contention on these tables, but -it makes the test case even less like TPC-B. +Run builtin simple-update script. +Shorthand for -b simple-update. @@ -512,9 +526,9 @@ pgbench options dbname Report the specified scale factor in pgbench's output. With the built-in tests, this is not necessary; the correct scale factor will be detected by counting the number of -rows in the pgbench_branches table. However, when testing -custom benchmarks (-f option), the scale factor -will be reported as 1 unless this option is used. +rows in the pgbench_branches table. +However, when testing only custom benchmarks (-f option), +the scale factor will be reported as 1 unless this option is used. @@ -524,7 +538,8 @@ pgbench options dbname --select-only -Perform select-only transactions instead of TPC-B-like test. +Run built-in select-only script. +Shorthand for -b select-only. @@ -674,7 +689,17 @@ pgbench options dbname What is the Transaction Actually Performed in pgbench? - The default transaction script issues seven commands per transaction: + Pgbench executes test scripts chosen randomly from a specified list. + They include built-in scripts with -b and + user-provided custom scripts with -f. + + + + The default builtin transaction script (also invoked with -b tpcb-like) + issues seven commands per transaction over randomly chosen aid, + tid, bid and balance. + The scenario is inspired by the TPC-B benchmark, but is not actually TPC-B, + hence the name. @@ -688,9 +713,15 @@ pgbench options dbname - If you specify -N, steps 4 and 5 aren't included in the - transaction. If you specify -S, only the SELECT is - issued. + If you select the simple-update builtin (also -N), + steps 4 and 5 aren't included in the transaction. + This will avoid update contention on these tables, but + it makes the test case even less like TPC-B. + + + + If you select the select-only builtin (also -S), + only the SELECT is issued. @@ -702,10 +733,7 @@ pgbench options dbname benchmark scenarios by replacing the default transaction script (described above) with a transaction script read from a file
Re: [HACKERS] pgbench stats per script & other stuff
Hello Alvaro, Here is a two part v12, which: part a (refactoring of scripts and their stats): - fix option checks (-i alone) - s/repleacable/replaceable/ in doc - keep small description in doc and help for -S & -N - fix 2 comments for pg style - show builtin list if not found I'm looking at this part of your patch and I think it's far too big to be a simple refactoring. Would you split it up please? I think the StatsData / SimpleStat addition should be one patch; then there's the -b changes. Then there may (or may not) be a bunch of other minor cleanups, not sure. I'm willing to commit these patches if I can easily review what they do, which I cannot with the current state. Hmmm. ISTM that other people already reviewed it. I can try to separate (again) some stuff, but there will be no miracle. The overdue refactoring is because pgbench collects statistics at various levels, and each time this is done in a different way. Cleaning this requires to touch the stuff in many places, which means a "big" patch, although ISTM a straightforward one, but this already the case with this one. Please pgindent; make sure to add /*--- here to avoid pgindent mangling the comment: Ok. part b (weight) - check that the weight is an int This part looks okay to me. Minor nitpick, + int i = 0, w = 0, wc = (int) getrand(thread, 0, total_weight - 1); should be three lines, not one. Ok. Also the @W part in the --help output should be in brackets, as FILE[@W], right? Why not. -- Fabien. -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] pgbench stats per script & other stuff
Fabien COELHO wrote: > > Here is a two part v12, which: > > part a (refactoring of scripts and their stats): > - fix option checks (-i alone) > - s/repleacable/replaceable/ in doc > - keep small description in doc and help for -S & -N > - fix 2 comments for pg style > - show builtin list if not found I'm looking at this part of your patch and I think it's far too big to be a simple refactoring. Would you split it up please? I think the StatsData / SimpleStat addition should be one patch; then there's the -b changes. Then there may (or may not) be a bunch of other minor cleanups, not sure. I'm willing to commit these patches if I can easily review what they do, which I cannot with the current state. Please pgindent; make sure to add /*--- here to avoid pgindent mangling the comment: if (pg_strcasecmp(my_commands->argv[0], "setrandom") == 0) { /* * parsing: > part b (weight) > - check that the weight is an int This part looks okay to me. Minor nitpick, + int i = 0, w = 0, wc = (int) getrand(thread, 0, total_weight - 1); should be three lines, not one. Also the @W part in the --help output should be in brackets, as FILE[@W], right? -- Álvaro Herrerahttp://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] pgbench stats per script & other stuff
Hello Michaël, And then I also had a look at src/port/snprintf.c, where things get actually weird when no transactions are run for a script (emulated with 2 scripts, one with @1 and the second with @1): - 0 transactions (0.0% of total, tps = 0.00) - latency average = -1.#IO ms - latency stddev = -1.#IO ms And it seems that this is a bug in fmtfloat() because it does not handle nan values correctly. Others, any thoughts about that? It is possible to address things within your patch by using isnan() and print another message but I think that we had better patch snprintf.c if my analysis is right. FWIW, I just had a closer look at this portion and I arrived at the conclusion that sprintf implementation on Windows is just broken as it is not able to handle appropriately inf or nan as exceptions. fmtfloat@src/port/snprintf.c relies on the system's implementation of sprintf to handle those exceptions, however even directly calling sprintf results in the same weird output, inf showing up as "1.#IO" and nan as "-1.#IO". Anyone, feel free to disagree if I am missing something. I have no opinion any about M$ implementation of double prettyprinting, but I agree that "-1.#IO" looks strange. WWW seems to say that "-1.INF" and "-1.IND" are the "normal" way for windows to say infinity or not a number. Well, if someone there thought it look good, I cannot help it. Still, it would be cool to have better error message when there is no value to show up to the user, like "no latency average" or "undefined latency average". That would be more elegant, and the patches proposed still lack that. Hmmm. I do not buy that for several reasons: For --progress style reporting you want NaN or whatever, because the output could be processed further unix-style from a pipe (grep/cut/...). This is also true for the final report. I would not want to change the output organisations for some special values, I would just like to get the value whatever it is, "NaN" or "Infinity" or even "-1.IND", so that the pipe commands would work. Also, for the final report, it seems to me overkill to try to work around cases when pgbench does not run any transactions, which is basically not often, as the point is to run many transactions. Finally this behavior already exists, the patch does not change anything AFAICS, and it is not its purpose. So I would suggest to keep it that way. -- Fabien. -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] pgbench stats per script & other stuff
On Wed, Dec 16, 2015 at 4:09 PM, Michael Paquierwrote: > Yeah, that's actually fine. I just had a look at Windows stuff, and > things seem to be correct on this side for double: > https://msdn.microsoft.com/en-us/library/aa691373%28v=vs.71%29.aspx > And then I also had a look at src/port/snprintf.c, where things get > actually weird when no transactions are run for a script (emulated > with 2 scripts, one with @1 and the second with @1): > - 0 transactions (0.0% of total, tps = 0.00) > - latency average = -1.#IO ms > - latency stddev = -1.#IO ms > And it seems that this is a bug in fmtfloat() because it does not > handle nan values correctly. Others, any thoughts about that? > It is possible to address things within your patch by using isnan() > and print another message but I think that we had better patch > snprintf.c if my analysis is right. FWIW, I just had a closer look at this portion and I arrived at the conclusion that sprintf implementation on Windows is just broken as it is not able to handle appropriately inf or nan as exceptions. fmtfloat@src/port/snprintf.c relies on the system's implementation of sprintf to handle those exceptions, however even directly calling sprintf results in the same weird output, inf showing up as "1.#IO" and nan as "-1.#IO". Anyone, feel free to disagree if I am missing something. Still, it would be cool to have better error message when there is no value to show up to the user, like "no latency average" or "undefined latency average". That would be more elegant, and the patches proposed still lack that. -- Michael -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] pgbench stats per script & other stuff
Here is a two part v12, which: part a (refactoring of scripts and their stats): - fix option checks (-i alone) - s/repleacable/replaceable/ in doc - keep small description in doc and help for -S & -N - fix 2 comments for pg style - show builtin list if not found part b (weight) - check that the weight is an int -- Fabien.diff --git a/doc/src/sgml/ref/pgbench.sgml b/doc/src/sgml/ref/pgbench.sgml index 0ac40f1..62ce496 100644 --- a/doc/src/sgml/ref/pgbench.sgml +++ b/doc/src/sgml/ref/pgbench.sgml @@ -261,6 +261,23 @@ pgbench options dbname benchmarking arguments: + + -b scriptname + --builtin scriptname + + +Add the specified builtin script to the list of executed scripts. +Available builtin scripts are: tpcb-like, +simple-update and select-only. +The provided scriptname needs only to be a prefix +of the builtin name, hence simp would be enough to select +simple-update. +With special name list, show the list of builtin scripts +and exit immediately. + + + + -c clients @@ -307,14 +324,13 @@ pgbench options dbname - -f filename - --file=filename + -f filename + --file=filename -Read transaction script from filename. +Add a transaction script read from filename to +the list of executed scripts. See below for details. --N, -S, and -f -are mutually exclusive. @@ -404,10 +420,8 @@ pgbench options dbname --skip-some-updates -Do not update pgbench_tellers and -pgbench_branches. -This will avoid update contention on these tables, but -it makes the test case even less like TPC-B. +Run builtin simple-update script. +Shorthand for -b simple-update. @@ -512,9 +526,9 @@ pgbench options dbname Report the specified scale factor in pgbench's output. With the built-in tests, this is not necessary; the correct scale factor will be detected by counting the number of -rows in the pgbench_branches table. However, when testing -custom benchmarks (-f option), the scale factor -will be reported as 1 unless this option is used. +rows in the pgbench_branches table. +However, when testing only custom benchmarks (-f option), +the scale factor will be reported as 1 unless this option is used. @@ -524,7 +538,8 @@ pgbench options dbname --select-only -Perform select-only transactions instead of TPC-B-like test. +Run built-in select-only script. +Shorthand for -b select-only. @@ -674,7 +689,17 @@ pgbench options dbname What is the Transaction Actually Performed in pgbench? - The default transaction script issues seven commands per transaction: + Pgbench executes test scripts chosen randomly from a specified list. + They include built-in scripts with -b and + user-provided custom scripts with -f. + + + + The default builtin transaction script (also invoked with -b tpcb-like) + issues seven commands per transaction over randomly chosen aid, + tid, bid and balance. + The scenario is inspired by the TPC-B benchmark, but is not actually TPC-B, + hence the name. @@ -688,9 +713,15 @@ pgbench options dbname - If you specify -N, steps 4 and 5 aren't included in the - transaction. If you specify -S, only the SELECT is - issued. + If you select the simple-update builtin (also -N), + steps 4 and 5 aren't included in the transaction. + This will avoid update contention on these tables, but + it makes the test case even less like TPC-B. + + + + If you select the select-only builtin (also -S), + only the SELECT is issued. @@ -702,10 +733,7 @@ pgbench options dbname benchmark scenarios by replacing the default transaction script (described above) with a transaction script read from a file (-f option). In this case a transaction - counts as one execution of a script file. You can even specify - multiple scripts (multiple -f options), in which - case a random one of the scripts is chosen each time a client session - starts a new transaction. + counts as one execution of a script file. diff --git a/src/bin/pgbench/pgbench.c b/src/bin/pgbench/pgbench.c index f2d435b..1e0c7cc 100644 --- a/src/bin/pgbench/pgbench.c +++ b/src/bin/pgbench/pgbench.c @@ -164,12 +164,11 @@ bool use_log; /* log transaction latencies to a file */ bool use_quiet; /* quiet logging onto stderr */ int agg_interval; /* log aggregates instead of individual * transactions */ +boolper_script_stats = false; /* whether to collect stats per script */ int progress = 0; /* thread progress report every
Re: [HACKERS] pgbench stats per script & other stuff
It seems also that it would be a good idea to split the patch into two parts: 1) Refactor the code so as the existing test scripts are put under the same umbrella with addScript, adding at the same time the new option -b. 2) Add the weight facility and its related statistics. Sigh. The patch & documentation are probably not independent, so that would make two dependent patches, probably. I am not really saying so, it seems just that doing the refactoring (with its related docs), and then add the extension for the weight (with its docs) is more natural than doing both things at the same time. Ok. I can separate the refactoring (scripts & stats) and the weight stuff on top of the refactoring. -- Fabien. -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] pgbench stats per script & other stuff
"sum" is a double so count is converted to 0.0, 0.0/0.0 == NaN, hence the comment. PG code usually avoids that, and I recall static analyze tools type coverity complaining that this may lead to undefined behavior. While I agree that this would lead to NaN... Hmmm. In this case that is what is actually wanted. If there is no transaction, the tps or average latency or whatever is "NaN", I cannot help it, and IEEE 754 allow that. So in this case the tool is wrong if it complains, or at least we are right to ignore the warning. Maybe there is some special comment to say "ignore this warning on the next line" if it occurs, if this is an issue. -- Fabien. -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] pgbench stats per script & other stuff
On Tue, Dec 15, 2015 at 8:41 PM, Fabien COELHOwrote: >> PG code usually avoids that, and I recall static analyze tools type >> coverity complaining that this may lead to undefined behavior. While I >> agree that this would lead to NaN... > > > Hmmm. In this case that is what is actually wanted. If there is no > transaction, the tps or average latency or whatever is "NaN", I cannot help > it, and IEEE 754 allow that. So in this case the tool is wrong if it > complains, or at least we are right to ignore the warning. Maybe there is > some special comment to say "ignore this warning on the next line" if it > occurs, if this is an issue. Yeah, that's actually fine. I just had a look at Windows stuff, and things seem to be correct on this side for double: https://msdn.microsoft.com/en-us/library/aa691373%28v=vs.71%29.aspx And then I also had a look at src/port/snprintf.c, where things get actually weird when no transactions are run for a script (emulated with 2 scripts, one with @1 and the second with @1): - 0 transactions (0.0% of total, tps = 0.00) - latency average = -1.#IO ms - latency stddev = -1.#IO ms And it seems that this is a bug in fmtfloat() because it does not handle nan values correctly. Others, any thoughts about that? It is possible to address things within your patch by using isnan() and print another message but I think that we had better patch snprintf.c if my analysis is right. Oh, and actually when trying to compile the patch on Windows things are failing because int64_t is undefined :) After switching to int64 things worked better. -- Michael -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] pgbench stats per script & other stuff
On Tue, Dec 15, 2015 at 5:53 AM, Fabien COELHOwrote: > I wrote: >> Why would the likelyhood of an issue be small here? > > The time to update one stat (<< 100 cycles ?) to the time to do a > transaction with the database (typically Y ms), so the likelyhood of two > thread to update the very same stat at the same time is probably under > 1/10,000,000. Even if it occurs, then one stat is slightly false, no big > deal. So I think the potential slowdown induced by a mutex is not worth it, > so I a comment instead. OK, got it and agreed. >> + /* print NaN if no transactions where executed */ >> + double latency = ss->sum / ss->count; >> This does not look like a good idea, ss->count can be 0. > > > "sum" is a double so count is converted to 0.0, 0.0/0.0 == NaN, hence the > comment. PG code usually avoids that, and I recall static analyze tools type coverity complaining that this may lead to undefined behavior. While I agree that this would lead to NaN... >> It seems also that it would be a good idea to split the patch into two >> parts: >> 1) Refactor the code so as the existing test scripts are put under the >> same umbrella with addScript, adding at the same time the new option >> -b. >> 2) Add the weight facility and its related statistics. > > > Sigh. The patch & documentation are probably not independent, so that would > make two dependent patches, probably. I am not really saying so, it seems just that doing the refactoring (with its related docs), and then add the extension for the weight (with its docs) is more natural than doing both things at the same time. -- Michael -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] pgbench stats per script & other stuff
-Do not update pgbench_tellers and -pgbench_branches. -This will avoid update contention on these tables, but -it makes the test case even less like TPC-B. +Shorthand for -b simple-update@1. I don't think it is a good idea to remove entirely the description of what the default scenarios can do. The description would be better at the bottom in some with a list of each default test and what to expect from them. I'm trying to avoid to have the same explanation twice, otherwise someone is bound to complain. +/* data structure to hold various statistics. + * it is used for interval statistics as well as file statistics. */ Nitpick: this is not a comment formatted the Postgres-way. Indeed. This is surprisingly broken: $ pgbench -i some of the specified options cannot be used in initialization (-i) mode Hmmm. Any file name or path including "@" will fail strangely: $ pgbench -f "t...@1.sql" could not open file "test": No such file or directory empty commands for test Perhaps instead of failing we should warn the user and enforce the weight to be set at 1? Yep, I can have a look at that. $ pgbench -b foo no builtin found for "foo" This is not really helpful for the user, I think that the list of potential options should be listed as an error hint. Yep. - " -S, --select-onlyperform SELECT-only transactions\n" + " -S, --select-onlysame as \"-b select-only@1\"\n" It is good to mention that there is an equivalent, but I think that the description should be kept. The reason replace it is to keep the help message short column-wise. + /* although a mutex would make sense, the likelyhood of an issue +* is small and these are only stats which may be slightly false +*/ + doSimpleStats(& commands[st->state]->stats, + INSTR_TIME_GET_DOUBLE(now) - Why would the likelyhood of an issue be small here? The time to update one stat (<< 100 cycles ?) to the time to do a transaction with the database (typically Y ms), so the likelyhood of two thread to update the very same stat at the same time is probably under 1/10,000,000. Even if it occurs, then one stat is slightly false, no big deal. So I think the potential slowdown induced by a mutex is not worth it, so I a comment instead. + /* print NaN if no transactions where executed */ + double latency = ss->sum / ss->count; This does not look like a good idea, ss->count can be 0. "sum" is a double so count is converted to 0.0, 0.0/0.0 == NaN, hence the comment. It seems also that it would be a good idea to split the patch into two parts: 1) Refactor the code so as the existing test scripts are put under the same umbrella with addScript, adding at the same time the new option -b. 2) Add the weight facility and its related statistics. Sigh. The patch & documentation are probably not independent, so that would make two dependent patches, probably. -- Fabien. -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] pgbench stats per script & other stuff
On Sat, Oct 3, 2015 at 3:11 PM, Fabien COELHOwrote: > >> Here is a v10, which is a rebase because of the "--progress-timestamp" >> option addition. > > > Here is a v11, which is a rebase after some recent changes committed to > pgbench. +The provided scriptname needs only to be a prefix s/repleacable/replaceable, in short I think that documentation compilation would fail. -Do not update pgbench_tellers and -pgbench_branches. -This will avoid update contention on these tables, but -it makes the test case even less like TPC-B. +Shorthand for -b simple-update@1. I don't think it is a good idea to remove entirely the description of what the default scenarios can do. The description would be better at the bottom in some with a list of each default test and what to expect from them. +/* data structure to hold various statistics. + * it is used for interval statistics as well as file statistics. */ Nitpick: this is not a comment formatted the Postgres-way. This is surprisingly broken: $ pgbench -i some of the specified options cannot be used in initialization (-i) mode Any file name or path including "@" will fail strangely: $ pgbench -f "t...@1.sql" could not open file "test": No such file or directory empty commands for test Perhaps instead of failing we should warn the user and enforce the weight to be set at 1? $ pgbench -b foo no builtin found for "foo" This is not really helpful for the user, I think that the list of potential options should be listed as an error hint. - " -N, --skip-some-updates skip updates of pgbench_tellers and pgbench_branches\n" + " -N, --skip-some-updates same as \"-b simple-update@1\"\n" " -P, --progress=NUM show thread progress report every NUM seconds\n" " -r, --report-latencies report average latency per command\n" " -R, --rate=NUM target rate in transactions per second\n" " -s, --scale=NUM report this scale factor in output\n" - " -S, --select-onlyperform SELECT-only transactions\n" + " -S, --select-onlysame as \"-b select-only@1\"\n" It is good to mention that there is an equivalent, but I think that the description should be kept. + /* although a mutex would make sense, the likelyhood of an issue +* is small and these are only stats which may be slightly false +*/ + doSimpleStats(& commands[st->state]->stats, + INSTR_TIME_GET_DOUBLE(now) - + INSTR_TIME_GET_DOUBLE(st->stmt_begin)); Why would the likelyhood of an issue be small here? + /* print NaN if no transactions where executed */ + double latency = ss->sum / ss->count; This does not look like a good idea, ss->count can be 0. It seems also that it would be a good idea to split the patch into two parts: 1) Refactor the code so as the existing test scripts are put under the same umbrella with addScript, adding at the same time the new option -b. 2) Add the weight facility and its related statistics. The patch having some issues, I am marking it as returned with feedback. It would be nice to see a new version for next CF. Regards, -- Michael -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] pgbench stats per script & other stuff
Here is a v10, which is a rebase because of the "--progress-timestamp" option addition. Here is a v11, which is a rebase after some recent changes committed to pgbench. -- Fabien.diff --git a/doc/src/sgml/ref/pgbench.sgml b/doc/src/sgml/ref/pgbench.sgml index 0ac40f1..e3a90e5 100644 --- a/doc/src/sgml/ref/pgbench.sgml +++ b/doc/src/sgml/ref/pgbench.sgml @@ -261,6 +261,25 @@ pgbench options dbname benchmarking arguments: + + -b scriptname[@weight] + --builtin scriptname[@weight] + + +Add the specified builtin script to the list of executed scripts. +An optional integer weight after @ allows to adjust the +probability of drawing the test. +Available builtin scripts are: tpcb-like, +simple-update and select-only. +The provided scriptname needs only to be a prefix +of the builtin name, hence simp would be enough to select +simple-update. +With special name list, show the list of builtin scripts +and exit immediately. + + + + -c clients @@ -307,14 +326,15 @@ pgbench options dbname - -f filename - --file=filename + -f filename[@weight] + --file=filename[@weight] -Read transaction script from filename. +Add a transaction script read from filename to +the list of executed scripts. +An optional integer weight after @ allows to adjust the +probability of drawing the test. See below for details. --N, -S, and -f -are mutually exclusive. @@ -404,10 +424,7 @@ pgbench options dbname --skip-some-updates -Do not update pgbench_tellers and -pgbench_branches. -This will avoid update contention on these tables, but -it makes the test case even less like TPC-B. +Shorthand for -b simple-update@1. @@ -512,9 +529,9 @@ pgbench options dbname Report the specified scale factor in pgbench's output. With the built-in tests, this is not necessary; the correct scale factor will be detected by counting the number of -rows in the pgbench_branches table. However, when testing -custom benchmarks (-f option), the scale factor -will be reported as 1 unless this option is used. +rows in the pgbench_branches table. +However, when testing only custom benchmarks (-f option), +the scale factor will be reported as 1 unless this option is used. @@ -524,7 +541,7 @@ pgbench options dbname --select-only -Perform select-only transactions instead of TPC-B-like test. +Shorthand for -b select-only@1. @@ -674,7 +691,20 @@ pgbench options dbname What is the Transaction Actually Performed in pgbench? - The default transaction script issues seven commands per transaction: + Pgbench executes test scripts chosen randomly from a specified list. + They include built-in scripts with -b and + user-provided custom scripts with -f. + Each script may be given a relative weight specified after a + @ so as to change its drawing probability. + The default weight is 1. + + + + The default builtin transaction script (also invoked with -b tpcb-like) + issues seven commands per transaction over randomly chosen aid, + tid, bid and balance. + The scenario is inspired by the TPC-B benchmark, but is not actually TPC-B, + hence the name. @@ -688,9 +718,15 @@ pgbench options dbname - If you specify -N, steps 4 and 5 aren't included in the - transaction. If you specify -S, only the SELECT is - issued. + If you select the simple-update builtin (also -N), + steps 4 and 5 aren't included in the transaction. + This will avoid update contention on these tables, but + it makes the test case even less like TPC-B. + + + + If you select the select-only builtin (also -S), + only the SELECT is issued. @@ -702,10 +738,7 @@ pgbench options dbname benchmark scenarios by replacing the default transaction script (described above) with a transaction script read from a file (-f option). In this case a transaction - counts as one execution of a script file. You can even specify - multiple scripts (multiple -f options), in which - case a random one of the scripts is chosen each time a client session - starts a new transaction. + counts as one execution of a script file. diff --git a/src/bin/pgbench/pgbench.c b/src/bin/pgbench/pgbench.c index f2d435b..0e56ed7 100644 --- a/src/bin/pgbench/pgbench.c +++ b/src/bin/pgbench/pgbench.c @@ -164,12 +164,11 @@ bool use_log; /* log transaction latencies to a file */ bool use_quiet; /* quiet logging onto stderr */ int agg_interval; /* log aggregates instead of individual
Re: [HACKERS] pgbench stats per script & other stuff
On Sat, Sep 26, 2015 at 3:27 AM, Fabien COELHOwrote: > Here is a v10, which is a rebase because of the "--progress-timestamp" > option addition. I do not see it attached. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] pgbench stats per script & other stuff
Here is a v10, which is a rebase because of the "--progress-timestamp" option addition. I do not see it attached. Indeed. Here it is. -- Fabien.diff --git a/doc/src/sgml/ref/pgbench.sgml b/doc/src/sgml/ref/pgbench.sgml index 0ac40f1..e3a90e5 100644 --- a/doc/src/sgml/ref/pgbench.sgml +++ b/doc/src/sgml/ref/pgbench.sgml @@ -261,6 +261,25 @@ pgbench options dbname benchmarking arguments: + + -b scriptname[@weight] + --builtin scriptname[@weight] + + +Add the specified builtin script to the list of executed scripts. +An optional integer weight after @ allows to adjust the +probability of drawing the test. +Available builtin scripts are: tpcb-like, +simple-update and select-only. +The provided scriptname needs only to be a prefix +of the builtin name, hence simp would be enough to select +simple-update. +With special name list, show the list of builtin scripts +and exit immediately. + + + + -c clients @@ -307,14 +326,15 @@ pgbench options dbname - -f filename - --file=filename + -f filename[@weight] + --file=filename[@weight] -Read transaction script from filename. +Add a transaction script read from filename to +the list of executed scripts. +An optional integer weight after @ allows to adjust the +probability of drawing the test. See below for details. --N, -S, and -f -are mutually exclusive. @@ -404,10 +424,7 @@ pgbench options dbname --skip-some-updates -Do not update pgbench_tellers and -pgbench_branches. -This will avoid update contention on these tables, but -it makes the test case even less like TPC-B. +Shorthand for -b simple-update@1. @@ -512,9 +529,9 @@ pgbench options dbname Report the specified scale factor in pgbench's output. With the built-in tests, this is not necessary; the correct scale factor will be detected by counting the number of -rows in the pgbench_branches table. However, when testing -custom benchmarks (-f option), the scale factor -will be reported as 1 unless this option is used. +rows in the pgbench_branches table. +However, when testing only custom benchmarks (-f option), +the scale factor will be reported as 1 unless this option is used. @@ -524,7 +541,7 @@ pgbench options dbname --select-only -Perform select-only transactions instead of TPC-B-like test. +Shorthand for -b select-only@1. @@ -674,7 +691,20 @@ pgbench options dbname What is the Transaction Actually Performed in pgbench? - The default transaction script issues seven commands per transaction: + Pgbench executes test scripts chosen randomly from a specified list. + They include built-in scripts with -b and + user-provided custom scripts with -f. + Each script may be given a relative weight specified after a + @ so as to change its drawing probability. + The default weight is 1. + + + + The default builtin transaction script (also invoked with -b tpcb-like) + issues seven commands per transaction over randomly chosen aid, + tid, bid and balance. + The scenario is inspired by the TPC-B benchmark, but is not actually TPC-B, + hence the name. @@ -688,9 +718,15 @@ pgbench options dbname - If you specify -N, steps 4 and 5 aren't included in the - transaction. If you specify -S, only the SELECT is - issued. + If you select the simple-update builtin (also -N), + steps 4 and 5 aren't included in the transaction. + This will avoid update contention on these tables, but + it makes the test case even less like TPC-B. + + + + If you select the select-only builtin (also -S), + only the SELECT is issued. @@ -702,10 +738,7 @@ pgbench options dbname benchmark scenarios by replacing the default transaction script (described above) with a transaction script read from a file (-f option). In this case a transaction - counts as one execution of a script file. You can even specify - multiple scripts (multiple -f options), in which - case a random one of the scripts is chosen each time a client session - starts a new transaction. + counts as one execution of a script file. diff --git a/src/bin/pgbench/pgbench.c b/src/bin/pgbench/pgbench.c index 6ae1b86..0e56ed7 100644 --- a/src/bin/pgbench/pgbench.c +++ b/src/bin/pgbench/pgbench.c @@ -164,12 +164,11 @@ bool use_log; /* log transaction latencies to a file */ bool use_quiet; /* quiet logging onto stderr */ int agg_interval; /* log aggregates instead of individual * transactions */ +bool
Re: [HACKERS] pgbench stats per script & other stuff
Here is a v10, which is a rebase because of the "--progress-timestamp" option addition. It also include the fix for the tps without connection computation and some minor code simplification, so it is redundant with this bug fix patch: https://commitfest.postgresql.org/7/378/ -- Fabien. -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] pgbench stats per script & other stuff
On Thu, Sep 3, 2015 at 3:26 AM, Fabien COELHOwrote: > I've left out: > - removing -N/-S upward compatibility shorthands >but I will not cry if they are removed I see no particular merit to breaking backward compatibility here. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] pgbench stats per script & other stuff
I've left out: - removing -N/-S upward compatibility shorthands but I will not cry if they are removed I see no particular merit to breaking backward compatibility here. I agree, but I would not fight for this. I think there is a good argument *NOT* to add more if new builtin scripts are added later. Currently the builtin script can be selected with "-b t" (t for tcpb-like), "-b s" (s for simple-update) and "-b se" (se for select-only). I've reused their current names for the option selector, and it takes the first matching prefix. -- Fabien. -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] pgbench stats per script & other stuff
On 2015-09-02 20:20:45 +0200, Fabien COELHO wrote: > >>+static SQLScript sql_script[MAX_SCRIPTS]; > >> > >>+static struct { > >>+ char *name; /* very short name for -b ...*/ > >>+ char *desc; /* short description */ > >>+ char *script; /* actual pgbench script */ > >>+} builtin_script[] > > > >Can't we put these in the same array? > > I do not understand. Right now builtins and user defined scripts are stored in different data structures. I'd rather see them in the same. Greetings, Andres Freund -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] pgbench stats per script & other stuff
Hello Anders, This v9 : - add "-b list" to show the list of builtins - remove the explicit --per-scripts-stats option, which is instead automatically set when several scripts are run or with per-command latencies (-r) - count scripts from 1 instead of 0 in the output I've left out: - removing -N/-S upward compatibility shorthands but I will not cry if they are removed - requiring percents instead of integer weights, because it is too constrained - your "array" remark as I did not understood it Thanks to the restructuring and sharing of stats code, the patch does not change the loc count, although a few features are added. -- Fabien.diff --git a/doc/src/sgml/ref/pgbench.sgml b/doc/src/sgml/ref/pgbench.sgml index 2517a3a..3edd65a 100644 --- a/doc/src/sgml/ref/pgbench.sgml +++ b/doc/src/sgml/ref/pgbench.sgml @@ -261,6 +261,25 @@ pgbench options dbname benchmarking arguments: + + -b scriptname[@weight] + --builtin scriptname[@weight] + + +Add the specified builtin script to the list of executed scripts. +An optional integer weight after @ allows to adjust the +probability of drawing the test. +Available builtin scripts are: tpcb-like, +simple-update and select-only. +The provided scriptname needs only to be a prefix +of the builtin name, hence simp would be enough to select +simple-update. +With special name list, show the list of builtin scripts +and exit immediately. + + + + -c clients @@ -307,14 +326,15 @@ pgbench options dbname - -f filename - --file=filename + -f filename[@weight] + --file=filename[@weight] -Read transaction script from filename. +Add a transaction script read from filename to +the list of executed scripts. +An optional integer weight after @ allows to adjust the +probability of drawing the test. See below for details. --N, -S, and -f -are mutually exclusive. @@ -404,10 +424,7 @@ pgbench options dbname --skip-some-updates -Do not update pgbench_tellers and -pgbench_branches. -This will avoid update contention on these tables, but -it makes the test case even less like TPC-B. +Shorthand for -b simple-update@1. @@ -499,9 +516,9 @@ pgbench options dbname Report the specified scale factor in pgbench's output. With the built-in tests, this is not necessary; the correct scale factor will be detected by counting the number of -rows in the pgbench_branches table. However, when testing -custom benchmarks (-f option), the scale factor -will be reported as 1 unless this option is used. +rows in the pgbench_branches table. +However, when testing only custom benchmarks (-f option), +the scale factor will be reported as 1 unless this option is used. @@ -511,7 +528,7 @@ pgbench options dbname --select-only -Perform select-only transactions instead of TPC-B-like test. +Shorthand for -b select-only@1. @@ -661,7 +678,20 @@ pgbench options dbname What is the Transaction Actually Performed in pgbench? - The default transaction script issues seven commands per transaction: + Pgbench executes test scripts chosen randomly from a specified list. + They include built-in scripts with -b and + user-provided custom scripts with -f. + Each script may be given a relative weight specified after a + @ so as to change its drawing probability. + The default weight is 1. + + + + The default builtin transaction script (also invoked with -b tpcb-like) + issues seven commands per transaction over randomly chosen aid, + tid, bid and balance. + The scenario is inspired by the TPC-B benchmark, but is not actually TPC-B, + hence the name. @@ -675,9 +705,15 @@ pgbench options dbname - If you specify -N, steps 4 and 5 aren't included in the - transaction. If you specify -S, only the SELECT is - issued. + If you select the simple-update builtin (also -N), + steps 4 and 5 aren't included in the transaction. + This will avoid update contention on these tables, but + it makes the test case even less like TPC-B. + + + + If you select the select-only builtin (also -S), + only the SELECT is issued. @@ -689,10 +725,7 @@ pgbench options dbname benchmark scenarios by replacing the default transaction script (described above) with a transaction script read from a file (-f option). In this case a transaction - counts as one execution of a script file. You can even specify - multiple scripts (multiple -f options), in which - case a random one of the scripts
Re: [HACKERS] pgbench stats per script & other stuff
Right now builtins and user defined scripts are stored in different data structures. I'd rather see them in the same. They already are in the same array (sql_script) when pre-processed and executed, there is no distinction beyond initialization. The builtin_script array contains the equivalent of the external custom file (name, lines of code), so that they can be processed by process_builtin and addScript to build the SQLScript ready for execution, while for external files it relies on process_file and addScript. -- Fabien. -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] pgbench stats per script & other stuff
On 2015-07-30 18:03:56 +0200, Fabien COELHO wrote: > > >v6 is just a rebase after a bug fix by Andres Freund. > > > >Also a small question: The patch currently displays pgbench scripts > >starting numbering at 0. Probably a little too geek... should start at 1? > > v7 is a rebase after another small bug fix in pgbench. > > -- > Fabien. > diff --git a/doc/src/sgml/ref/pgbench.sgml b/doc/src/sgml/ref/pgbench.sgml > index 2517a3a..99670d4 100644 > --- a/doc/src/sgml/ref/pgbench.sgml > +++ b/doc/src/sgml/ref/pgbench.sgml > @@ -261,6 +261,23 @@ pgbench options > dbname > benchmarking arguments: > > > + > + -b scriptname[@weight] > + --builtin scriptname[@weight] > + > + > +Add the specified builtin script to the list of executed scripts. > +An optional integer weight after @ allows to adjust the > +probability of drawing the test. > +Available builtin scripts are: tpcb-like, > +simple-update and select-only. > +The provided scriptname needs only to be a prefix > +of the builtin name, hence simp would be enough to select > +simple-update. > + > + > + Maybe add --builtin list to show them? > @@ -404,10 +422,7 @@ pgbench options > dbname >--skip-some-updates > > > -Do not update pgbench_tellers and > -pgbench_branches. > -This will avoid update contention on these tables, but > -it makes the test case even less like TPC-B. > +Shorthand for -b simple-update@1. > > > > @@ -511,7 +526,7 @@ pgbench options > dbname >--select-only > > > -Perform select-only transactions instead of TPC-B-like test. > +Shorthand for -b select-only@1. > > > I'm a bit inclined to remove these options. > > - The default transaction script issues seven commands per transaction: > + Pgbench executes test scripts chosen randomly from a specified list. > + They include built-in scripts with -b and > + user-provided custom scripts with -f. > + Each script may be given a relative weight specified after a > + @ so as to change its drawing probability. > + The default weight is 1. > + I'm wondering if percentages instead of weights would be a better idea. That'd mean you'd be forced to be more careful when adding another script (having to adjust the percentages of other scripts) but arguably that's a good thing? > +static SQLScript sql_script[MAX_SCRIPTS]; > +static struct { > + char *name; /* very short name for -b ...*/ > + char *desc; /* short description */ > + char *script; /* actual pgbench script */ > +} builtin_script[] Can't we put these in the same array? > + printf("transaction type: %s\n", > +num_scripts == 1? sql_script[0].name: "multiple scripts"); Seems like it'd be more useful to simply always list the scripts + weights here. Greetings, Andres Freund -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] pgbench stats per script & other stuff
On Wed, Sep 2, 2015 at 5:55 PM, Andres Freundwrote: > On 2015-09-02 14:36:51 -0400, Robert Haas wrote: >> On Wed, Sep 2, 2015 at 2:20 PM, Fabien COELHO wrote: >> >> I'm wondering if percentages instead of weights would be a better >> >> idea. That'd mean you'd be forced to be more careful when adding another >> >> script (having to adjust the percentages of other scripts) but arguably >> >> that's a good thing? >> > >> > If you use only percent, then you have to check that the total is 100, >> > probably you have to use floats, to do something when the total is not 100, >> > checking would complicate the code and test people mental calculus >> > abilities. Not sure this is a good idea:-) >> >> I agree. I don't see a reason to enforce that the total of the >> weights must be 100. > > I'm slightly worried that using weights will be a bit confusing because > adding another script will obviously reduce the frequency of already > defined scripts. But it's probably not worth worrying. That sounds like a feature to me, not a bug. I wouldn't worry. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] pgbench stats per script & other stuff
Hello Andres, Maybe add --builtin list to show them? Yep, easy enough. [...] +Shorthand for -b simple-update@1. +Shorthand for -b select-only@1. I'm a bit inclined to remove these options. Hm... This is really backward compatibility, and people may find reference to these in blogs or elswhere, so I think that it would make sense to be upward compatible. I would certainly be against adding any other of these options, though. + Each script may be given a relative weight specified after a + @ so as to change its drawing probability. + The default weight is 1. I'm wondering if percentages instead of weights would be a better idea. That'd mean you'd be forced to be more careful when adding another script (having to adjust the percentages of other scripts) but arguably that's a good thing? If you use only percent, then you have to check that the total is 100, probably you have to use floats, to do something when the total is not 100, checking would complicate the code and test people mental calculus abilities. Not sure this is a good idea:-) In the use case you outline, when adding a script, maybe you know that it runs "as much as" this other script, so you can pick up the same weight without bothering. Also, when testing, there is an issue when you want to remove one script for a quick test, and that would mean changing all percentages on the command line... So I would advise not to put such a constraint. +static SQLScript sql_script[MAX_SCRIPTS]; +static struct { + char *name; /* very short name for -b ...*/ + char *desc; /* short description */ + char *script; /* actual pgbench script */ +} builtin_script[] Can't we put these in the same array? I do not understand. + printf("transaction type: %s\n", + num_scripts == 1? sql_script[0].name: "multiple scripts"); Seems like it'd be more useful to simply always list the scripts + weights here. The detailed list is shown later, with the summary performance figure for each scripts, so ISTM that it would be redundant? Maybe the transaction type could be moved downwards just before said list? -- Fabien. -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] pgbench stats per script & other stuff
On Wed, Sep 2, 2015 at 2:20 PM, Fabien COELHOwrote: >> I'm wondering if percentages instead of weights would be a better >> idea. That'd mean you'd be forced to be more careful when adding another >> script (having to adjust the percentages of other scripts) but arguably >> that's a good thing? > > If you use only percent, then you have to check that the total is 100, > probably you have to use floats, to do something when the total is not 100, > checking would complicate the code and test people mental calculus > abilities. Not sure this is a good idea:-) I agree. I don't see a reason to enforce that the total of the weights must be 100. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] pgbench stats per script & other stuff
On 2015-09-02 14:36:51 -0400, Robert Haas wrote: > On Wed, Sep 2, 2015 at 2:20 PM, Fabien COELHOwrote: > >> I'm wondering if percentages instead of weights would be a better > >> idea. That'd mean you'd be forced to be more careful when adding another > >> script (having to adjust the percentages of other scripts) but arguably > >> that's a good thing? > > > > If you use only percent, then you have to check that the total is 100, > > probably you have to use floats, to do something when the total is not 100, > > checking would complicate the code and test people mental calculus > > abilities. Not sure this is a good idea:-) > > I agree. I don't see a reason to enforce that the total of the > weights must be 100. I'm slightly worried that using weights will be a bit confusing because adding another script will obviously reduce the frequency of already defined scripts. But it's probably not worth worrying. Andres -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] pgbench stats per script other stuff
v7 is a rebase after another small bug fix in pgbench. v8 is a rebase after yet another small bug fix in pgbench. -- Fabien.diff --git a/doc/src/sgml/ref/pgbench.sgml b/doc/src/sgml/ref/pgbench.sgml index 2517a3a..99670d4 100644 --- a/doc/src/sgml/ref/pgbench.sgml +++ b/doc/src/sgml/ref/pgbench.sgml @@ -261,6 +261,23 @@ pgbench optional replaceableoptions/ /optional replaceabledbname/ benchmarking arguments: variablelist + varlistentry + termoption-b/ replaceablescriptname[@weight]//term + termoption--builtin/ replaceablescriptname[@weight]//term + listitem + para +Add the specified builtin script to the list of executed scripts. +An optional integer weight after literal@/ allows to adjust the +probability of drawing the test. +Available builtin scripts are: literaltpcb-like/, +literalsimple-update/ and literalselect-only/. +The provided repleacablescriptname/ needs only to be a prefix +of the builtin name, hence literalsimp/ would be enough to select +literalsimple-update/. + /para + /listitem + /varlistentry + varlistentry termoption-c/option replaceableclients//term @@ -307,14 +324,15 @@ pgbench optional replaceableoptions/ /optional replaceabledbname/ /varlistentry varlistentry - termoption-f/option replaceablefilename//term - termoption--file=/optionreplaceablefilename//term + termoption-f/ replaceablefilename[@weight]//term + termoption--file=/replaceablefilename[@weight]//term listitem para -Read transaction script from replaceablefilename/. +Add a transaction script read from replaceablefilename/ to +the list of executed scripts. +An optional integer weight after literal@/ allows to adjust the +probability of drawing the test. See below for details. -option-N/option, option-S/option, and option-f/option -are mutually exclusive. /para /listitem /varlistentry @@ -404,10 +422,7 @@ pgbench optional replaceableoptions/ /optional replaceabledbname/ termoption--skip-some-updates/option/term listitem para -Do not update structnamepgbench_tellers/ and -structnamepgbench_branches/. -This will avoid update contention on these tables, but -it makes the test case even less like TPC-B. +Shorthand for option-b simple-update@1/. /para /listitem /varlistentry @@ -499,9 +514,9 @@ pgbench optional replaceableoptions/ /optional replaceabledbname/ Report the specified scale factor in applicationpgbench/'s output. With the built-in tests, this is not necessary; the correct scale factor will be detected by counting the number of -rows in the structnamepgbench_branches/ table. However, when testing -custom benchmarks (option-f/ option), the scale factor -will be reported as 1 unless this option is used. +rows in the structnamepgbench_branches/ table. +However, when testing only custom benchmarks (option-f/ option), +the scale factor will be reported as 1 unless this option is used. /para /listitem /varlistentry @@ -511,7 +526,7 @@ pgbench optional replaceableoptions/ /optional replaceabledbname/ termoption--select-only/option/term listitem para -Perform select-only transactions instead of TPC-B-like test. +Shorthand for option-b select-only@1/. /para /listitem /varlistentry @@ -567,6 +582,16 @@ pgbench optional replaceableoptions/ /optional replaceabledbname/ /varlistentry varlistentry + termoption--per-script-stats/option/term + listitem + para +Report some statistics per script run by pgbench. + /para + /listitem + /varlistentry + + + varlistentry termoption--sampling-rate=replaceablerate//option/term listitem para @@ -661,7 +686,20 @@ pgbench optional replaceableoptions/ /optional replaceabledbname/ titleWhat is the quoteTransaction/ Actually Performed in pgbench?/title para - The default transaction script issues seven commands per transaction: + Pgbench executes test scripts chosen randomly from a specified list. + They include built-in scripts with option-b/ and + user-provided custom scripts with option-f/. + Each script may be given a relative weight specified after a + literal@/ so as to change its drawing probability. + The default weight is literal1/. + /para + + para + The default builtin transaction script (also invoked with option-b tpcb-like/) + issues seven commands per transaction over randomly chosen literalaid/, + literaltid/, literalbid/ and literalbalance/. + The scenario is inspired by the TPC-B benchmark, but is not actually TPC-B, + hence the name. /para orderedlist @@ -675,9
Re: [HACKERS] pgbench stats per script other stuff
v6 is just a rebase after a bug fix by Andres Freund. Also a small question: The patch currently displays pgbench scripts starting numbering at 0. Probably a little too geek... should start at 1? v7 is a rebase after another small bug fix in pgbench. -- Fabien.diff --git a/doc/src/sgml/ref/pgbench.sgml b/doc/src/sgml/ref/pgbench.sgml index 2517a3a..99670d4 100644 --- a/doc/src/sgml/ref/pgbench.sgml +++ b/doc/src/sgml/ref/pgbench.sgml @@ -261,6 +261,23 @@ pgbench optional replaceableoptions/ /optional replaceabledbname/ benchmarking arguments: variablelist + varlistentry + termoption-b/ replaceablescriptname[@weight]//term + termoption--builtin/ replaceablescriptname[@weight]//term + listitem + para +Add the specified builtin script to the list of executed scripts. +An optional integer weight after literal@/ allows to adjust the +probability of drawing the test. +Available builtin scripts are: literaltpcb-like/, +literalsimple-update/ and literalselect-only/. +The provided repleacablescriptname/ needs only to be a prefix +of the builtin name, hence literalsimp/ would be enough to select +literalsimple-update/. + /para + /listitem + /varlistentry + varlistentry termoption-c/option replaceableclients//term @@ -307,14 +324,15 @@ pgbench optional replaceableoptions/ /optional replaceabledbname/ /varlistentry varlistentry - termoption-f/option replaceablefilename//term - termoption--file=/optionreplaceablefilename//term + termoption-f/ replaceablefilename[@weight]//term + termoption--file=/replaceablefilename[@weight]//term listitem para -Read transaction script from replaceablefilename/. +Add a transaction script read from replaceablefilename/ to +the list of executed scripts. +An optional integer weight after literal@/ allows to adjust the +probability of drawing the test. See below for details. -option-N/option, option-S/option, and option-f/option -are mutually exclusive. /para /listitem /varlistentry @@ -404,10 +422,7 @@ pgbench optional replaceableoptions/ /optional replaceabledbname/ termoption--skip-some-updates/option/term listitem para -Do not update structnamepgbench_tellers/ and -structnamepgbench_branches/. -This will avoid update contention on these tables, but -it makes the test case even less like TPC-B. +Shorthand for option-b simple-update@1/. /para /listitem /varlistentry @@ -499,9 +514,9 @@ pgbench optional replaceableoptions/ /optional replaceabledbname/ Report the specified scale factor in applicationpgbench/'s output. With the built-in tests, this is not necessary; the correct scale factor will be detected by counting the number of -rows in the structnamepgbench_branches/ table. However, when testing -custom benchmarks (option-f/ option), the scale factor -will be reported as 1 unless this option is used. +rows in the structnamepgbench_branches/ table. +However, when testing only custom benchmarks (option-f/ option), +the scale factor will be reported as 1 unless this option is used. /para /listitem /varlistentry @@ -511,7 +526,7 @@ pgbench optional replaceableoptions/ /optional replaceabledbname/ termoption--select-only/option/term listitem para -Perform select-only transactions instead of TPC-B-like test. +Shorthand for option-b select-only@1/. /para /listitem /varlistentry @@ -567,6 +582,16 @@ pgbench optional replaceableoptions/ /optional replaceabledbname/ /varlistentry varlistentry + termoption--per-script-stats/option/term + listitem + para +Report some statistics per script run by pgbench. + /para + /listitem + /varlistentry + + + varlistentry termoption--sampling-rate=replaceablerate//option/term listitem para @@ -661,7 +686,20 @@ pgbench optional replaceableoptions/ /optional replaceabledbname/ titleWhat is the quoteTransaction/ Actually Performed in pgbench?/title para - The default transaction script issues seven commands per transaction: + Pgbench executes test scripts chosen randomly from a specified list. + They include built-in scripts with option-b/ and + user-provided custom scripts with option-f/. + Each script may be given a relative weight specified after a + literal@/ so as to change its drawing probability. + The default weight is literal1/. + /para + + para + The default builtin transaction script (also invoked with option-b tpcb-like/) + issues seven commands per transaction over randomly chosen literalaid/, + literaltid/, literalbid/ and
Re: [HACKERS] pgbench stats per script other stuff
v6 is just a rebase after a bug fix by Andres Freund. Also a small question: The patch currently displays pgbench scripts starting numbering at 0. Probably a little too geek... should start at 1? -- Fabien.diff --git a/doc/src/sgml/ref/pgbench.sgml b/doc/src/sgml/ref/pgbench.sgml index 2517a3a..99670d4 100644 --- a/doc/src/sgml/ref/pgbench.sgml +++ b/doc/src/sgml/ref/pgbench.sgml @@ -261,6 +261,23 @@ pgbench optional replaceableoptions/ /optional replaceabledbname/ benchmarking arguments: variablelist + varlistentry + termoption-b/ replaceablescriptname[@weight]//term + termoption--builtin/ replaceablescriptname[@weight]//term + listitem + para +Add the specified builtin script to the list of executed scripts. +An optional integer weight after literal@/ allows to adjust the +probability of drawing the test. +Available builtin scripts are: literaltpcb-like/, +literalsimple-update/ and literalselect-only/. +The provided repleacablescriptname/ needs only to be a prefix +of the builtin name, hence literalsimp/ would be enough to select +literalsimple-update/. + /para + /listitem + /varlistentry + varlistentry termoption-c/option replaceableclients//term @@ -307,14 +324,15 @@ pgbench optional replaceableoptions/ /optional replaceabledbname/ /varlistentry varlistentry - termoption-f/option replaceablefilename//term - termoption--file=/optionreplaceablefilename//term + termoption-f/ replaceablefilename[@weight]//term + termoption--file=/replaceablefilename[@weight]//term listitem para -Read transaction script from replaceablefilename/. +Add a transaction script read from replaceablefilename/ to +the list of executed scripts. +An optional integer weight after literal@/ allows to adjust the +probability of drawing the test. See below for details. -option-N/option, option-S/option, and option-f/option -are mutually exclusive. /para /listitem /varlistentry @@ -404,10 +422,7 @@ pgbench optional replaceableoptions/ /optional replaceabledbname/ termoption--skip-some-updates/option/term listitem para -Do not update structnamepgbench_tellers/ and -structnamepgbench_branches/. -This will avoid update contention on these tables, but -it makes the test case even less like TPC-B. +Shorthand for option-b simple-update@1/. /para /listitem /varlistentry @@ -499,9 +514,9 @@ pgbench optional replaceableoptions/ /optional replaceabledbname/ Report the specified scale factor in applicationpgbench/'s output. With the built-in tests, this is not necessary; the correct scale factor will be detected by counting the number of -rows in the structnamepgbench_branches/ table. However, when testing -custom benchmarks (option-f/ option), the scale factor -will be reported as 1 unless this option is used. +rows in the structnamepgbench_branches/ table. +However, when testing only custom benchmarks (option-f/ option), +the scale factor will be reported as 1 unless this option is used. /para /listitem /varlistentry @@ -511,7 +526,7 @@ pgbench optional replaceableoptions/ /optional replaceabledbname/ termoption--select-only/option/term listitem para -Perform select-only transactions instead of TPC-B-like test. +Shorthand for option-b select-only@1/. /para /listitem /varlistentry @@ -567,6 +582,16 @@ pgbench optional replaceableoptions/ /optional replaceabledbname/ /varlistentry varlistentry + termoption--per-script-stats/option/term + listitem + para +Report some statistics per script run by pgbench. + /para + /listitem + /varlistentry + + + varlistentry termoption--sampling-rate=replaceablerate//option/term listitem para @@ -661,7 +686,20 @@ pgbench optional replaceableoptions/ /optional replaceabledbname/ titleWhat is the quoteTransaction/ Actually Performed in pgbench?/title para - The default transaction script issues seven commands per transaction: + Pgbench executes test scripts chosen randomly from a specified list. + They include built-in scripts with option-b/ and + user-provided custom scripts with option-f/. + Each script may be given a relative weight specified after a + literal@/ so as to change its drawing probability. + The default weight is literal1/. + /para + + para + The default builtin transaction script (also invoked with option-b tpcb-like/) + issues seven commands per transaction over randomly chosen literalaid/, + literaltid/, literalbid/ and literalbalance/. + The scenario is inspired by the TPC-B benchmark,
Re: [HACKERS] pgbench stats per script other stuff
I liked @ because it makes sense to read it as the word at. Yep, why not. Prepending classic to the names does not look necessary. I would suggest tpcb-like, simple-update select-only, or even maybe any prefix. If the bench scripts could be read from some pg directory instead of being actually inlined, even more code could be dropped from pgbench. I think including classic would be a very good idea. Hmm. This is the command line, you have to type them! With a prefix-based approach this suggests that the builtin names must start differently so as to be easily selected. We might want to add a TPC-C like workload in the future, or any number of other things. Naming things in a good way from the outset can only make that easier. Here is a v4 which: - removes -w stuff - enhance -f with @weight - adds -b/--builtin name@weight, based on prefix builtin names are: tpcb-like, simple-update select-only, which matches their more or less historical names (although I wasn't sure of tpcb-sort-of, so I put tpcb-like) - removes -B (now can be accessed with -b tpcb-like) Pgbench builtin scripts are still inlined in the code, not in a separate directory, which might be an option to simplify the code and allow easy extensions. I still think that the --per-script-stats option is useless and per script stats should always be on as soon as several scripts are running. Even more, I think that stats (maybe no per-command stat though) should always be collected. The point of pgbench is to collect data, and the basic end-of-run tps summary is very terse and does not reflect much of what happened during the run. Also, maybe per-command detailed stats should use the same common struct to hold data as all other stats. I did not change it because it is maintained in a different part of the code. -- Fabien.diff --git a/doc/src/sgml/ref/pgbench.sgml b/doc/src/sgml/ref/pgbench.sgml index 2517a3a..dc36f5d 100644 --- a/doc/src/sgml/ref/pgbench.sgml +++ b/doc/src/sgml/ref/pgbench.sgml @@ -260,7 +260,22 @@ pgbench optional replaceableoptions/ /optional replaceabledbname/ applicationpgbench/application accepts the following command-line benchmarking arguments: -variablelist + termoption-b/ replaceablescriptname[@weight]//term + termoption--builtin/ replaceablescriptname[@weight]//term + listitem + para +Add the specified builtin script to the list of executed scripts. +An optional integer weight after literal@/ allows to adjust the +probability of drawing the test. +Available builtin scripts are: literaltpcb-like/, +literalsimple-update/ and literalselect-only/. +The provided repleacablescriptname/ needs only to be a prefix +of the builtin name, hence literalsimp/ would be enough to select +literalsimple-update/. + /para + /listitem + /varlistentry + varlistentry termoption-c/option replaceableclients//term @@ -307,14 +322,15 @@ pgbench optional replaceableoptions/ /optional replaceabledbname/ /varlistentry varlistentry - termoption-f/option replaceablefilename//term - termoption--file=/optionreplaceablefilename//term + termoption-f/ replaceablefilename[@weight]//term + termoption--file=/replaceablefilename[@weight]//term listitem para -Read transaction script from replaceablefilename/. +Add a transaction script read from replaceablefilename/ to +the list of executed scripts. +An optional integer weight after literal@/ allows to adjust the +probability of drawing the test. See below for details. -option-N/option, option-S/option, and option-f/option -are mutually exclusive. /para /listitem /varlistentry @@ -404,10 +420,7 @@ pgbench optional replaceableoptions/ /optional replaceabledbname/ termoption--skip-some-updates/option/term listitem para -Do not update structnamepgbench_tellers/ and -structnamepgbench_branches/. -This will avoid update contention on these tables, but -it makes the test case even less like TPC-B. +Shorthand for option-b simple-update@1/. /para /listitem /varlistentry @@ -499,9 +512,9 @@ pgbench optional replaceableoptions/ /optional replaceabledbname/ Report the specified scale factor in applicationpgbench/'s output. With the built-in tests, this is not necessary; the correct scale factor will be detected by counting the number of -rows in the structnamepgbench_branches/ table. However, when testing -custom benchmarks (option-f/ option), the scale factor -will be reported as 1 unless this option is used. +rows in the structnamepgbench_branches/ table. +However, when testing only custom benchmarks (option-f/ option), +the scale factor will be reported
Re: [HACKERS] pgbench stats per script other stuff
Also, maybe per-command detailed stats should use the same common struct to hold data as all other stats. I did not change it because it is maintained in a different part of the code. I played just once with the --report-latencies option and was astonished that meta commands showed negative latencies... This v5 also fixes this bug (on meta commands there is a goto loop in doCustom, but as now was not reset the stmt_begin ended up being after now, hence accumulating increasing negative times) and in passing uses the same stats structure as the rest, which result in removing some more code. The report-latencies option is made to imply per script stats, which simplifies the final output code, and if you want per-command per-script stats, probably providing the per-script stats, i.e. the sum of the commands, make sense. -- Fabien.diff --git a/doc/src/sgml/ref/pgbench.sgml b/doc/src/sgml/ref/pgbench.sgml index 2517a3a..dc36f5d 100644 --- a/doc/src/sgml/ref/pgbench.sgml +++ b/doc/src/sgml/ref/pgbench.sgml @@ -260,7 +260,22 @@ pgbench optional replaceableoptions/ /optional replaceabledbname/ applicationpgbench/application accepts the following command-line benchmarking arguments: -variablelist + termoption-b/ replaceablescriptname[@weight]//term + termoption--builtin/ replaceablescriptname[@weight]//term + listitem + para +Add the specified builtin script to the list of executed scripts. +An optional integer weight after literal@/ allows to adjust the +probability of drawing the test. +Available builtin scripts are: literaltpcb-like/, +literalsimple-update/ and literalselect-only/. +The provided repleacablescriptname/ needs only to be a prefix +of the builtin name, hence literalsimp/ would be enough to select +literalsimple-update/. + /para + /listitem + /varlistentry + varlistentry termoption-c/option replaceableclients//term @@ -307,14 +322,15 @@ pgbench optional replaceableoptions/ /optional replaceabledbname/ /varlistentry varlistentry - termoption-f/option replaceablefilename//term - termoption--file=/optionreplaceablefilename//term + termoption-f/ replaceablefilename[@weight]//term + termoption--file=/replaceablefilename[@weight]//term listitem para -Read transaction script from replaceablefilename/. +Add a transaction script read from replaceablefilename/ to +the list of executed scripts. +An optional integer weight after literal@/ allows to adjust the +probability of drawing the test. See below for details. -option-N/option, option-S/option, and option-f/option -are mutually exclusive. /para /listitem /varlistentry @@ -404,10 +420,7 @@ pgbench optional replaceableoptions/ /optional replaceabledbname/ termoption--skip-some-updates/option/term listitem para -Do not update structnamepgbench_tellers/ and -structnamepgbench_branches/. -This will avoid update contention on these tables, but -it makes the test case even less like TPC-B. +Shorthand for option-b simple-update@1/. /para /listitem /varlistentry @@ -499,9 +512,9 @@ pgbench optional replaceableoptions/ /optional replaceabledbname/ Report the specified scale factor in applicationpgbench/'s output. With the built-in tests, this is not necessary; the correct scale factor will be detected by counting the number of -rows in the structnamepgbench_branches/ table. However, when testing -custom benchmarks (option-f/ option), the scale factor -will be reported as 1 unless this option is used. +rows in the structnamepgbench_branches/ table. +However, when testing only custom benchmarks (option-f/ option), +the scale factor will be reported as 1 unless this option is used. /para /listitem /varlistentry @@ -511,7 +524,7 @@ pgbench optional replaceableoptions/ /optional replaceabledbname/ termoption--select-only/option/term listitem para -Perform select-only transactions instead of TPC-B-like test. +Shorthand for option-b select-only@1/. /para /listitem /varlistentry @@ -567,6 +580,16 @@ pgbench optional replaceableoptions/ /optional replaceabledbname/ /varlistentry varlistentry + termoption--per-script-stats/option/term + listitem + para +Report some statistics per script run by pgbench. + /para + /listitem + /varlistentry + + + varlistentry termoption--sampling-rate=replaceablerate//option/term listitem para @@ -661,7 +684,20 @@ pgbench optional replaceableoptions/ /optional replaceabledbname/ titleWhat is the quoteTransaction/ Actually Performed in pgbench?/title
Re: [HACKERS] pgbench stats per script other stuff
On Wed, Jul 22, 2015 at 1:54 PM, Josh Berkus j...@agliodbs.com wrote: If so, I would vote for: -f script1.bench:3 -f script2.bench:1 over: -f script1.bench -w 3 -f script2.bench -w 1 Making command-line options order-dependant breaks a lot of system call libraries in various languages, as well as being easy to mess up. Yes, I think that's a good idea. I don't know whether : is the right separator; I kind of line @. But that's bikeshedding. As Fabien mentions further downthread, it would be nice to set weights for the built-ins. I'd actually like to introduce a new pgbench option that selects a builtin script by name, so that we can have more than three of them without running out of option names (or going insane). So suppose we introduce pgbench -b BUILTIN_NAME, where BUILTIN_NAME is initially one of these: classic classic-simple-update classic-select-only Then you can do pgbench -b classic@1 -b classic-select-only@9 or similar to get 10% write, 90% read. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] pgbench stats per script other stuff
Yes, I think that's a good idea. I don't know whether : is the right separator; I kind of line @. But that's bikeshedding. Possible ASCII contenders should avoid shell and filename interaction, which exclude * ? ! / [ ] . - $ and so on: those that seem to remain are @ , = : % # +. I like % because this is about sharing, although this is not a percentage. I'd actually like to introduce a new pgbench option that selects a builtin script by name, so that we can have more than three of them without running out of option names (or going insane). So suppose we introduce pgbench -b BUILTIN_NAME, where BUILTIN_NAME is initially one of these: classic, classic-simple-update, classic-select-only Then you can do pgbench -b classic@1 -b classic-select-only@9 or similar to get 10% write, 90% read. I like this idea, as -b/-f would be symmetric. Prepending classic to the names does not look necessary. I would suggest tpcb-like, simple-update select-only, or even maybe any prefix. If the bench scripts could be read from some pg directory instead of being actually inlined, even more code could be dropped from pgbench. -- Fabien. -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] pgbench stats per script other stuff
Fabien COELHO wrote: [...] and that a subsequent -w modifies the meaning of the script-specifiying argument already read. That strikes me as a very unintuitive interface. Ok, I understand this afterward modification objection. What if the -w would be required *before*, and supply a weight for (the first/maybe all) script(s) specified *afterwards*, so it does not modify something already provided? I think it would be more intuitive, or at least less surprising. Here is a v3 which does that. If there is a better idea, do not hesitate! This seems a moderately reasonable interface to me. There are other programs that behave in that way, and once you get used to the idea, it makes sense. I think for complete consistency we would have to require that -w is specified for all scripts or none of them. I am not sure if this means that it's okay to have later scripts use a weight specified for a previous one (i.e. it's only an error to fail to specify a weight for options before the first -w), or each -f must have always its own -w explicitely. In other words, pg_bench -w2 -f script1.sql -f script2.sql either script2 has weight 2, or it's an error, depending on what we decide; but pg_bench -f script1.sql -w 2 -fscript2.sql is always an error. -- Álvaro Herrerahttp://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] pgbench stats per script other stuff
On 07/21/2015 10:25 PM, Fabien COELHO wrote: Hello Josh, Maybe -f file.sql:weight (yuk from my point of view, but it can be done easily). Maybe it's past time for pgbench to have a config file? That is an idea. For simple usage, for backward compatibility and for people like me who like them, ISTM that options are fine too:-) Also this may mean adding a dependency to some YAML library, configure issues (I'm not sure whether pg currently uses YAML, and JSON is quite verbose), maybe conditionals around the feature to compile without the dependency, more documentation... I'm not sure all that is desirable just for weighting scripts. Maybe not. If so, I would vote for: -f script1.bench:3 -f script2.bench:1 over: -f script1.bench -w 3 -f script2.bench -w 1 Making command-line options order-dependant breaks a lot of system call libraries in various languages, as well as being easy to mess up. Given that we want to define some per-workload options, the config file would probably need to be YAML or JSON, e.g.: [...] script1: file: script1.bench weight: 3 script2: file: script2.bench weight: 1 the above would execute a pgbench with 16 clients, 4 threads, script1 three times as often as script2, and report stats at the script (rather than SQL statement) level. Yep. Probably numbering within field names should be avoided, so a list of records that could look like: Oh, you misunderstand. script1 and script2 are meant to be user-supplied names which then get reported in things like response time output. They're labels. Better example: deposit: file: deposit.bench weight: 3 withdrawal: file: withdrawal.bench weight: 3 reporting: file: summary_report.bench weigh: 1 -- Josh Berkus PostgreSQL Experts Inc. http://pgexperts.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] pgbench stats per script other stuff
If so, I would vote for: -f script1.bench:3 -f script2.bench:1 over: -f script1.bench -w 3 -f script2.bench -w 1 Ok, I'll take that into consideration. Any other opinion out there? The current v3 version is: -w 3 -f script1.bench -w 1 -f script2.bench With provision to generate errors if a -w is set but not used, in two case. - in the middle ... -w 4 no script option... -w 1 ... - in the end ... -w 1 no script option... I can provide -f x:weight easilly, but this mean that there will be no way to associate weight for internal scripts. Not orthogonal, not very elegant, but no big deal. Oh, you misunderstand. script1 and script2 are meant to be user-supplied names which then get reported in things like response time output. They're labels. Ok, that is much better. This means that labels should not choose names which may interact with other commands, so maybe a list would have been nice as well. Anyway, I do not think it is the way to go just for this feature. -- Fabien. -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] pgbench stats per script other stuff
On 2015-07-22 10:54:14 -0700, Josh Berkus wrote: Making command-line options order-dependant breaks a lot of system call libraries in various languages, as well as being easy to mess up. What? -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] pgbench stats per script other stuff
On Tue, Jul 21, 2015 at 10:42 AM, Fabien COELHO coe...@cri.ensmp.fr wrote: sh ./pgbench -T 3 -B -N -w 2 -S -w 7 --per-script-stats That is a truly horrifying abuse of command-line arguments. -1 from me, or minus more than one if I've got that many chits to burn. Are you against the -w, or against saying that pgbench execute scripts, whether internal or from files? I'm against the idea that we accept multiple arguments for scripts, and that a subsequent -w modifies the meaning of the script-specifiying argument already read. That strikes me as a very unintuitive interface. I'm not sure exactly what would be better at the moment, but I think we need something better. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] pgbench stats per script other stuff
5~5~5~ That is a truly horrifying abuse of command-line arguments. -1 from me, or minus more than one if I've got that many chits to burn. Are you against the -w, or against saying that pgbench execute scripts, whether internal or from files? I'm against the idea that we accept multiple arguments for scripts, Pgbench *currently* already accept multiple -f ... options, and this is a good thing to test realistic loads which may intermix several kind of transactions, say a lot of readonly and some update or insert, and very rare deletes... Now if you do not need it you do not use it, and all is fine. Once you have several scripts, being able to weight them becomes useful for realism. and that a subsequent -w modifies the meaning of the script-specifiying argument already read. That strikes me as a very unintuitive interface. Ok, I understand this afterward modification objection. What if the -w would be required *before*, and supply a weight for (the first/maybe all) script(s) specified *afterwards*, so it does not modify something already provided? I think it would be more intuitive, or at least less surprising. I'm not sure exactly what would be better at the moment, but I think we need something better. Maybe -f file.sql:weight (yuk from my point of view, but it can be done easily). -- Fabien. -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] pgbench stats per script other stuff
Hello Robert, Pgbench *currently* already accept multiple -f ... options, and this is a good thing to test realistic loads which may intermix several kind of transactions, say a lot of readonly and some update or insert, and very rare deletes... Hmm, I didn't realize that. The code looks a bit inconsistent right now - e.g. we do support multiple files, but pgbench's options-parsing loop sets ttype to a value that depends only on the last of -f, -N, and -S encountered. Indeed. However as with current pgbench nothing/-N/-S and -f are mutually exclusive it is ok to have ttype set as it is. With the patch pgbench just executes scripts and the options are not mutually exclusive: some scripts are internal and others are not, but they are treated the same beyond initialization, which helps removing some code including the ttype variable you mention. The name of the script is kept in an SQLScript struct along with its commands, weight and stats. -- Fabien. -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] pgbench stats per script other stuff
[...] and that a subsequent -w modifies the meaning of the script-specifiying argument already read. That strikes me as a very unintuitive interface. Ok, I understand this afterward modification objection. What if the -w would be required *before*, and supply a weight for (the first/maybe all) script(s) specified *afterwards*, so it does not modify something already provided? I think it would be more intuitive, or at least less surprising. Here is a v3 which does that. If there is a better idea, do not hesitate! sh ./pgbench -w 9 -f one.sql -f now.sql -T 2 -P 1 --per-script-stats starting vacuum...end. progress: 1.0 s, 24536.0 tps, lat 0.039 ms stddev 0.024 progress: 2.0 s, 25963.8 tps, lat 0.038 ms stddev 0.015 transaction type: multiple scripts scaling factor: 1 query mode: simple number of clients: 1 number of threads: 1 duration: 2 s number of transactions actually processed: 50501 latency average = 0.039 ms latency stddev = 0.020 ms tps = 25249.464772 (including connections establishing) tps = 25339.454154 (excluding connections establishing) SQL script 0, weight 9: one.sql - 45366 transactions (89.8% of total, tps = 22682.070035) - latency average = 0.038 ms - latency stddev = 0.016 ms SQL script 1, weight 1: now.sql - 5135 transactions (10.2% of total, tps = 2567.394737) - latency average = 0.044 ms - latency stddev = 0.041 ms -- Fabien.diff --git a/doc/src/sgml/ref/pgbench.sgml b/doc/src/sgml/ref/pgbench.sgml index 2517a3a..6bac511 100644 --- a/doc/src/sgml/ref/pgbench.sgml +++ b/doc/src/sgml/ref/pgbench.sgml @@ -261,6 +261,18 @@ pgbench optional replaceableoptions/ /optional replaceabledbname/ benchmarking arguments: variablelist + varlistentry + termoption-B/option/term + termoption--tpc-b/option/term + listitem + para +Built-in TPC-B like test which updates three tables and performs +an insert on a fourth. See below for details. +This is the default if no bench is explicitely specified. + /para + /listitem + /varlistentry + varlistentry termoption-c/option replaceableclients//term @@ -313,8 +325,7 @@ pgbench optional replaceableoptions/ /optional replaceabledbname/ para Read transaction script from replaceablefilename/. See below for details. -option-N/option, option-S/option, and option-f/option -are mutually exclusive. +Multiple option-f/ options are allowed. /para /listitem /varlistentry @@ -404,10 +415,10 @@ pgbench optional replaceableoptions/ /optional replaceabledbname/ termoption--skip-some-updates/option/term listitem para -Do not update structnamepgbench_tellers/ and -structnamepgbench_branches/. -This will avoid update contention on these tables, but -it makes the test case even less like TPC-B. +Built-in test which updates only one table compared to option-B/. +This avoids update contention on structnamepgbench_tellers/ +and structnamepgbench_branches/, but it makes the test case +even less like TPC-B. /para /listitem /varlistentry @@ -499,9 +510,9 @@ pgbench optional replaceableoptions/ /optional replaceabledbname/ Report the specified scale factor in applicationpgbench/'s output. With the built-in tests, this is not necessary; the correct scale factor will be detected by counting the number of -rows in the structnamepgbench_branches/ table. However, when testing -custom benchmarks (option-f/ option), the scale factor -will be reported as 1 unless this option is used. +rows in the structnamepgbench_branches/ table. +However, when testing only custom benchmarks (option-f/ option), +the scale factor will be reported as 1 unless this option is used. /para /listitem /varlistentry @@ -511,7 +522,7 @@ pgbench optional replaceableoptions/ /optional replaceabledbname/ termoption--select-only/option/term listitem para -Perform select-only transactions instead of TPC-B-like test. +Built-in test with select-only transactions. /para /listitem /varlistentry @@ -552,6 +563,20 @@ pgbench optional replaceableoptions/ /optional replaceabledbname/ /varlistentry varlistentry + termoption-w/option/term + termoption--weight/option/term + listitem + para +Set the integer weight for the next test script +(option-B/, option-N/, option-S/ or option-f/). +When several test scripts are used, the relative probability of +drawing each test is proportional to these weights. +The default weight is literal1/. + /para + /listitem + /varlistentry + + varlistentry termoption--aggregate-interval=replaceableseconds//option/term listitem para @@
Re: [HACKERS] pgbench stats per script other stuff
On Tue, Jul 21, 2015 at 12:29 PM, Fabien COELHO coe...@cri.ensmp.fr wrote: Pgbench *currently* already accept multiple -f ... options, and this is a good thing to test realistic loads which may intermix several kind of transactions, say a lot of readonly and some update or insert, and very rare deletes... Hmm, I didn't realize that. The code looks a bit inconsistent right now - e.g. we do support multiple files, but pgbench's options-parsing loop sets ttype to a value that depends only on the last of -f, -N, and -S encountered. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] pgbench stats per script other stuff
On 07/21/2015 09:29 AM, Fabien COELHO wrote: Maybe -f file.sql:weight (yuk from my point of view, but it can be done easily). Maybe it's past time for pgbench to have a config file? Given that we want to define some per-workload options, the config file would probably need to be YAML or JSON, e.g.: pgbench --config=workload1.pgb workload1.pgb - database: bench port: 5432 host: localhost user: josh clients : 16 threads : 4 response-times : on stats-level: script script1: file: script1.bench weight: 3 script2: file: script2.bench weight: 1 the above would execute a pgbench with 16 clients, 4 threads, script1 three times as often as script2, and report stats at the script (rather than SQL statement) level. -- Josh Berkus PostgreSQL Experts Inc. http://pgexperts.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] pgbench stats per script other stuff
On Fri, Jul 17, 2015 at 9:50 AM, Fabien coe...@cri.ensmp.fr wrote: sh ./pgbench -T 3 -B -N -w 2 -S -w 7 --per-script-stats That is a truly horrifying abuse of command-line arguments. -1 from me, or minus more than one if I've got that many chits to burn. I have been thinking that the way to do this is to push more into the script file itself, e.g. allow: \if random() 0.1 stuff \else other stuff \endif Maybe that's overkill and there's some way of specifying multiple scripts on the command line, but IMO what you've got here is not it. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] pgbench stats per script other stuff
Hello Robert, On Fri, Jul 17, 2015 at 9:50 AM, Fabien coe...@cri.ensmp.fr wrote: sh ./pgbench -T 3 -B -N -w 2 -S -w 7 --per-script-stats That is a truly horrifying abuse of command-line arguments. -1 from me, or minus more than one if I've got that many chits to burn. Are you against the -w, or against saying that pgbench execute scripts, whether internal or from files? The former is obviously a matter of taste and I can remove -w if nobody wants it, too bad because the feature seems useful to me from a testing point of view, this is a choice between aesthetic and feature. Note that you do not have to use it if you do not like it. The later really homogeneise the code internally and allows to factor out things, to have orthogonal features (internal scripts are treated the same way as external files, this requires less lines of code because it is simpler), and does not harm anyone IMO, so it would be sad to let it go. I have been thinking that the way to do this is to push more into the script file itself, e.g. allow: \if random() 0.1 stuff \else other stuff \endif Maybe that's overkill and there's some way of specifying multiple scripts on the command line, but IMO what you've got here is not it. I think that is overkill, and moreover it is not useful: the point is to collect statistics *per scripts*, with an random if you would not know which part was executed, so you would loose the whole point of having per script stats. If you have another suggestion about how to provide weights, which does not rely on ifs nor on options? Maybe a special comment in the script (yuk from my point of view because the script would carry its weight whereas I think this should be orthogonal to the script contents, but it would be better than nothing..). -- Fabien. -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] pgbench stats per script other stuff
Oops, as usual I forgot something... This v2 removes old stats code that was put in comment and simplify the logic when counting lag times, as they are now taken into account at the end of the transaction instead of at the beginning. This patch adds per-script statistics other improvements to pgbench Rationale: Josh asked for the per-script stats:-) Some restructuring is done so that all stats (-l --aggregate-interval --progress --per-script-stats, latency lag...) share the same structures and functions to accumulate data. This limits a lot the growth of pgbench from this patch (+17 lines). In passing, remove the distinction between internal and external scripts. Pgbench just execute scripts, some of them may be internal... As a side effect, all scripts can be accumulated pgbench -B -N -S -f ... would execute 4 scripts, 3 of which internal (tpc-b, simple-update, select-only and another externally supplied one). Also add a weight option to change the probability of choosing some scripts when several are available. Hmmm... Not sure that the --per-script-stats option is really useful. The stats could always be shown when several scripts are executed? sh ./pgbench -T 3 -B -N -w 2 -S -w 7 --per-script-stats starting vacuum...end. transaction type: multiple scripts scaling factor: 1 query mode: simple number of clients: 1 number of threads: 1 duration: 3 s number of transactions actually processed: 3192 latency average: 0.940 ms tps = 1063.756045 (including connections establishing) tps = 1065.412737 (excluding connections establishing) SQL script 0: builtin: TPC-B (sort of) - weight is 1 - 297 transactions (tps = 98.977301) - latency average = 3.001 ms - latency stddev = 1.320 ms SQL script 1: builtin: simple update - weight is 2 - 621 transactions (tps = 206.952539) - latency average = 2.506 ms - latency stddev = 1.194 ms SQL script 2: builtin: select only - weight is 7 - 2274 transactions (tps = 757.826205) - latency average = 0.236 ms - latency stddev = 0.083 ms -- Fabien.diff --git a/doc/src/sgml/ref/pgbench.sgml b/doc/src/sgml/ref/pgbench.sgml index 2517a3a..eb571a8 100644 --- a/doc/src/sgml/ref/pgbench.sgml +++ b/doc/src/sgml/ref/pgbench.sgml @@ -261,6 +261,18 @@ pgbench optional replaceableoptions/ /optional replaceabledbname/ benchmarking arguments: variablelist + varlistentry + termoption-B/option/term + termoption--tpc-b/option/term + listitem + para +Built-in TPC-B like test which updates three tables and performs +an insert on a fourth. See below for details. +This is the default if no bench is explicitely specified. + /para + /listitem + /varlistentry + varlistentry termoption-c/option replaceableclients//term @@ -313,8 +325,7 @@ pgbench optional replaceableoptions/ /optional replaceabledbname/ para Read transaction script from replaceablefilename/. See below for details. -option-N/option, option-S/option, and option-f/option -are mutually exclusive. +Multiple option-f/ options are allowed. /para /listitem /varlistentry @@ -404,10 +415,10 @@ pgbench optional replaceableoptions/ /optional replaceabledbname/ termoption--skip-some-updates/option/term listitem para -Do not update structnamepgbench_tellers/ and -structnamepgbench_branches/. -This will avoid update contention on these tables, but -it makes the test case even less like TPC-B. +Built-in test which updates only one table compared to option-B/. +This avoids update contention on structnamepgbench_tellers/ +and structnamepgbench_branches/, but it makes the test case +even less like TPC-B. /para /listitem /varlistentry @@ -499,9 +510,9 @@ pgbench optional replaceableoptions/ /optional replaceabledbname/ Report the specified scale factor in applicationpgbench/'s output. With the built-in tests, this is not necessary; the correct scale factor will be detected by counting the number of -rows in the structnamepgbench_branches/ table. However, when testing -custom benchmarks (option-f/ option), the scale factor -will be reported as 1 unless this option is used. +rows in the structnamepgbench_branches/ table. +However, when testing only custom benchmarks (option-f/ option), +the scale factor will be reported as 1 unless this option is used. /para /listitem /varlistentry @@ -511,7 +522,7 @@ pgbench optional replaceableoptions/ /optional replaceabledbname/ termoption--select-only/option/term listitem para -Perform select-only transactions instead of TPC-B-like test. +Built-in test with select-only transactions. /para /listitem /varlistentry @@ -552,6 +563,20 @@