subject:"Re\: \[HACKERS\] pgbench stats per script \& other stuff"

Re: [HACKERS] pgbench stats per script & other stuff

2016-03-29 Thread Fabien COELHO




Indeed. The documentation is manually edited when submitting changes so as
to minimize diffs, but then it does not correspond anymore to any actual
output, so it is easy to do it wrong.


Well, you fixed the "latency stddev" line to the sample output too, but
in my trial run that line was not displayed, only the latency average.
What are the command line args that supposedly produced this output?
Maybe we should add it as a SGML comment, or even display it to the
doc's reader.


Good point.

The test above shows the stats if there was -P , -L & --rate, because 
under these conditions the necessary data was collected, so they can be 
computed. Thus the output in the documentation assumes that one of these 
was used. I nearly always use "-P 1".


Note that the documentation is not really precise, "will look similar to", 
so there is no commitment.


If you feel like removing the stddev line from the doc because it is not 
there with usual options, fine with me.


--
Fabien.


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] pgbench stats per script & other stuff

2016-03-29 Thread Alvaro Herrera

Fabien COELHO wrote:
> 
> Hello,
> 
> >>In doing this, I noticed that the latency output is wrong if you use -T
> >>instead of -t; it always says the latency is zero because "duration" is
> >>zero.  I suppose it should be like in the attached instead.
> 
> Indeed, I clearly overlooked option -t (transactions) which I never use.

Makes sense.

> >Patch actually attached here.
> 
> Tested. There is a small issue because the \n is missing.
> 
> Here is another version which just replaces duration by time_include,
> as they should be pretty close, and fixes the style so that it is the same
> whether the detailed stats are collected or not, as you pointed out.

Thanks, that makes sense.

> >>At the same time, it says "latency average: XYZ" instead of "latency
> >>average = XYZ" as in printSimpleStats, which doesn't look terribly
> >>important.  But the line appears in the SGML docs.
> 
> Indeed. The documentation is manually edited when submitting changes so as
> to minimize diffs, but then it does not correspond anymore to any actual
> output, so it is easy to do it wrong.

Well, you fixed the "latency stddev" line to the sample output too, but
in my trial run that line was not displayed, only the latency average.
What are the command line args that supposedly produced this output?
Maybe we should add it as a SGML comment, or even display it to the
doc's reader.

-- 
Álvaro Herrerahttp://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] pgbench stats per script & other stuff

2016-03-29 Thread Fabien COELHO



Hello,


In doing this, I noticed that the latency output is wrong if you use -T
instead of -t; it always says the latency is zero because "duration" is
zero.  I suppose it should be like in the attached instead.


Indeed, I clearly overlooked option -t (transactions) which I never use.


Patch actually attached here.


Tested. There is a small issue because the \n is missing.

Here is another version which just replaces duration by time_include,
as they should be pretty close, and fixes the style so that it is the same 
whether the detailed stats are collected or not, as you pointed out.


At the same time, it says "latency average: XYZ" instead of "latency 
average = XYZ" as in printSimpleStats, which doesn't look terribly 
important.  But the line appears in the SGML docs.


Indeed. The documentation is manually edited when submitting changes so as 
to minimize diffs, but then it does not correspond anymore to any actual 
output, so it is easy to do it wrong.


--
Fabien.diff --git a/doc/src/sgml/ref/pgbench.sgml b/doc/src/sgml/ref/pgbench.sgml
index 06cd5db..297af4f 100644
--- a/doc/src/sgml/ref/pgbench.sgml
+++ b/doc/src/sgml/ref/pgbench.sgml
@@ -1253,8 +1253,8 @@ number of clients: 10
 number of threads: 1
 number of transactions per client: 1000
 number of transactions actually processed: 1/1
-latency average = 15.844 ms
-latency stddev = 2.715 ms
+latency average: 15.844 ms
+latency stddev: 2.715 ms
 tps = 618.764555 (including connections establishing)
 tps = 622.977698 (excluding connections establishing)
 script statistics:
diff --git a/src/bin/pgbench/pgbench.c b/src/bin/pgbench/pgbench.c
index 52d1223..d5380d2 100644
--- a/src/bin/pgbench/pgbench.c
+++ b/src/bin/pgbench/pgbench.c
@@ -3099,8 +3099,8 @@ printSimpleStats(char *prefix, SimpleStats *ss)
 	double		latency = ss->sum / ss->count;
 	double		stddev = sqrt(ss->sum2 / ss->count - latency * latency);
 
-	printf("%s average = %.3f ms\n", prefix, 0.001 * latency);
-	printf("%s stddev = %.3f ms\n", prefix, 0.001 * stddev);
+	printf("%s average: %.3f ms\n", prefix, 0.001 * latency);
+	printf("%s stddev: %.3f ms\n", prefix, 0.001 * stddev);
 }
 
 /* print out results */
@@ -3155,7 +3155,7 @@ printResults(TState *threads, StatsData *total, instr_time total_time,
 	else
 		/* only an average latency computed from the duration is available */
 		printf("latency average: %.3f ms\n",
-			   1000.0 * duration * nclients / total->cnt);
+			   1000.0 * time_include * nclients / total->cnt);
 
 	if (throttle_delay)
 	{

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] pgbench stats per script & other stuff

2016-03-29 Thread Alvaro Herrera

Alvaro Herrera wrote:

> In doing this, I noticed that the latency output is wrong if you use -T
> instead of -t; it always says the latency is zero because "duration" is
> zero.  I suppose it should be like in the attached instead.  At the same
> time, it says "latency average: XYZ" instead of "latency average = XYZ"
> as in printSimpleStats, which doesn't look terribly important.  But the
> line appears in the SGML docs.

Patch actually attached here.

-- 
Álvaro Herrerahttp://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
diff --git a/src/bin/pgbench/pgbench.c b/src/bin/pgbench/pgbench.c
index 52d1223..5cb5906 100644
--- a/src/bin/pgbench/pgbench.c
+++ b/src/bin/pgbench/pgbench.c
@@ -3152,10 +3152,13 @@ printResults(TState *threads, StatsData *total, instr_time total_time,
 
 	if (throttle_delay || progress || latency_limit)
 		printSimpleStats("latency", >latency);
-	else
+	else if (duration > 0)
 		/* only an average latency computed from the duration is available */
-		printf("latency average: %.3f ms\n",
+		printf("latency average: %.3f ms",
 			   1000.0 * duration * nclients / total->cnt);
+	else
+		printf("latency average: %.3f ms",
+			   1000.0 * time_include * nclients / total->cnt);
 
 	if (throttle_delay)
 	{

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] pgbench stats per script & other stuff

2016-03-29 Thread Alvaro Herrera

Fabien COELHO wrote:
> 
> >- that it does work:-) I'm not sure what happens by the script selection
> >  process, it should be checked carefully because it was not designed
> >  with allowing a zero weight, and it may depend on its/their positions.
> >  It may already work, but it really needs checking.
> 
> Hmmm, it seems ok.

It's not -- if you used -i, it died saying weight is zero.

> >- I would suggest that a warning is shown when a weight is zero,
> >  something like "warning, script #%d weight is zero, will be ignored".
> 
> includes such a warning.

I didn't include this part.

Pushed.

In doing this, I noticed that the latency output is wrong if you use -T
instead of -t; it always says the latency is zero because "duration" is
zero.  I suppose it should be like in the attached instead.  At the same
time, it says "latency average: XYZ" instead of "latency average = XYZ"
as in printSimpleStats, which doesn't look terribly important.  But the
line appears in the SGML docs.

-- 
Álvaro Herrerahttp://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] pgbench stats per script & other stuff

2016-03-28 Thread Fabien COELHO




- that it does work:-) I'm not sure what happens by the script selection
  process, it should be checked carefully because it was not designed
  with allowing a zero weight, and it may depend on its/their positions.
  It may already work, but it really needs checking.


Hmmm, it seems ok.

Attached is an updated patch, which:


- I would suggest that a warning is shown when a weight is zero,
  something like "warning, script #%d weight is zero, will be ignored".


includes such a warning.


- the documentation should be updated:-)


adds a line about 0 weight in the documentation.

--
Fabien.diff --git a/doc/src/sgml/ref/pgbench.sgml b/doc/src/sgml/ref/pgbench.sgml
index c6d1454..e31d5ee 100644
--- a/doc/src/sgml/ref/pgbench.sgml
+++ b/doc/src/sgml/ref/pgbench.sgml
@@ -698,6 +698,7 @@ pgbench  options  dbname
Each script may be given a relative weight specified after a
@ so as to change its drawing probability.
The default weight is 1.
+   Weight 0 scripts are ignored.
  
 
   
diff --git a/src/bin/pgbench/pgbench.c b/src/bin/pgbench/pgbench.c
index 4196b0e..0c1a0ee 100644
--- a/src/bin/pgbench/pgbench.c
+++ b/src/bin/pgbench/pgbench.c
@@ -2953,10 +2953,10 @@ parseScriptWeight(const char *option, char **script)
 			fprintf(stderr, "invalid weight specification: %s\n", sep);
 			exit(1);
 		}
-		if (wtmp > INT_MAX || wtmp <= 0)
+		if (wtmp > INT_MAX || wtmp < 0)
 		{
 			fprintf(stderr,
-			"weight specification out of range (1 .. %u): " INT64_FORMAT "\n",
+			"weight specification out of range (0 .. %u): " INT64_FORMAT "\n",
 	INT_MAX, (int64) wtmp);
 			exit(1);
 		}
@@ -2987,6 +2987,13 @@ addScript(ParsedScript script)
 		exit(1);
 	}
 
+	if (script.weight == 0)
+	{
+		fprintf(stderr,
+"warning, script \"%s\" (%d) weight is zero, will be ignored\n",
+script.desc, num_scripts);
+	}
+
 	sql_script[num_scripts] = script;
 	num_scripts++;
 }
@@ -3527,6 +3534,12 @@ main(int argc, char **argv)
 		/* cannot overflow: weight is 32b, total_weight 64b */
 		total_weight += sql_script[i].weight;
 
+	if (total_weight == 0)
+	{
+		fprintf(stderr, "total weight for scripts (-b and -f options) cannot be zero\n");
+		exit(1);
+	}
+
 	/* show per script stats if several scripts are used */
 	if (num_scripts > 1)
 		per_script_stats = true;

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] pgbench stats per script & other stuff

2016-03-20 Thread Fabien COELHO



Hello Jeff,


So I wanted to do something like:

for f in `seq 0 5 100`; do
 pgbench -T 180 -c8 -j8 -b tpcb-like@$f -b select-only@100
done;

But, I'm not allowed to specify a weight of zero.


Indeed. I did not envision such a use case, but it is quite legitimate and 
interesting! I would hope that the behavior would be a linear combination 
of the raw performance of each script, but whether it is indeed the case 
is not that sure.



Would this be a welcome change?


Speaking for myself, I would be fine with such a change, provided:

 - that it does work:-) I'm not sure what happens by the script selection
   process, it should be checked carefully because it was not designed
   with allowing a zero weight, and it may depend on its/their positions.
   It may already work, but it really needs checking.

 - I would suggest that a warning is shown when a weight is zero,
   something like "warning, script #%d weight is zero, will be ignored".

 - the documentation should be updated:-)

--
Fabien


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] pgbench stats per script & other stuff

2016-03-19 Thread Jeff Janes

On Fri, Jul 17, 2015 at 6:50 AM, Fabien  wrote:
>
> This patch adds per-script statistics & other improvements to pgbench
>
> Rationale: Josh asked for the per-script stats:-)
>
> Some restructuring is done so that all stats (-l --aggregate-interval
> --progress --per-script-stats, latency & lag...) share the same structures
> and functions to accumulate data. This limits a lot the growth of pgbench
> from this patch (+17 lines).
>
> In passing, remove the distinction between internal and external scripts.
> Pgbench just execute scripts, some of them may be internal...
>
> As a side effect, all scripts can be accumulated "pgbench -B -N -S -f ..."
> would execute 4 scripts, 3 of which internal (tpc-b, simple-update,
> select-only and another externally supplied one).
>
> Also add a weight option to change the probability of choosing some scripts
> when several are available.

I was eager to use this to do some performance testing on a series of
workloads gradually transitioning from write-heavy to read-only.

So I wanted to do something like:

for f in `seq 0 5 100`; do
  pgbench -T 180 -c8 -j8 -b tpcb-like@$f -b select-only@100
done;

But, I'm not allowed to specify a weight of zero.  That means I have
to special-case the first iteration of the "for" loop where $f is
zero.  I think it would be more convenient if I was allowed to specify
a zero weight, and the script would just ignore that script.  All I
had to do to make this work is remove the check that prevents from
setting the weight to zero.  But then I would need to add in a check
that the sum of all weights is not zero, which I have done here.

We could get more complex by not adding a zero-weight script into the
array of scripts at all, rather than adding it in a way where it can
never be selected.  But then that would complicate the parsing of the
per-script stats report, when one of the scripts was no longer
reported.  I like this way better.

Would this be a welcome change?

Cheers,

Jeff

pgbench_zero_weight.patch
Description: Binary data

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] pgbench stats per script & other stuff

2016-03-19 Thread Jeff Janes

On Sat, Mar 19, 2016 at 11:34 AM, Alvaro Herrera
 wrote:
> Jeff Janes wrote:
>> On Sat, Mar 19, 2016 at 8:41 AM, Alvaro Herrera
>>  wrote:
>> > I pushed your 25, with some additional minor tweaks.  I hope I didn't
>> > break anything; please test.
>>
>> I'm now getting compiler warnings:
>>
>> gcc version 4.4.7 20120313 (Red Hat 4.4.7-16) (GCC)
>>
>>
>> pgbench.c: In function 'process_builtin':
>> pgbench.c:2765: warning: 'ps.stats.lag.sum2' is used uninitialized in
>> this function
>
> Fair complaints.  I suppose the following should fix them?


Yes, that fixes them.

Thanks,

Jeff


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] pgbench stats per script & other stuff

2016-03-19 Thread Alvaro Herrera

Jeff Janes wrote:
> On Sat, Mar 19, 2016 at 8:41 AM, Alvaro Herrera
>  wrote:
> > I pushed your 25, with some additional minor tweaks.  I hope I didn't
> > break anything; please test.
> 
> I'm now getting compiler warnings:
> 
> gcc version 4.4.7 20120313 (Red Hat 4.4.7-16) (GCC)
> 
> 
> pgbench.c: In function 'process_builtin':
> pgbench.c:2765: warning: 'ps.stats.lag.sum2' is used uninitialized in
> this function

Fair complaints.  I suppose the following should fix them?


-- 
Álvaro Herrerahttp://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
diff --git a/src/bin/pgbench/pgbench.c b/src/bin/pgbench/pgbench.c
index ecabff0..4606fb0 100644
--- a/src/bin/pgbench/pgbench.c
+++ b/src/bin/pgbench/pgbench.c
@@ -2649,11 +2649,23 @@ read_line_from_file(FILE *fd)
 }
 
 /*
+ * Initialize a ParsedScript
+ */
+static void
+initParsedScript(ParsedScript *ps, const char *desc, int alloc_num, int weight)
+{
+	ps->commands = (Command **) pg_malloc(sizeof(Command *) * alloc_num);
+	ps->desc = desc;
+	ps->weight = weight;
+	initStats(>stats, 0.0);
+}
+
+/*
  * Given a file name, read it and return its ParsedScript representation.  "-"
  * means to read stdin.
  */
 static ParsedScript
-process_file(char *filename)
+process_file(char *filename, int weight)
 {
 #define COMMANDS_ALLOC_NUM 128
 	ParsedScript ps;
@@ -2673,8 +2685,7 @@ process_file(char *filename)
 	}
 
 	alloc_num = COMMANDS_ALLOC_NUM;
-	ps.commands = (Command **) pg_malloc(sizeof(Command *) * alloc_num);
-	ps.desc = filename;
+	initParsedScript(, filename, alloc_num, weight);
 
 	lineno = 0;
 	index = 0;
@@ -2710,7 +2721,7 @@ process_file(char *filename)
 
 /* Parse the given builtin script and return the parsed representation */
 static ParsedScript
-process_builtin(BuiltinScript *bi)
+process_builtin(BuiltinScript *bi, int weight)
 {
 	int			lineno,
 index;
@@ -2720,8 +2731,7 @@ process_builtin(BuiltinScript *bi)
 	ParsedScript ps;
 
 	alloc_num = COMMANDS_ALLOC_NUM;
-	ps.desc = bi->desc;
-	ps.commands = (Command **) pg_malloc(sizeof(Command *) * alloc_num);
+	initParsedScript(, bi->desc, alloc_num, weight);
 
 	lineno = 0;
 	index = 0;
@@ -2860,7 +2870,7 @@ parseScriptWeight(const char *option, char **script)
 
 /* append a script to the list of scripts to process */
 static void
-addScript(ParsedScript script, int weight)
+addScript(ParsedScript script)
 {
 	if (script.commands == NULL || script.commands[0] == NULL)
 	{
@@ -2875,8 +2885,6 @@ addScript(ParsedScript script, int weight)
 	}
 
 	sql_script[num_scripts] = script;
-	sql_script[num_scripts].weight = weight;
-	initStats(_script[num_scripts].stats, 0.0);
 	num_scripts++;
 }
 
@@ -3251,24 +3259,24 @@ main(int argc, char **argv)
 }
 
 weight = parseScriptWeight(optarg, );
-addScript(process_builtin(findBuiltin(script)), weight);
+addScript(process_builtin(findBuiltin(script), weight));
 benchmarking_option_set = true;
 internal_script_used = true;
 break;
 
 			case 'S':
-addScript(process_builtin(findBuiltin("select-only")), 1);
+addScript(process_builtin(findBuiltin("select-only"), 1));
 benchmarking_option_set = true;
 internal_script_used = true;
 break;
 			case 'N':
-addScript(process_builtin(findBuiltin("simple-update")), 1);
+addScript(process_builtin(findBuiltin("simple-update"), 1));
 benchmarking_option_set = true;
 internal_script_used = true;
 break;
 			case 'f':
 weight = parseScriptWeight(optarg, );
-addScript(process_file(script), weight);
+addScript(process_file(script, weight));
 benchmarking_option_set = true;
 break;
 			case 'D':
@@ -3406,7 +3414,7 @@ main(int argc, char **argv)
 	/* set default script if none */
 	if (num_scripts == 0 && !is_init_mode)
 	{
-		addScript(process_builtin(findBuiltin("tpcb-like")), 1);
+		addScript(process_builtin(findBuiltin("tpcb-like"), 1));
 		benchmarking_option_set = true;
 		internal_script_used = true;
 	}

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] pgbench stats per script & other stuff

2016-03-19 Thread Fabien COELHO



Hello Álvaro,


I pushed your 25, with some additional minor tweaks.  I hope I didn't
break anything; please test.


I've made a few tests and all looks well. I guess the build farm will say 
if it does not like it.


Thanks,

--
Fabien.
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] pgbench stats per script & other stuff

2016-03-19 Thread Alvaro Herrera

I pushed your 25, with some additional minor tweaks.  I hope I didn't
break anything; please test.

Fabien COELHO wrote:

> >I don't "prefer" memory leaks -- I prefer interfaces that make sense.
> 
> C is not designed to return two things, and if it is what is needed it looks
> awkward whatever is done. The static variable trick is dirty, but it is the
> minimal fuss solution, IMO. So we are only trading awkward code against
> awkward code.

That's true.
 
> I have very little time available, so I'm trying to minimize the effort.
> I've tried "argue my point with committers", but it has proven very
> ineffective. I've switched to "do whatever is asked if it still works", but
> it is not very effective either.

I understand.  Sometimes arguing is better, if you can convince the
other person, but sometimes the other person disagrees with you or they
are just not listening.  I don't have any useful advice on what to do,
but frequently resigning to do a stupid thing because somebody suggested
it leads to bad decisions.

 
> >In fact, with ParsedScript I don't think we need to give a name to the
> >anon struct used for builtin scripts.
> 
> It is useful that it has a name so that find_builtin can return it.

So it is.  I have kept it, but I used the name BuiltinScript rather than
script_t.

> Version v25 results a script which is then passed as an argument, so it
> avoid the dynamic allocation & later free. Maybe it is better. I had to cut
> short the error handling if a file does not exists, though, and it passes a
> struct by value.

Passing structs by value should work fine, and I don't care much about
the case that a file doesn't exist.

> >No need for N_BUILTIN; we can use lengthof(builtin_script) instead.
> 
> Indeed. "lengthof" does not seem to be standard... ok, it is a macro in some
> header file. I really wanted to avoid an ugly sizeof divide hack, but as it
> is hidden elsewhere this is fine.

Right.

-- 
Álvaro Herrerahttp://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] pgbench stats per script & other stuff

2016-03-16 Thread Fabien COELHO



Hello Alvaro,


If somebody specifies thousands of -f switches, they will waste a few
bytes with each, but I'm hardly concerned about a few dozen kilobytes
there ...


Ok, so you prefer a memory leak. I hate it on principle.


I don't "prefer" memory leaks -- I prefer interfaces that make sense.


C is not designed to return two things, and if it is what is needed it 
looks awkward whatever is done. The static variable trick is dirty, but it 
is the minimal fuss solution, IMO. So we are only trading awkward code 
against awkward code.



Speaking of which, I don't think the arrangement in your patch really
does.  I know I suggested it,


Yep:-)

but now that I look again, it turns out I chose badly and you 
implemented a bad idea, so can we go back and fix it, please?


Yep.

I have very little time available, so I'm trying to minimize the effort. 
I've tried "argue my point with committers", but it has proven very 
ineffective. I've switched to "do whatever is asked if it still works", 
but it is not very effective either.



What I now think should really happen is that the current sql_scripts
array, currently under an anonymous struct, should be a typedef, say
ParsedScript,


Why not.


and get a new member for the weight;


Hm... it already contains "weight".

process_file and process_builtin return a ParsedScript.  The weight and 
Command ** should not be part of script_t at all.


Sure.

In fact, with ParsedScript I don't think we need to give a name to the 
anon struct used for builtin scripts.


It is useful that it has a name so that find_builtin can return it.


Rename the current sql_scripts.name to "desc", to mirror what
is actually put in there from the builtin array struct.  Make addScript
receive a ParsedScript and weight, fill in the weight into the struct,
and put it to the array after sanity-checking.  (I'm OK with keeping
"name" instead of renaming to "desc", if that change becomes too
invasive.)


See attached a v24 & v25.

The awkwardness in v24 is that functions allocate a struct which is freed 
afterwards, really just to return two data. Whether it is better or worst 
than a static is really a matter of taste.


Version v25 results a script which is then passed as an argument, so it 
avoid the dynamic allocation & later free. Maybe it is better. I had to 
cut short the error handling if a file does not exists, though, and it 
passes a struct by value.


Feel free to pick whichever you like most.


No need for N_BUILTIN; we can use lengthof(builtin_script) instead.


Indeed. "lengthof" does not seem to be standard... ok, it is a macro in 
some header file. I really wanted to avoid an ugly sizeof divide hack, but 
as it is hidden elsewhere this is fine.


--
Fabien.diff --git a/doc/src/sgml/ref/pgbench.sgml b/doc/src/sgml/ref/pgbench.sgml
index cc80b3f..dd3fb1d 100644
--- a/doc/src/sgml/ref/pgbench.sgml
+++ b/doc/src/sgml/ref/pgbench.sgml
@@ -262,11 +262,13 @@ pgbench  options  dbname
 
 
  
-  -b scriptname
-  --builtin scriptname
+  -b scriptname[@weight]
+  --builtin=scriptname[@weight]
   

 Add the specified builtin script to the list of executed scripts.
+An optional integer weight after @ allows to adjust the
+probability of drawing the script.  If not specified, it is set to 1.
 Available builtin scripts are: tpcb-like,
 simple-update and select-only.
 Unambiguous prefixes of builtin names are accepted.
@@ -322,12 +324,14 @@ pgbench  options  dbname
  
 
  
-  -f filename
-  --file=filename
+  -f filename[@weight]
+  --file=filename[@weight]
   

 Add a transaction script read from filename to
 the list of executed scripts.
+An optional integer weight after @ allows to adjust the
+probability of drawing the test.
 See below for details.

   
@@ -687,9 +691,13 @@ pgbench  options  dbname
   What is the Transaction Actually Performed in pgbench?
 
   
-   Pgbench executes test scripts chosen randomly from a specified list.
+   pgbench executes test scripts chosen randomly
+   from a specified list.
They include built-in scripts with -b and
user-provided custom scripts with -f.
+   Each script may be given a relative weight specified after a
+   @ so as to change its drawing probability.
+   The default weight is 1.
  
 
   
@@ -1194,12 +1202,11 @@ number of clients: 10
 number of threads: 1
 number of transactions per client: 1000
 number of transactions actually processed: 1/1
+latency average = 15.844 ms
+latency stddev = 2.715 ms
 tps = 618.764555 (including connections establishing)
 tps = 622.977698 (excluding connections establishing)
-SQL script 1: builtin: TPC-B (sort of)
- - 1 transactions (100.0% of total, tps = 618.764555)
- - latency average = 15.844 ms
- - latency stddev = 2.715 ms
+script statistics:
  - statement latencies in milliseconds:
 0.004386\set

Re: [HACKERS] pgbench stats per script & other stuff

2016-03-15 Thread Alvaro Herrera

Fabien COELHO wrote:

> >If somebody specifies thousands of -f switches, they will waste a few
> >bytes with each, but I'm hardly concerned about a few dozen kilobytes
> >there ...
> 
> Ok, so you prefer a memory leak. I hate it on principle.

I don't "prefer" memory leaks -- I prefer interfaces that make sense.
Speaking of which, I don't think the arrangement in your patch really
does.  I know I suggested it, but now that I look again, it turns out I
chose badly and you implemented a bad idea, so can we go back and fix
it, please?

What I now think should really happen is that the current sql_scripts
array, currently under an anonymous struct, should be a typedef, say
ParsedScript, and get a new member for the weight; process_file and
process_builtin return a ParsedScript.  The weight and Command ** should
not be part of script_t at all.  In fact, with ParsedScript I don't
think we need to give a name to the anon struct used for builtin
scripts.  Rename the current sql_scripts.name to "desc", to mirror what
is actually put in there from the builtin array struct.  Make addScript
receive a ParsedScript and weight, fill in the weight into the struct,
and put it to the array after sanity-checking.  (I'm OK with keeping
"name" instead of renaming to "desc", if that change becomes too
invasive.)

No need for N_BUILTIN; we can use lengthof(builtin_script) instead.

-- 
Álvaro Herrerahttp://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] pgbench stats per script & other stuff

2016-03-15 Thread David Steele

On 3/4/16 1:53 PM, Fabien COELHO wrote:

>>> That is why the "fs" variable in process_file is declared "static",
>>> and why
>>> I wrote "some hidden awkwarness".
>>>
>>> I did want to avoid a malloc because then who would free the struct?
>>> addScript cannot to it systematically because builtins are static. Or it
>>> would have to create an on purpose struct, but I then that would be more
>>> awkwarness, and malloc/free to pass arguments between functions is not
>>> efficient nor very elegant.
>>>
>>> So the "static" option looked like the simplest & most elegant version.
>>
>> Surely that trick breaks if you have more than one -f switch, no?  Oh, I
>> see what you're doing: you only use the command list, which is
>> allocated, so it doesn't matter that the rest of the struct changes
>> later.
> 
> The two fields that matter (desc and commands) are really copied into
> sql_scripts, so what stays in the is overriden if used another time.
> 
>> I'm not concerned about freeing the struct; what's the problem with it
>> surviving until the program terminates?
> 
> It is not referenced anywhere so it is a memory leak.
> 
>> If somebody specifies thousands of -f switches, they will waste a few
>> bytes with each, but I'm hardly concerned about a few dozen kilobytes
>> there ...
> 
> Ok, so you prefer a memory leak. I hate it on principle.
> 
> Here is a v23 with a memory leak anyway.

Álvaro, it looks like you've been both reviewer and committer on this
work for some time.

The latest patch seems to address you final concern.  Can I mark it
"ready for committer"?

-- 
-David
da...@pgmasters.net


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] pgbench stats per script & other stuff

2016-03-04 Thread Fabien COELHO




That is why the "fs" variable in process_file is declared "static", and why
I wrote "some hidden awkwarness".

I did want to avoid a malloc because then who would free the struct?
addScript cannot to it systematically because builtins are static. Or it
would have to create an on purpose struct, but I then that would be more
awkwarness, and malloc/free to pass arguments between functions is not
efficient nor very elegant.

So the "static" option looked like the simplest & most elegant version.


Surely that trick breaks if you have more than one -f switch, no?  Oh, I
see what you're doing: you only use the command list, which is
allocated, so it doesn't matter that the rest of the struct changes
later.


The two fields that matter (desc and commands) are really copied into 
sql_scripts, so what stays in the is overriden if used another time.



I'm not concerned about freeing the struct; what's the problem with it
surviving until the program terminates?


It is not referenced anywhere so it is a memory leak.

If somebody specifies thousands of -f switches, they will waste a few 
bytes with each, but I'm hardly concerned about a few dozen kilobytes 
there ...


Ok, so you prefer a memory leak. I hate it on principle.

Here is a v23 with a memory leak anyway.

--
Fabien.diff --git a/doc/src/sgml/ref/pgbench.sgml b/doc/src/sgml/ref/pgbench.sgml
index cc80b3f..dd3fb1d 100644
--- a/doc/src/sgml/ref/pgbench.sgml
+++ b/doc/src/sgml/ref/pgbench.sgml
@@ -262,11 +262,13 @@ pgbench  options  dbname
 
 
  
-  -b scriptname
-  --builtin scriptname
+  -b scriptname[@weight]
+  --builtin=scriptname[@weight]
   

 Add the specified builtin script to the list of executed scripts.
+An optional integer weight after @ allows to adjust the
+probability of drawing the script.  If not specified, it is set to 1.
 Available builtin scripts are: tpcb-like,
 simple-update and select-only.
 Unambiguous prefixes of builtin names are accepted.
@@ -322,12 +324,14 @@ pgbench  options  dbname
  
 
  
-  -f filename
-  --file=filename
+  -f filename[@weight]
+  --file=filename[@weight]
   

 Add a transaction script read from filename to
 the list of executed scripts.
+An optional integer weight after @ allows to adjust the
+probability of drawing the test.
 See below for details.

   
@@ -687,9 +691,13 @@ pgbench  options  dbname
   What is the Transaction Actually Performed in pgbench?
 
   
-   Pgbench executes test scripts chosen randomly from a specified list.
+   pgbench executes test scripts chosen randomly
+   from a specified list.
They include built-in scripts with -b and
user-provided custom scripts with -f.
+   Each script may be given a relative weight specified after a
+   @ so as to change its drawing probability.
+   The default weight is 1.
  
 
   
@@ -1194,12 +1202,11 @@ number of clients: 10
 number of threads: 1
 number of transactions per client: 1000
 number of transactions actually processed: 1/1
+latency average = 15.844 ms
+latency stddev = 2.715 ms
 tps = 618.764555 (including connections establishing)
 tps = 622.977698 (excluding connections establishing)
-SQL script 1: builtin: TPC-B (sort of)
- - 1 transactions (100.0% of total, tps = 618.764555)
- - latency average = 15.844 ms
- - latency stddev = 2.715 ms
+script statistics:
  - statement latencies in milliseconds:
 0.004386\set nbranches 1 * :scale
 0.001343\set ntellers 10 * :scale
diff --git a/src/bin/pgbench/pgbench.c b/src/bin/pgbench/pgbench.c
index 8b0b17a..5363648 100644
--- a/src/bin/pgbench/pgbench.c
+++ b/src/bin/pgbench/pgbench.c
@@ -38,6 +38,7 @@
 #include "portability/instr_time.h"
 
 #include 
+#include 
 #include 
 #include 
 #include 
@@ -179,6 +180,8 @@ char	   *login = NULL;
 char	   *dbName;
 const char *progname;
 
+#define WSEP '@'/* weight separator */
+
 volatile bool timer_exceeded = false;	/* flag from signal handler */
 
 /* variable definitions */
@@ -300,23 +303,27 @@ typedef struct
 static struct
 {
 	const char *name;
+	int			weight;
 	Command   **commands;
 	StatsData stats;
 }	sql_script[MAX_SCRIPTS];	/* SQL script files */
 static int	num_scripts;		/* number of scripts in sql_script[] */
 static int	num_commands = 0;	/* total number of Command structs */
+static int64 total_weight = 0;
+
 static int	debug = 0;			/* debug flag */
 
 /* Define builtin test scripts */
-#define N_BUILTIN 3
-static struct
+typedef struct script_t
 {
 	char	   *name;			/* very short name for -b ... */
 	char	   *desc;			/* short description */
-	char	   *commands;		/* actual pgbench script */
-}
+	char	   *script;			/* actual pgbench script */
+	Command   **commands; 		/* temporary intermediate holder */
+} script_t;
 
-			builtin_script[] =
+#define N_BUILTIN 3
+static script_t builtin_script[] =
 {
 	{
 		"tpcb-like",

Re: [HACKERS] pgbench stats per script & other stuff

2016-03-04 Thread Alvaro Herrera

Fabien COELHO wrote:

> >However, this is still a bit broken -- you cannot return a stack
> >variable from process_file, because the stack goes away once the
> >function returns.  You need to malloc it.
> 
> That is why the "fs" variable in process_file is declared "static", and why
> I wrote "some hidden awkwarness".
> 
> I did want to avoid a malloc because then who would free the struct?
> addScript cannot to it systematically because builtins are static. Or it
> would have to create an on purpose struct, but I then that would be more
> awkwarness, and malloc/free to pass arguments between functions is not
> efficient nor very elegant.
> 
> So the "static" option looked like the simplest & most elegant version.

Surely that trick breaks if you have more than one -f switch, no?  Oh, I
see what you're doing: you only use the command list, which is
allocated, so it doesn't matter that the rest of the struct changes
later.  That seems rather nasty to me -- I'd avoid that.

I'm not concerned about freeing the struct; what's the problem with it
surviving until the program terminates?  If somebody specifies thousands
of -f switches, they will waste a few bytes with each, but I'm hardly
concerned about a few dozen kilobytes there ...

-- 
Álvaro Herrerahttp://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] pgbench stats per script & other stuff

2016-03-04 Thread Fabien COELHO




  *-21.patch does what you suggested above, some hidden awkwardness
 but much less that the previous one.


Yeah, I think this is much nicer, don't you agree?


Yep, I said "less awkwarness than previous", a pretty contrived way to say 
better:-)



However, this is still a bit broken -- you cannot return a stack
variable from process_file, because the stack goes away once the
function returns.  You need to malloc it.


That is why the "fs" variable in process_file is declared "static", and 
why I wrote "some hidden awkwarness".


I did want to avoid a malloc because then who would free the struct? 
addScript cannot to it systematically because builtins are static. Or it 
would have to create an on purpose struct, but I then that would be more 
awkwarness, and malloc/free to pass arguments between functions is not 
efficient nor very elegant.


So the "static" option looked like the simplest & most elegant version.


Also, you forgot to update the comments in process_file,
process_builtin, etc.


Indeed. v22 attached with better comments.

--
Fabien.diff --git a/doc/src/sgml/ref/pgbench.sgml b/doc/src/sgml/ref/pgbench.sgml
index cc80b3f..dd3fb1d 100644
--- a/doc/src/sgml/ref/pgbench.sgml
+++ b/doc/src/sgml/ref/pgbench.sgml
@@ -262,11 +262,13 @@ pgbench  options  dbname
 
 
  
-  -b scriptname
-  --builtin scriptname
+  -b scriptname[@weight]
+  --builtin=scriptname[@weight]
   

 Add the specified builtin script to the list of executed scripts.
+An optional integer weight after @ allows to adjust the
+probability of drawing the script.  If not specified, it is set to 1.
 Available builtin scripts are: tpcb-like,
 simple-update and select-only.
 Unambiguous prefixes of builtin names are accepted.
@@ -322,12 +324,14 @@ pgbench  options  dbname
  
 
  
-  -f filename
-  --file=filename
+  -f filename[@weight]
+  --file=filename[@weight]
   

 Add a transaction script read from filename to
 the list of executed scripts.
+An optional integer weight after @ allows to adjust the
+probability of drawing the test.
 See below for details.

   
@@ -687,9 +691,13 @@ pgbench  options  dbname
   What is the Transaction Actually Performed in pgbench?
 
   
-   Pgbench executes test scripts chosen randomly from a specified list.
+   pgbench executes test scripts chosen randomly
+   from a specified list.
They include built-in scripts with -b and
user-provided custom scripts with -f.
+   Each script may be given a relative weight specified after a
+   @ so as to change its drawing probability.
+   The default weight is 1.
  
 
   
@@ -1194,12 +1202,11 @@ number of clients: 10
 number of threads: 1
 number of transactions per client: 1000
 number of transactions actually processed: 1/1
+latency average = 15.844 ms
+latency stddev = 2.715 ms
 tps = 618.764555 (including connections establishing)
 tps = 622.977698 (excluding connections establishing)
-SQL script 1: builtin: TPC-B (sort of)
- - 1 transactions (100.0% of total, tps = 618.764555)
- - latency average = 15.844 ms
- - latency stddev = 2.715 ms
+script statistics:
  - statement latencies in milliseconds:
 0.004386\set nbranches 1 * :scale
 0.001343\set ntellers 10 * :scale
diff --git a/src/bin/pgbench/pgbench.c b/src/bin/pgbench/pgbench.c
index 8b0b17a..d7af86e 100644
--- a/src/bin/pgbench/pgbench.c
+++ b/src/bin/pgbench/pgbench.c
@@ -38,6 +38,7 @@
 #include "portability/instr_time.h"
 
 #include 
+#include 
 #include 
 #include 
 #include 
@@ -179,6 +180,8 @@ char	   *login = NULL;
 char	   *dbName;
 const char *progname;
 
+#define WSEP '@'/* weight separator */
+
 volatile bool timer_exceeded = false;	/* flag from signal handler */
 
 /* variable definitions */
@@ -300,23 +303,27 @@ typedef struct
 static struct
 {
 	const char *name;
+	int			weight;
 	Command   **commands;
 	StatsData stats;
 }	sql_script[MAX_SCRIPTS];	/* SQL script files */
 static int	num_scripts;		/* number of scripts in sql_script[] */
 static int	num_commands = 0;	/* total number of Command structs */
+static int64 total_weight = 0;
+
 static int	debug = 0;			/* debug flag */
 
 /* Define builtin test scripts */
-#define N_BUILTIN 3
-static struct
+typedef struct script_t
 {
 	char	   *name;			/* very short name for -b ... */
 	char	   *desc;			/* short description */
-	char	   *commands;		/* actual pgbench script */
-}
+	char	   *script;			/* actual pgbench script */
+	Command   **commands; 		/* temporary intermediate holder */
+} script_t;
 
-			builtin_script[] =
+#define N_BUILTIN 3
+static script_t builtin_script[] =
 {
 	{
 		"tpcb-like",
@@ -334,7 +341,8 @@ static struct
 		"UPDATE pgbench_tellers SET tbalance = tbalance + :delta WHERE tid = :tid;\n"
 		"UPDATE pgbench_branches SET bbalance = bbalance + :delta WHERE bid = :bid;\n"
 		"INSERT INTO

Re: [HACKERS] pgbench stats per script & other stuff

2016-03-04 Thread Alvaro Herrera

Fabien COELHO wrote:

Hi,

>   *-21.patch does what you suggested above, some hidden awkwardness
>  but much less that the previous one.

Yeah, I think this is much nicer, don't you agree?

However, this is still a bit broken -- you cannot return a stack
variable from process_file, because the stack goes away once the
function returns.  You need to malloc it.

Also, you forgot to update the comments in process_file,
process_builtin, etc.

-- 
Álvaro Herrerahttp://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] pgbench stats per script & other stuff

2016-03-04 Thread Fabien COELHO



Hello Alvaro,


I looked at 19.d and I think the design has gotten pretty convoluted.  I
think we could simplify with the following changes:

struct script_t gets a new member, of type Command **, which is
initially null.

function process_builtin receives the complete script_t (not individual
memebers of it) constructs the Command ** array and puts it in
script_t's new member; return value is the same script_t struct it got
(except it's now augmented with the Command **array).

function process_file constructs a new script_t from the string list,
creates its Command **array just like process_builtin and returns the
constructed struct.

function addScript receives script_t instead of individual members of
it, and does the appropriate thing.


Why not. Here are two versions:

  *-20.patch is the initial rebased version

  *-21.patch does what you suggested above, some hidden awkwardness
 but much less that the previous one.

--
Fabiendiff --git a/doc/src/sgml/ref/pgbench.sgml b/doc/src/sgml/ref/pgbench.sgml
index cc80b3f..dd3fb1d 100644
--- a/doc/src/sgml/ref/pgbench.sgml
+++ b/doc/src/sgml/ref/pgbench.sgml
@@ -262,11 +262,13 @@ pgbench  options  dbname
 
 
  
-  -b scriptname
-  --builtin scriptname
+  -b scriptname[@weight]
+  --builtin=scriptname[@weight]
   

 Add the specified builtin script to the list of executed scripts.
+An optional integer weight after @ allows to adjust the
+probability of drawing the script.  If not specified, it is set to 1.
 Available builtin scripts are: tpcb-like,
 simple-update and select-only.
 Unambiguous prefixes of builtin names are accepted.
@@ -322,12 +324,14 @@ pgbench  options  dbname
  
 
  
-  -f filename
-  --file=filename
+  -f filename[@weight]
+  --file=filename[@weight]
   

 Add a transaction script read from filename to
 the list of executed scripts.
+An optional integer weight after @ allows to adjust the
+probability of drawing the test.
 See below for details.

   
@@ -687,9 +691,13 @@ pgbench  options  dbname
   What is the Transaction Actually Performed in pgbench?
 
   
-   Pgbench executes test scripts chosen randomly from a specified list.
+   pgbench executes test scripts chosen randomly
+   from a specified list.
They include built-in scripts with -b and
user-provided custom scripts with -f.
+   Each script may be given a relative weight specified after a
+   @ so as to change its drawing probability.
+   The default weight is 1.
  
 
   
@@ -1194,12 +1202,11 @@ number of clients: 10
 number of threads: 1
 number of transactions per client: 1000
 number of transactions actually processed: 1/1
+latency average = 15.844 ms
+latency stddev = 2.715 ms
 tps = 618.764555 (including connections establishing)
 tps = 622.977698 (excluding connections establishing)
-SQL script 1: builtin: TPC-B (sort of)
- - 1 transactions (100.0% of total, tps = 618.764555)
- - latency average = 15.844 ms
- - latency stddev = 2.715 ms
+script statistics:
  - statement latencies in milliseconds:
 0.004386\set nbranches 1 * :scale
 0.001343\set ntellers 10 * :scale
diff --git a/src/bin/pgbench/pgbench.c b/src/bin/pgbench/pgbench.c
index 8b0b17a..c131681 100644
--- a/src/bin/pgbench/pgbench.c
+++ b/src/bin/pgbench/pgbench.c
@@ -38,6 +38,7 @@
 #include "portability/instr_time.h"
 
 #include 
+#include 
 #include 
 #include 
 #include 
@@ -179,6 +180,8 @@ char	   *login = NULL;
 char	   *dbName;
 const char *progname;
 
+#define WSEP '@'/* weight separator */
+
 volatile bool timer_exceeded = false;	/* flag from signal handler */
 
 /* variable definitions */
@@ -300,23 +303,26 @@ typedef struct
 static struct
 {
 	const char *name;
+	int			weight;
 	Command   **commands;
 	StatsData stats;
 }	sql_script[MAX_SCRIPTS];	/* SQL script files */
 static int	num_scripts;		/* number of scripts in sql_script[] */
 static int	num_commands = 0;	/* total number of Command structs */
+static int64 total_weight = 0;
+
 static int	debug = 0;			/* debug flag */
 
 /* Define builtin test scripts */
-#define N_BUILTIN 3
-static struct
+typedef struct script_t
 {
 	char	   *name;			/* very short name for -b ... */
 	char	   *desc;			/* short description */
 	char	   *commands;		/* actual pgbench script */
-}
+} script_t;
 
-			builtin_script[] =
+#define N_BUILTIN 3
+static script_t builtin_script[] =
 {
 	{
 		"tpcb-like",
@@ -392,9 +398,9 @@ usage(void)
 	 "  --tablespace=TABLESPACE  create tables in the specified tablespace\n"
 		   "  --unlogged-tablescreate tables as unlogged tables\n"
 		   "\nOptions to select what to run:\n"
-		   "  -b, --builtin=NAME   add buitin script (use \"-b list\" to display\n"
-		   "   available scripts)\n"
-		   "  -f, --file=FILENAME  add transaction script from FILENAME\n"
+		   "  -b,

Re: [HACKERS] pgbench stats per script & other stuff

2016-03-03 Thread Alvaro Herrera

Michael Paquier wrote:
> On Tue, Feb 9, 2016 at 4:22 AM, Fabien COELHO  wrote:

> > Hmmm, I type them and I'm not so good with a keyboard, so "se" is better
> > than:
> >
> > "selct-onlyect-only".
> 
> I can understand that feeling.

Pushed 19-e, thanks.

-- 
Álvaro Herrerahttp://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] pgbench stats per script & other stuff

2016-03-03 Thread Alvaro Herrera

I looked at 19.d and I think the design has gotten pretty convoluted.  I
think we could simplify with the following changes:

struct script_t gets a new member, of type Command **, which is
initially null.

function process_builtin receives the complete script_t (not individual
memebers of it) constructs the Command ** array and puts it in
script_t's new member; return value is the same script_t struct it got
(except it's now augmented with the Command **array).

function process_file constructs a new script_t from the string list,
creates its Command **array just like process_builtin and returns the
constructed struct.

function addScript receives script_t instead of individual members of
it, and does the appropriate thing.


Alternatively, we could have a different struct that's defined to carry
only the Command ** array (not the command string array) and is returned 
by both process_builtin and process_file.  Perhaps we could also put the
script weight in there.  With this arrangement we don't need to typedef
script_t at all and we can just keep it as an anonymous struct as today.

This is what I tried to describe earlier, but obviously I wasn't clear
enough.

Thanks,

-- 
Álvaro Herrerahttp://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] pgbench stats per script & other stuff

2016-02-08 Thread Michael Paquier

On Tue, Feb 9, 2016 at 4:22 AM, Fabien COELHO  wrote:
>> +   /* compute total_weight */
>> +   for (i = 0; i < num_scripts; i++)
>> +   {
>> +   total_weight += sql_script[i].weight;
>> +
>> +   /* detect overflow... */
>> If let as int64, you may want to remove this overflow check, or keep
>> it as int32.
>
>
> I'd rather keep int64, and remove the check.

OK, and you did so. Thanks.

>>> [JSON/YAML]
>>> If you want something else, it would help to provide a sample of what you
>>> expect.
>>
>> You could do that with an additional option here as well:
>> --output-format=normal|yamljson. The tastes of each user is different.
>
> I think that json/yaml-ifying pgbench output is beyond the object of this
> patch, so should be kept out?

Yeah, that's just a free idea that this set of patches does not need
to address. If someone thinks that's worth it, feel free to submit a
patch, perhaps we could add a TODO item on the wiki. Regarding the
output generated by your patch, I think that's fine. Perhaps Alvaro
has other thoughts on the matter. I don't know this part.

>>> Find attached a 18-d which addresses these concerns, and a actualized
>>> 18-e
>>> for the prefix.
>>
>>
>> (I could live without that personally)
>
> Hmmm, I type them and I'm not so good with a keyboard, so "se" is better
> than:
>
> "selct-onlyect-only".

I can understand that feeling.

>> -/* return builtin script "name", or fail if not found */
>> +/* return commands for selected builtin script, if unambiguous */
>> static script_t *
>> findBuiltin(const char *name)
>> This comment needs a refresh. This does not return a set of commands,
>> but the script itself.
>
> Indeed.
>
> Attached 19-d and 19-e.

+/* return builtin script "name", or fail if not found */
builtin does not sound like correct English to me, but built-in is.
-- 
Michael


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] pgbench stats per script & other stuff

2016-02-08 Thread Fabien COELHO



Hi Michaël,


Attached 19-d and 19-e.


+/* return builtin script "name", or fail if not found */
builtin does not sound like correct English to me, but built-in is.


I notice that "man bash" uses "builtin" extensively, so I think it is okay 
like that, but I would be fine as well with "built-in".


I suggest to let it as is unless some native speaker really requires 
"built-in", in which case there would be many places to update, so that 
would be another orthographic-oriented patch:-)


--
Fabien.
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] pgbench stats per script & other stuff

2016-02-08 Thread Michael Paquier

On Fri, Feb 5, 2016 at 12:53 AM, Fabien COELHO  wrote:
>> Something is wrong with patch d.  I noticed two things,
>> 1. the total_weight stuff can overflow,
>
> It can generate an error on overflow by checking the total_weight while it
> is being computed. I've switched total_weight to int64 so it is now really
> impossible to overflow with the 32 bit int_max limit on weight.

+   /* compute total_weight */
+   for (i = 0; i < num_scripts; i++)
+   {
+   total_weight += sql_script[i].weight;
+
+   /* detect overflow... */
+   if (total_weight < 0)
+   {
+   fprintf(stderr, "script weight overflow at
script %d\n", i+1);
+   exit(1);
+   }
+   }
If let as int64, you may want to remove this overflow check, or keep
it as int32.

>> 2. the chooseScript stuff is broken, or something.
>
> Sorry, probably a <=/< error. I think I've fixed it and I've simplified the
> code a little bit.

+   w = getrand(thread, 0, total_weight - 1);
+   do
+   {
+   w -= sql_script[i++].weight;
+   } while (w >= 0);
+
+   return i - 1;
This portion looks fine to me.

>> Another thing is that the "transaction type" output really deserves some
>> more work.  I think "multiple scripts" really doesn't cut it; we should
>> have some YAML-like as in the latency reports, which lists all scripts
>> in use and their weights.
>
> For me the current output is clear for the reader, which does not mean that
> it cannot be improve, but how is rather a matter of taste.
>
> I've tried to improve it further, see attached patch.
>
> If you want something else, it would help to provide a sample of what you
> expect.

You could do that with an additional option here as well:
--output-format=normal|yamljson. The tastes of each user is different.
>> Attached is my version of the patch.  While you're messing with it, it'd
>> be nice if you added comments on top of your recently added functions
>> such as findBuiltin, process_builtin, chooseScript.
> Why not.

const char *name;
+   int weight;
Command   **commands;
-   StatsData stats;
+   StatsData   stats;
Noise here?

> Find attached a 18-d which addresses these concerns, and a actualized 18-e
> for the prefix.

(I could live without that personally)

-/* return builtin script "name", or fail if not found */
+/* return commands for selected builtin script, if unambiguous */
 static script_t *
 findBuiltin(const char *name)
This comment needs a refresh. This does not return a set of commands,
but the script itself.
-- 
Michael


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] pgbench stats per script & other stuff

2016-02-08 Thread Fabien COELHO



Hello Michaël,


+   /* compute total_weight */
+   for (i = 0; i < num_scripts; i++)
+   {
+   total_weight += sql_script[i].weight;
+
+   /* detect overflow... */
If let as int64, you may want to remove this overflow check, or keep
it as int32.


I'd rather keep int64, and remove the check.


[JSON/YAML]
If you want something else, it would help to provide a sample of what you
expect.


You could do that with an additional option here as well:
--output-format=normal|yamljson. The tastes of each user is different.


I think that json/yaml-ifying pgbench output is beyond the object of this
patch, so should be kept out?


   const char *name;
+   int weight;
   Command   **commands;
-   StatsData stats;
+   StatsData   stats;
Noise here?


Indeed.


Find attached a 18-d which addresses these concerns, and a actualized 18-e
for the prefix.


(I could live without that personally)


Hmmm, I type them and I'm not so good with a keyboard, so "se" is better 
than:


"selct-onlyect-only".


-/* return builtin script "name", or fail if not found */
+/* return commands for selected builtin script, if unambiguous */
static script_t *
findBuiltin(const char *name)
This comment needs a refresh. This does not return a set of commands,
but the script itself.


Indeed.

Attached 19-d and 19-e.

--
Fabien.diff --git a/doc/src/sgml/ref/pgbench.sgml b/doc/src/sgml/ref/pgbench.sgml
index ade1b53..780a520 100644
--- a/doc/src/sgml/ref/pgbench.sgml
+++ b/doc/src/sgml/ref/pgbench.sgml
@@ -262,11 +262,13 @@ pgbench  options  dbname
 
 
  
-  -b scriptname
-  --builtin scriptname
+  -b scriptname[@weight]
+  --builtin=scriptname[@weight]
   

 Add the specified builtin script to the list of executed scripts.
+An optional integer weight after @ allows to adjust the
+probability of drawing the script.  If not specified, it is set to 1.
 Available builtin scripts are: tpcb-like,
 simple-update and select-only.
 With special name list, show the list of builtin scripts
@@ -321,12 +323,14 @@ pgbench  options  dbname
  
 
  
-  -f filename
-  --file=filename
+  -f filename[@weight]
+  --file=filename[@weight]
   

 Add a transaction script read from filename to
 the list of executed scripts.
+An optional integer weight after @ allows to adjust the
+probability of drawing the test.
 See below for details.

   
@@ -686,9 +690,13 @@ pgbench  options  dbname
   What is the Transaction Actually Performed in pgbench?
 
   
-   Pgbench executes test scripts chosen randomly from a specified list.
+   pgbench executes test scripts chosen randomly
+   from a specified list.
They include built-in scripts with -b and
user-provided custom scripts with -f.
+   Each script may be given a relative weight specified after a
+   @ so as to change its drawing probability.
+   The default weight is 1.
  
 
   
@@ -1135,12 +1143,11 @@ number of clients: 10
 number of threads: 1
 number of transactions per client: 1000
 number of transactions actually processed: 1/1
+latency average = 15.844 ms
+latency stddev = 2.715 ms
 tps = 618.764555 (including connections establishing)
 tps = 622.977698 (excluding connections establishing)
-SQL script 1: builtin: TPC-B (sort of)
- - 1 transactions (100.0% of total, tps = 618.764555)
- - latency average = 15.844 ms
- - latency stddev = 2.715 ms
+script statistics:
  - statement latencies in milliseconds:
 0.004386\set nbranches 1 * :scale
 0.001343\set ntellers 10 * :scale
diff --git a/src/bin/pgbench/pgbench.c b/src/bin/pgbench/pgbench.c
index 7eb6a2d..a73e289 100644
--- a/src/bin/pgbench/pgbench.c
+++ b/src/bin/pgbench/pgbench.c
@@ -38,6 +38,7 @@
 #include "portability/instr_time.h"
 
 #include 
+#include 
 #include 
 #include 
 #include 
@@ -179,6 +180,8 @@ char	   *login = NULL;
 char	   *dbName;
 const char *progname;
 
+#define WSEP '@'/* weight separator */
+
 volatile bool timer_exceeded = false;	/* flag from signal handler */
 
 /* variable definitions */
@@ -299,23 +302,26 @@ typedef struct
 static struct
 {
 	const char *name;
+	int			weight;
 	Command   **commands;
 	StatsData stats;
 }	sql_script[MAX_SCRIPTS];	/* SQL script files */
 static int	num_scripts;		/* number of scripts in sql_script[] */
 static int	num_commands = 0;	/* total number of Command structs */
+static int64 total_weight = 0;
+
 static int	debug = 0;			/* debug flag */
 
 /* Define builtin test scripts */
-#define N_BUILTIN 3
-static struct
+typedef struct script_t
 {
 	char	   *name;			/* very short name for -b ... */
 	char	   *desc;			/* short description */
 	char	   *commands;		/* actual pgbench script */
-}
+} script_t;
 
-			builtin_script[] =
+#define N_BUILTIN 3
+static script_t builtin_script[] =
 {
 	{
 		"tpcb-like",
@@

Re: [HACKERS] pgbench stats per script & other stuff

2016-02-08 Thread Alvaro Herrera

I closed this one as "committed", since we pushed a bunch of parts.
Please submit the two remaining ones to the next commitfest.

-- 
Álvaro Herrerahttp://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] pgbench stats per script & other stuff

2016-02-04 Thread Fabien COELHO



Hello Alvaro,


Something is wrong with patch d.  I noticed two things,

1. the total_weight stuff can overflow,


It can generate an error on overflow by checking the total_weight while 
it is being computed. I've switched total_weight to int64 so it is now 
really impossible to overflow with the 32 bit int_max limit on weight.



2. the chooseScript stuff is broken, or something.


Sorry, probably a <=/< error. I think I've fixed it and I've simplified 
the code a little bit.



Another thing is that the "transaction type" output really deserves some
more work.  I think "multiple scripts" really doesn't cut it; we should
have some YAML-like as in the latency reports, which lists all scripts
in use and their weights.


For me the current output is clear for the reader, which does not 
mean that it cannot be improve, but how is rather a matter of taste.


I've tried to improve it further, see attached patch.

If you want something else, it would help to provide a sample of what you 
expect.



Also, while I have your attention regarding accumulated "technical
debt", please have a look at the "desc" argument used in addScript etc.
It's pretty ridiculous currently.  Maybe findBuiltin / process_builtin /
process_file should return a struct containing Command ** and the
"desc" string, rather than passing desc as a separate argument.


Ok, it can return a pointer to the builtin script.


Attached is my version of the patch.  While you're messing with it, it'd
be nice if you added comments on top of your recently added functions
such as findBuiltin, process_builtin, chooseScript.


Why not.

Find attached a 18-d which addresses these concerns, and a actualized 18-e 
for the prefix.


--
Fabien.diff --git a/doc/src/sgml/ref/pgbench.sgml b/doc/src/sgml/ref/pgbench.sgml
index ade1b53..780a520 100644
--- a/doc/src/sgml/ref/pgbench.sgml
+++ b/doc/src/sgml/ref/pgbench.sgml
@@ -262,11 +262,13 @@ pgbench  options  dbname
 
 
  
-  -b scriptname
-  --builtin scriptname
+  -b scriptname[@weight]
+  --builtin=scriptname[@weight]
   

 Add the specified builtin script to the list of executed scripts.
+An optional integer weight after @ allows to adjust the
+probability of drawing the script.  If not specified, it is set to 1.
 Available builtin scripts are: tpcb-like,
 simple-update and select-only.
 With special name list, show the list of builtin scripts
@@ -321,12 +323,14 @@ pgbench  options  dbname
  
 
  
-  -f filename
-  --file=filename
+  -f filename[@weight]
+  --file=filename[@weight]
   

 Add a transaction script read from filename to
 the list of executed scripts.
+An optional integer weight after @ allows to adjust the
+probability of drawing the test.
 See below for details.

   
@@ -686,9 +690,13 @@ pgbench  options  dbname
   What is the Transaction Actually Performed in pgbench?
 
   
-   Pgbench executes test scripts chosen randomly from a specified list.
+   pgbench executes test scripts chosen randomly
+   from a specified list.
They include built-in scripts with -b and
user-provided custom scripts with -f.
+   Each script may be given a relative weight specified after a
+   @ so as to change its drawing probability.
+   The default weight is 1.
  
 
   
@@ -1135,12 +1143,11 @@ number of clients: 10
 number of threads: 1
 number of transactions per client: 1000
 number of transactions actually processed: 1/1
+latency average = 15.844 ms
+latency stddev = 2.715 ms
 tps = 618.764555 (including connections establishing)
 tps = 622.977698 (excluding connections establishing)
-SQL script 1: builtin: TPC-B (sort of)
- - 1 transactions (100.0% of total, tps = 618.764555)
- - latency average = 15.844 ms
- - latency stddev = 2.715 ms
+script statistics:
  - statement latencies in milliseconds:
 0.004386\set nbranches 1 * :scale
 0.001343\set ntellers 10 * :scale
diff --git a/src/bin/pgbench/pgbench.c b/src/bin/pgbench/pgbench.c
index 7eb6a2d..da4f05c 100644
--- a/src/bin/pgbench/pgbench.c
+++ b/src/bin/pgbench/pgbench.c
@@ -38,6 +38,7 @@
 #include "portability/instr_time.h"
 
 #include 
+#include 
 #include 
 #include 
 #include 
@@ -179,6 +180,8 @@ char	   *login = NULL;
 char	   *dbName;
 const char *progname;
 
+#define WSEP '@'/* weight separator */
+
 volatile bool timer_exceeded = false;	/* flag from signal handler */
 
 /* variable definitions */
@@ -299,23 +302,26 @@ typedef struct
 static struct
 {
 	const char *name;
+	int			weight;
 	Command   **commands;
-	StatsData stats;
+	StatsData	stats;
 }	sql_script[MAX_SCRIPTS];	/* SQL script files */
 static int	num_scripts;		/* number of scripts in sql_script[] */
 static int	num_commands = 0;	/* total number of Command structs */
+static int64 total_weight = 0;
+
 static int	debug = 0;			/* debug flag */
 
 /* Define builtin test

Re: [HACKERS] pgbench stats per script & other stuff

2016-02-01 Thread Fabien COELHO



Hello Michaël,

Two rebase attached.


 - 15-e: prefix selection for -b



-   if (strncmp(builtin_script[i].name, name,
-   strlen(builtin_script[i].name)) == 0)
+   if (strncmp(builtin_script[i].name, name, len) == 0)

I agree with Alvaro here: this should remain unchanged. It seems to be
a rebase mistake.


I do not understand. I tested it and it works as expected. If I put the 
above strlen instead the suffix detection does not work:



 fabien@sto:bin/pgbench> git diff
 diff --git a/src/bin/pgbench/pgbench.c b/src/bin/pgbench/pgbench.c
 index 783cbf9..0c33012 100644
 --- a/src/bin/pgbench/pgbench.c
 +++ b/src/bin/pgbench/pgbench.c
 @@ -2684,7 +2684,7 @@ findBuiltin(const char *name, char **desc)

 for (i = 0; i < N_BUILTIN; i++)
 {
 -   if (strncmp(builtin_script[i].name, name, len) == 0)
 +   if (strncmp(builtin_script[i].name, name, 
strlen(builtin_script[i].name)) == 0)
 {
 *desc = builtin_script[i].desc;
 commands = builtin_script[i].commands;


 ./pgbench -b t
 no builtin script found for name "t"
 Available builtin scripts:
 tpcb-like
 simple-update
 select-only

Indeed, then it can only match if the provided "name" is as long as the 
recorded len. The point of the "suffix" selection is to align to the short 
supplied string.


--
Fabien.diff --git a/doc/src/sgml/ref/pgbench.sgml b/doc/src/sgml/ref/pgbench.sgml
index ade1b53..ca3e158 100644
--- a/doc/src/sgml/ref/pgbench.sgml
+++ b/doc/src/sgml/ref/pgbench.sgml
@@ -262,11 +262,13 @@ pgbench  options  dbname
 
 
  
-  -b scriptname
-  --builtin scriptname
+  -b scriptname[@weight]
+  --builtin scriptname[@weight]
   

 Add the specified builtin script to the list of executed scripts.
+An optional integer weight after @ allows to adjust the
+probability of drawing the test.
 Available builtin scripts are: tpcb-like,
 simple-update and select-only.
 With special name list, show the list of builtin scripts
@@ -321,12 +323,14 @@ pgbench  options  dbname
  
 
  
-  -f filename
-  --file=filename
+  -f filename[@weight]
+  --file=filename[@weight]
   

 Add a transaction script read from filename to
 the list of executed scripts.
+An optional integer weight after @ allows to adjust the
+probability of drawing the test.
 See below for details.

   
@@ -689,6 +693,9 @@ pgbench  options  dbname
Pgbench executes test scripts chosen randomly from a specified list.
They include built-in scripts with -b and
user-provided custom scripts with -f.
+   Each script may be given a relative weight specified after a
+   @ so as to change its drawing probability.
+   The default weight is 1.
  
 
   
@@ -1137,7 +1144,7 @@ number of transactions per client: 1000
 number of transactions actually processed: 1/1
 tps = 618.764555 (including connections establishing)
 tps = 622.977698 (excluding connections establishing)
-SQL script 1: builtin: TPC-B (sort of)
+SQL script 1, weight 1: builtin: TPC-B (sort of)
  - 1 transactions (100.0% of total, tps = 618.764555)
  - latency average = 15.844 ms
  - latency stddev = 2.715 ms
diff --git a/src/bin/pgbench/pgbench.c b/src/bin/pgbench/pgbench.c
index 7eb6a2d..5d24768 100644
--- a/src/bin/pgbench/pgbench.c
+++ b/src/bin/pgbench/pgbench.c
@@ -179,6 +179,8 @@ char	   *login = NULL;
 char	   *dbName;
 const char *progname;
 
+#define WSEP '@' /* weight separator */
+
 volatile bool timer_exceeded = false;	/* flag from signal handler */
 
 /* variable definitions */
@@ -299,11 +301,14 @@ typedef struct
 static struct
 {
 	const char *name;
+	int weight;
 	Command   **commands;
 	StatsData stats;
 }	sql_script[MAX_SCRIPTS];	/* SQL script files */
 static int	num_scripts;		/* number of scripts in sql_script[] */
 static int	num_commands = 0;	/* total number of Command structs */
+static int  total_weight = 0;
+
 static int	debug = 0;			/* debug flag */
 
 /* Define builtin test scripts */
@@ -389,9 +394,9 @@ usage(void)
 	 "  --tablespace=TABLESPACE  create tables in the specified tablespace\n"
 		   "  --unlogged-tablescreate tables as unlogged tables\n"
 		   "\nOptions to select what to run:\n"
-		   "  -b, --builtin=NAME   add buitin script (use \"-b list\" to display\n"
-		   "   available scripts)\n"
-		   "  -f, --file=FILENAME  add transaction script from FILENAME\n"
+		   "  -b, --builtin=NAME[@W]   add weighted buitin script (use \"-b list\"\n"
+		   "   to display available scripts)\n"
+		   "  -f, --file=FILENAME[@W]  add weighted transaction script from FILENAME\n"
 		   "  -N, --skip-some-updates  skip updates of pgbench_tellers and pgbench_branches\n"
 		   "

Re: [HACKERS] pgbench stats per script & other stuff

2016-01-31 Thread Michael Paquier

On Fri, Jan 29, 2016 at 11:28 PM, Fabien COELHO  wrote:
> Here is a rebase of the 3 remaining parts:
>  - 15-c: per script stats
>  - 15-d: weighted scripts
>  - 15-e: prefix selection for -b

Regarding patch d.
+   /* compute total_weight */
+   for (i = 0; i < num_scripts; i++)
+   total_weight += sql_script[i].weight;
total_weight can overflow :) I don't think that's worth worrying, I am
just noticing that.

+The provided scriptname needs only be an unambiguous
+prefix of the builtin name, hence si would be enough to
+select simple-update.
[...]
-   if (strncmp(builtin_script[i].name, name,
-   strlen(builtin_script[i].name)) == 0)
+   if (strncmp(builtin_script[i].name, name, len) == 0)
I agree with Alvaro here: this should remain unchanged. It seems to be
a rebase mistake.
-- 
Michael


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] pgbench stats per script & other stuff

2016-01-29 Thread Fabien COELHO



Hello Alvaro,

Thanks for the progress!


I pushed this, along with a few more tweaks, mostly adding comments and
moving functions so that related things are together.  I hope I didn't
break anything.


Looks ok.

Here is a rebase of the 3 remaining parts:
 - 15-c: per script stats
 - 15-d: weighted scripts
 - 15-e: prefix selection for -b

--
Fabien.diff --git a/doc/src/sgml/ref/pgbench.sgml b/doc/src/sgml/ref/pgbench.sgml
index 42d0667..ade1b53 100644
--- a/doc/src/sgml/ref/pgbench.sgml
+++ b/doc/src/sgml/ref/pgbench.sgml
@@ -1138,6 +1138,9 @@ number of transactions actually processed: 1/1
 tps = 618.764555 (including connections establishing)
 tps = 622.977698 (excluding connections establishing)
 SQL script 1: builtin: TPC-B (sort of)
+ - 1 transactions (100.0% of total, tps = 618.764555)
+ - latency average = 15.844 ms
+ - latency stddev = 2.715 ms
  - statement latencies in milliseconds:
 0.004386\set nbranches 1 * :scale
 0.001343\set ntellers 10 * :scale
diff --git a/src/bin/pgbench/pgbench.c b/src/bin/pgbench/pgbench.c
index 44da3d1..64c7a6c 100644
--- a/src/bin/pgbench/pgbench.c
+++ b/src/bin/pgbench/pgbench.c
@@ -164,6 +164,7 @@ bool		use_log;			/* log transaction latencies to a file */
 bool		use_quiet;			/* quiet logging onto stderr */
 int			agg_interval;		/* log aggregates instead of individual
  * transactions */
+boolper_script_stats = false; /* whether to collect stats per script */
 int			progress = 0;		/* thread progress report every this seconds */
 bool		progress_timestamp = false; /* progress report with Unix time */
 int			nclients = 1;		/* number of clients */
@@ -299,6 +300,7 @@ static struct
 {
 	const char *name;
 	Command   **commands;
+	StatsData stats;
 }	sql_script[MAX_SCRIPTS];	/* SQL script files */
 static int	num_scripts;		/* number of scripts in sql_script[] */
 static int	num_commands = 0;	/* total number of Command structs */
@@ -1326,7 +1328,7 @@ top:
 		/* transaction finished: calculate latency and log the transaction */
 		if (commands[st->state + 1] == NULL)
 		{
-			if (progress || throttle_delay || latency_limit || logfile)
+			if (progress || throttle_delay || latency_limit || per_script_stats || logfile)
 processXactStats(thread, st, , false, logfile, agg);
 			else
 thread->stats.cnt++;
@@ -1419,7 +1421,7 @@ top:
 	}
 
 	/* Record transaction start time under logging, progress or throttling */
-	if ((logfile || progress || throttle_delay || latency_limit) && st->state == 0)
+	if ((logfile || progress || throttle_delay || latency_limit || per_script_stats) && st->state == 0)
 	{
 		INSTR_TIME_SET_CURRENT(st->txn_begin);
 
@@ -1872,6 +1874,9 @@ processXactStats(TState *thread, CState *st, instr_time *now,
 
 	if (use_log)
 		doLog(thread, st, logfile, now, agg, skipped, latency, lag);
+
+	if (per_script_stats) /* mutex? hmmm... these are only statistics */
+		accumStats(& sql_script[st->use_file].stats, skipped, latency, lag);
 }
 
 
@@ -2661,6 +2666,7 @@ addScript(const char *name, Command **commands)
 
 	sql_script[num_scripts].name = name;
 	sql_script[num_scripts].commands = commands;
+	initStats(& sql_script[num_scripts].stats, 0.0);
 	num_scripts++;
 }
 
@@ -2744,22 +2750,40 @@ printResults(TState *threads, StatsData *total, instr_time total_time,
 	printf("tps = %f (including connections establishing)\n", tps_include);
 	printf("tps = %f (excluding connections establishing)\n", tps_exclude);
 
-	/* Report per-command latencies */
-	if (is_latencies)
+	/* Report per-script stats */
+	if (per_script_stats)
 	{
 		int			i;
 
 		for (i = 0; i < num_scripts; i++)
 		{
-			Command   **commands;
+			printf("SQL script %d: %s\n"
+   " - "INT64_FORMAT" transactions (%.1f%% of total, tps = %f)\n",
+   i + 1, sql_script[i].name,
+   sql_script[i].stats.cnt,
+   100.0 * sql_script[i].stats.cnt / total->cnt,
+   sql_script[i].stats.cnt / time_include);
 
-			printf("SQL script %d: %s\n", i + 1, sql_script[i].name);
-			printf(" - statement latencies in milliseconds:\n");
+			if (latency_limit)
+printf(" - number of transactions skipped: "INT64_FORMAT" (%.3f%%)\n",
+	   sql_script[i].stats.skipped,
+	   100.0 * sql_script[i].stats.skipped /
+	   (sql_script[i].stats.skipped + sql_script[i].stats.cnt));
 
-			for (commands = sql_script[i].commands; *commands != NULL; commands++)
-printf("   %11.3f  %s\n",
-  1000.0 * (*commands)->stats.sum / (*commands)->stats.count,
-	   (*commands)->line);
+			printSimpleStats(" - latency", & sql_script[i].stats.latency);
+
+			/* Report per-command latencies */
+			if (is_latencies)
+			{
+Command ** com;
+
+printf(" - statement latencies in milliseconds:\n");
+
+for (com = sql_script[i].commands; *com != NULL; com++)
+	printf("   %11.3f  %s\n",
+		   1000.0 * (*com)->stats.sum / (*com)->stats.count,
+		   (*com)->line);
+			}
 		}
 	}
 }
@@ -2945,6 +2969,7 @@ main(int argc, char **argv)

Re: [HACKERS] pgbench stats per script & other stuff

2016-01-29 Thread Alvaro Herrera

Fabien COELHO wrote:

> The answer is essentially yes, the field is needed for the "aggregated" mode
> where this specific behavior is used.

OK, thanks, that looks better to me.

Can you now appreciate why I asked for split patches?  If I had to go
over the big patch I probably wouldn't have been able to read through
each to make sense of it.

I pushed this, along with a few more tweaks, mostly adding comments and
moving functions so that related things are together.  I hope I didn't
break anything.

-- 
Álvaro Herrerahttp://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] pgbench stats per script & other stuff

2016-01-27 Thread Fabien COELHO



Hello again,


Obviously this would work. I did not think the special case was worth the
extra argument. This one has some oddity too, because the second argument is
ignored depending on the third. Do as you feel.


Actually my question was whether keeping the original start_time was the
intended design.


Sorry I misunderstood the question.

The answer is essentially yes, the field is needed for the "aggregated" 
mode where this specific behavior is used.


However, after some look at the code I think that it is possible to do 
without.


I also spotted an small issue under low tps where the last aggregation was 
not shown.


With the attached version these problems have been removed, no conditional 
initialization. There is also a small diff with the version you sent.


--
Fabien.diff --git a/src/bin/pgbench/pgbench.c b/src/bin/pgbench/pgbench.c
index d5f242c..b3fe994 100644
--- a/src/bin/pgbench/pgbench.c
+++ b/src/bin/pgbench/pgbench.c
@@ -166,10 +166,8 @@ int			agg_interval;		/* log aggregates instead of individual
  * transactions */
 int			progress = 0;		/* thread progress report every this seconds */
 bool		progress_timestamp = false; /* progress report with Unix time */
-int			progress_nclients = 0;		/* number of clients for progress
-		 * report */
-int			progress_nthreads = 0;		/* number of threads for progress
-		 * report */
+int			nclients = 1;		/* number of clients */
+int			nthreads = 1;		/* number of threads */
 bool		is_connect;			/* establish connection for each transaction */
 bool		is_latencies;		/* report per-command latencies */
 int			main_pid;			/* main process id used in log filename */
@@ -193,6 +191,35 @@ typedef struct
 #define SHELL_COMMAND_SIZE	256 /* maximum size allowed for shell command */
 
 /*
+ * Simple data structure to keep stats about something.
+ *
+ * XXX probably the first value should be kept and used as an offset for
+ * better numerical stability...
+ */
+typedef struct
+{
+	int64		count;			/* how many values were encountered */
+	double		min;			/* the minimum seen */
+	double		max;			/* the maximum seen */
+	double		sum;			/* sum of values */
+	double		sum2;			/* sum of squared values */
+} SimpleStats;
+
+/*
+ * Data structure to hold various statistics, used for interval statistics as
+ * well as file statistics.
+ */
+typedef struct
+{
+	long		start_time;		/* interval start time, for aggregates */
+	int64		cnt;			/* number of transactions */
+	int64		skipped;		/* number of transactions skipped under --rate
+ * and --latency-limit */
+	SimpleStats latency;
+	SimpleStats lag;
+} StatsData;
+
+/*
  * Connection state
  */
 typedef struct
@@ -213,10 +240,8 @@ typedef struct
 	bool		prepared[MAX_SCRIPTS];	/* whether client prepared the script */
 
 	/* per client collected stats */
-	int			cnt;			/* xacts count */
+	int64		cnt;			/* transaction count */
 	int			ecnt;			/* error count */
-	int64		txn_latencies;	/* cumulated latencies */
-	int64		txn_sqlats;		/* cumulated square latencies */
 } CState;
 
 /*
@@ -228,19 +253,14 @@ typedef struct
 	pthread_t	thread;			/* thread handle */
 	CState	   *state;			/* array of CState */
 	int			nstate;			/* length of state[] */
-	instr_time	start_time;		/* thread start time */
-	instr_time *exec_elapsed;	/* time spent executing cmds (per Command) */
-	int		   *exec_count;		/* number of cmd executions (per Command) */
 	unsigned short random_state[3];		/* separate randomness for each thread */
 	int64		throttle_trigger;		/* previous/next throttling (us) */
 
 	/* per thread collected stats */
+	instr_time	start_time;		/* thread start time */
 	instr_time	conn_time;
-	int64		throttle_lag;	/* total transaction lag behind throttling */
-	int64		throttle_lag_max;		/* max transaction lag */
-	int64		throttle_latency_skipped;		/* lagging transactions
- * skipped */
-	int64		latency_late;	/* late transactions */
+	StatsData	stats;
+	int64		latency_late;	/* executed but late transactions */
 } TState;
 
 #define INVALID_THREAD		((pthread_t) 0)
@@ -272,33 +292,14 @@ typedef struct
 	char	   *argv[MAX_ARGS]; /* command word list */
 	int			cols[MAX_ARGS]; /* corresponding column starting from 1 */
 	PgBenchExpr *expr;			/* parsed expression */
+	SimpleStats stats;			/* time spent in this command */
 } Command;
 
-typedef struct
-{
-
-	long		start_time;		/* when does the interval start */
-	int			cnt;			/* number of transactions */
-	int			skipped;		/* number of transactions skipped under --rate
- * and --latency-limit */
-
-	double		min_latency;	/* min/max latencies */
-	double		max_latency;
-	double		sum_latency;	/* sum(latency), sum(latency^2) - for
- * estimates */
-	double		sum2_latency;
-
-	double		min_lag;
-	double		max_lag;
-	double		sum_lag;		/* sum(lag) */
-	double		sum2_lag;		/* sum(lag*lag) */
-} AggVals;
-
 static struct
 {
 	const char *name;
-	Command	 **commands;
-} sql_script[MAX_SCRIPTS];		/* SQL script files */
+	Command   **commands;
+}

Re: [HACKERS] pgbench stats per script & other stuff

2016-01-27 Thread Alvaro Herrera

Fabien COELHO wrote:

> >It seems a bit funny to have the start_time not be reset when 0.0 is
> >passed, which is almost all the callers.  Using a float as a boolean
> >looks pretty odd; is that kosher?  Maybe it'd be a good idea to have a
> >separate boolean flag instead?

> Obviously this would work. I did not think the special case was worth the
> extra argument. This one has some oddity too, because the second argument is
> ignored depending on the third. Do as you feel.

Actually my question was whether keeping the original start_time was the
intended design.  I think some places are okay with keeping the original
value, but the ones in addScript, the per-thread loop in main(), and the
global one also in main() should all be getting a 0.0 instead of leaving
the value uninitialized.

(I did turn the arguments around so that the bool is second and the
float is third.  Thanks for the suggestion.)

-- 
Álvaro Herrerahttp://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] pgbench stats per script & other stuff

2016-01-27 Thread Fabien COELHO



Hello again,


Here's part b rebased, pgindented and with some minor additional tweaks
(mostly function commands and the function renames I mentioned).


Patch looks ok to me, various tests where ok as well.


Still concerned about the unlocked stat accums.


See my arguments in other mail. I can add a lock if this is a blocker, but 
I think that it is actually better without, because of quantum: the 
measuring process should avoid affecting the measured data, and locking is 
not cheap.



I haven't tried to rebase the other ones yet, they need manual conflict
fixes.


Find attached 14-c/d/e rebased patches.

About e, for some obscure reason I failed in my initial attempt at 
inserting the misplaced options in their rightfull position in the option 
list. Sorry for the noise.


--
Fabien.diff --git a/doc/src/sgml/ref/pgbench.sgml b/doc/src/sgml/ref/pgbench.sgml
index 42d0667..ade1b53 100644
--- a/doc/src/sgml/ref/pgbench.sgml
+++ b/doc/src/sgml/ref/pgbench.sgml
@@ -1138,6 +1138,9 @@ number of transactions actually processed: 1/1
 tps = 618.764555 (including connections establishing)
 tps = 622.977698 (excluding connections establishing)
 SQL script 1: builtin: TPC-B (sort of)
+ - 1 transactions (100.0% of total, tps = 618.764555)
+ - latency average = 15.844 ms
+ - latency stddev = 2.715 ms
  - statement latencies in milliseconds:
 0.004386\set nbranches 1 * :scale
 0.001343\set ntellers 10 * :scale
diff --git a/src/bin/pgbench/pgbench.c b/src/bin/pgbench/pgbench.c
index 305c319..5594d1c 100644
--- a/src/bin/pgbench/pgbench.c
+++ b/src/bin/pgbench/pgbench.c
@@ -164,6 +164,7 @@ bool		use_log;			/* log transaction latencies to a file */
 bool		use_quiet;			/* quiet logging onto stderr */
 int			agg_interval;		/* log aggregates instead of individual
  * transactions */
+boolper_script_stats = false; /* whether to collect stats per script */
 int			progress = 0;		/* thread progress report every this seconds */
 bool		progress_timestamp = false; /* progress report with Unix time */
 int			nclients = 1;		/* number of clients */
@@ -299,6 +300,7 @@ static struct
 {
 	const char *name;
 	Command   **commands;
+	StatsData stats;
 }	sql_script[MAX_SCRIPTS];	/* SQL script files */
 static int	num_scripts;		/* number of scripts in sql_script[] */
 static int	num_commands = 0;	/* total number of Command structs */
@@ -1308,7 +1310,7 @@ top:
 		/* transaction finished: calculate latency and log the transaction */
 		if (commands[st->state + 1] == NULL)
 		{
-			if (progress || throttle_delay || latency_limit || logfile)
+			if (progress || throttle_delay || latency_limit || per_script_stats || logfile)
 doTxStats(thread, st, , false, logfile, agg);
 			else
 thread->stats.cnt++;
@@ -1401,7 +1403,7 @@ top:
 	}
 
 	/* Record transaction start time under logging, progress or throttling */
-	if ((logfile || progress || throttle_delay || latency_limit) && st->state == 0)
+	if ((logfile || progress || throttle_delay || latency_limit || per_script_stats) && st->state == 0)
 	{
 		INSTR_TIME_SET_CURRENT(st->txn_begin);
 
@@ -1889,6 +1891,9 @@ doTxStats(TState *thread, CState *st, instr_time *now,
 
 	if (use_log)
 		doLog(thread, st, logfile, now, agg, skipped, latency, lag);
+
+	if (per_script_stats) /* mutex? hmmm... these are only statistics */
+		doStats(& sql_script[st->use_file].stats, skipped, latency, lag);
 }
 
 
@@ -2678,6 +2683,7 @@ addScript(const char *name, Command **commands)
 
 	sql_script[num_scripts].name = name;
 	sql_script[num_scripts].commands = commands;
+	initStats(& sql_script[num_scripts].stats, 0.0);
 	num_scripts++;
 }
 
@@ -2761,22 +2767,40 @@ printResults(TState *threads, StatsData *total, instr_time total_time,
 	printf("tps = %f (including connections establishing)\n", tps_include);
 	printf("tps = %f (excluding connections establishing)\n", tps_exclude);
 
-	/* Report per-command latencies */
-	if (is_latencies)
+	/* Report per-script stats */
+	if (per_script_stats)
 	{
 		int			i;
 
 		for (i = 0; i < num_scripts; i++)
 		{
-			Command   **commands;
+			printf("SQL script %d: %s\n"
+   " - "INT64_FORMAT" transactions (%.1f%% of total, tps = %f)\n",
+   i+1, sql_script[i].name,
+   sql_script[i].stats.cnt,
+   100.0 * sql_script[i].stats.cnt / total->cnt,
+   sql_script[i].stats.cnt / time_include);
 
-			printf("SQL script %d: %s\n", i + 1, sql_script[i].name);
-			printf(" - statement latencies in milliseconds:\n");
+			if (latency_limit)
+printf(" - number of transactions skipped: "INT64_FORMAT" (%.3f%%)\n",
+	   sql_script[i].stats.skipped,
+	   100.0 * sql_script[i].stats.skipped /
+	   (sql_script[i].stats.skipped + sql_script[i].stats.cnt));
 
-			for (commands = sql_script[i].commands; *commands != NULL; commands++)
-printf("   %11.3f  %s\n",
-   1000.0 * (*commands)->stats.sum / (*commands)->stats.count,
-	   (*commands)->line);
+			printSimpleStats(" - latency", &

Re: [HACKERS] pgbench stats per script & other stuff

2016-01-27 Thread Fabien COELHO



Hello again,


If you want to implement real non-ambiguous-prefix code (i.e. have "se"
for "select-only", but reject "s" as ambiguous) be my guest.


I'm fine with filtering out ambiguous cases (i.e. just the "s" case). 
Attached a small patch for that.


--
Fabien.diff --git a/doc/src/sgml/ref/pgbench.sgml b/doc/src/sgml/ref/pgbench.sgml
index 42d0667..124e70d 100644
--- a/doc/src/sgml/ref/pgbench.sgml
+++ b/doc/src/sgml/ref/pgbench.sgml
@@ -269,6 +269,9 @@ pgbench  options  dbname
 Add the specified builtin script to the list of executed scripts.
 Available builtin scripts are: tpcb-like,
 simple-update and select-only.
+The provided scriptname needs only be an unambiguous
+prefix of the builtin name, hence si would be enough to
+select simple-update.
 With special name list, show the list of builtin scripts
 and exit immediately.

diff --git a/src/bin/pgbench/pgbench.c b/src/bin/pgbench/pgbench.c
index d5f242c..6350948 100644
--- a/src/bin/pgbench/pgbench.c
+++ b/src/bin/pgbench/pgbench.c
@@ -2649,22 +2649,32 @@ listAvailableScripts(void)
 	fprintf(stderr, "\n");
 }
 
+/* return commands for selected builtin script, if unambiguous */
 static char *
 findBuiltin(const char *name, char **desc)
 {
-	int			i;
+	int			i, found = 0, len = strlen(name);
+	char	   *commands = NULL;
 
 	for (i = 0; i < N_BUILTIN; i++)
 	{
-		if (strncmp(builtin_script[i].name, name,
-	strlen(builtin_script[i].name)) == 0)
+		if (strncmp(builtin_script[i].name, name, len) == 0)
 		{
 			*desc = builtin_script[i].desc;
-			return builtin_script[i].commands;
+			commands = builtin_script[i].commands;
+			found++;
 		}
 	}
 
-	fprintf(stderr, "no builtin script found for name \"%s\"\n", name);
+	if (found == 1)
+		return commands;
+
+	/* error cases */
+	if (found == 0)
+		fprintf(stderr, "no builtin script found for name \"%s\"\n", name);
+	else /* found > 1 */
+		fprintf(stderr,
+"%d builtin scripts found for prefix \"%s\"\n", found, name);
 	listAvailableScripts();
 	exit(1);
 }

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] pgbench stats per script & other stuff

2016-01-27 Thread Fabien COELHO



Hello Alvaro,


I'm not really sure about the fact that we operate on those Stats
structs without locking.  I see upthread you convinced Michael that it
was okay, but is it really?  How severe is the damage if two threads
happen to collide?


For stats shared among threads, when it occurs one data about one 
transaction is not counted.


On the risk side: the collision probability is pretty low because the time 
to update a value is a "few" cycles, and the time to execute a transaction 
is typically in ms: I think under 1/10,000,000 data could be lost.


On the advantageous side: locking costs significant time thus would impact 
performance, I think that the measured performance loss because the 
occasional transaction data is not counted is lower that the performance 
loss due to systematically locking.


So for me this is really a low risk trade-off.


[...]
It seems a bit funny to have the start_time not be reset when 0.0 is
passed, which is almost all the callers.  Using a float as a boolean
looks pretty odd; is that kosher?  Maybe it'd be a good idea to have a
separate boolean flag instead?  Something like this

/*
* Initialize a StatsData struct to all zeroes.  Use the given
* start_time only if reset_start_time, otherwise keep the original
* value.
*/
static void
initStats(StatsData *sd, double start_time, bool reset_start_time)
{
sd->cnt = 0;
sd->skipped = 0;
initSimpleStats(>latency);
initSimpleStats(>lag);

/* not necessarily overriden? */
if (reset_start_time)
sd->start_time = start_time;
}


Obviously this would work. I did not think the special case was worth the 
extra argument. This one has some oddity too, because the second argument 
is ignored depending on the third. Do as you feel.



I renamed a couple of your functionettes, for instance doSimpleStats to
addToSimpleStats and appendSimpleStats to mergeSimpleStats.


Fine with me.

--
Fabien.


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] pgbench stats per script & other stuff

2016-01-26 Thread Alvaro Herrera

Fabien COELHO wrote:

> You know how delighted I am to split patches...

Yes, of course, it's the most interesting task in the world.  I'm fully
aware of that.

FWIW I'm going to apply a preliminary commit to pgindent-clean the file
before your patches, then apply each patch as pgindent-clean.  Otherwise
your whitespace style was getting too much on my nerves.

> b) refactor statistics collections (per thread, per command, per whatever)
>so as to use the same structure everywhere, reducing the CLOC by 115.
>this enables the next small patch which can reuse the new functions.

I'm not really sure about the fact that we operate on those Stats
structs without locking.  I see upthread you convinced Michael that it
was okay, but is it really?  How severe is the damage if two threads
happen to collide?

Why is this function defined like this?

/*
 * Initialize a StatsData struct to all zeroes, but the given
 * start_time, except that if it's exactly zero don't change it.
 */
static void
initStats(StatsData *sd, double start_time)
{
sd->cnt = 0;
sd->skipped = 0;
initSimpleStats(>latency);
initSimpleStats(>lag);

/* not necessarily overriden? */
if (start_time)
sd->start_time = start_time;
}

It seems a bit funny to have the start_time not be reset when 0.0 is
passed, which is almost all the callers.  Using a float as a boolean
looks pretty odd; is that kosher?  Maybe it'd be a good idea to have a
separate boolean flag instead?  Something like this

/*
 * Initialize a StatsData struct to all zeroes.  Use the given
 * start_time only if reset_start_time, otherwise keep the original
 * value.
 */
static void
initStats(StatsData *sd, double start_time, bool reset_start_time)
{
sd->cnt = 0;
sd->skipped = 0;
initSimpleStats(>latency);
initSimpleStats(>lag);

/* not necessarily overriden? */
if (reset_start_time)
sd->start_time = start_time;
}

I renamed a couple of your functionettes, for instance doSimpleStats to
addToSimpleStats and appendSimpleStats to mergeSimpleStats.

Haven't looked at patches c or d yet.  I'm tempted to thrown patch e in
with the initial pgindent run.

-- 
Álvaro Herrerahttp://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] pgbench stats per script & other stuff

2016-01-26 Thread Alvaro Herrera

Fabien COELHO wrote:

> a) add -b option for cumulating builtins and rework internal script
>management so that builtin and external scripts are managed the
>same way.

I tweaked this a bit.  I found a bug in threadRun: it was reading the
commands first, and setting st->use_file later.  This led to the wrong
commands being read.

Some other less interesting changes:

* made chooseScript have the logic to react to single existing script;
no need to inject ternary operators in each caller to check for that
condition.

* Added a debug line every time a script is chosen,
+   if (debug)
+   fprintf(stderr, "client %d executing script \"%s\"\n", st->id,
+   sql_script[st->use_file].name);
  (I'd have liked to have chooseScript itself do it, but it doesn't
  have the script name handy.  Maybe this indicates that the data
  structures are slightly wrong.)

* Added a separate routine to list available scripts; originally that
was duplicated in "-b list" and when -b got an invalid script name.

* In usage(), I split out the options to select a script instead of
mixing them within "Benchmarking options"; also changed wording of
parenthical comment, no longer carrying the full list of scripts (a
choice which also omitted "-b list" itself):
+  "\nOptions to select what to run:\n"
+  "  -b, --builtin=NAME   add buitin script (use \"-b list\" to 
display\n"
+  "   available scripts)\n"
+  "  -f, --file=FILENAME  add transaction script from FILENAME\n"
+  "  -N, --skip-some-updates  skip updates of pgbench_tellers and 
pgbench_branches\n"
+  "   (same as \"-b simple-update\")\n"
+  "  -S, --select-onlyperform SELECT-only transactions\n"
+  "   (same as \"-b select-only\")\n"

I couldn't find a better heading to use there, so that'll have to do
unless someone has a better idea.

Some other trivial changes.  Patch attached.  I plan to push this as
soon as I'm able.

-- 
Álvaro Herrerahttp://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
diff --git a/doc/src/sgml/ref/pgbench.sgml b/doc/src/sgml/ref/pgbench.sgml
index 541d17b..fdd3331 100644
--- a/doc/src/sgml/ref/pgbench.sgml
+++ b/doc/src/sgml/ref/pgbench.sgml
@@ -261,6 +261,23 @@ pgbench  options  dbname
 benchmarking arguments:
 
 
+ 
+  -b scriptname
+  --builtin scriptname
+  
+   
+Add the specified builtin script to the list of executed scripts.
+Available builtin scripts are: tpcb-like,
+simple-update and select-only.
+The provided scriptname needs only to be a prefix
+of the builtin name, hence simp would be enough to select
+simple-update.
+With special name list, show the list of builtin scripts
+and exit immediately.
+   
+  
+ 
+
 
  
   -c clients
@@ -307,14 +324,13 @@ pgbench  options  dbname
  
 
  
-  -f filename
-  --file=filename
+  -f filename
+  --file=filename
   

-Read transaction script from filename.
+Add a transaction script read from filename to
+the list of executed scripts.
 See below for details.
--N, -S, and -f
-are mutually exclusive.

   
  
@@ -404,10 +420,8 @@ pgbench  options  dbname
   --skip-some-updates
   

-Do not update pgbench_tellers and
-pgbench_branches.
-This will avoid update contention on these tables, but
-it makes the test case even less like TPC-B.
+Run builtin simple-update script.
+Shorthand for -b simple-update.

   
  
@@ -512,9 +526,9 @@ pgbench  options  dbname
 Report the specified scale factor in pgbench's
 output.  With the built-in tests, this is not necessary; the
 correct scale factor will be detected by counting the number of
-rows in the pgbench_branches table.  However, when testing
-custom benchmarks (-f option), the scale factor
-will be reported as 1 unless this option is used.
+rows in the pgbench_branches table.
+However, when testing only custom benchmarks (-f option),
+the scale factor will be reported as 1 unless this option is used.

   
  
@@ -524,7 +538,8 @@ pgbench  options  dbname
   --select-only
   

-Perform select-only transactions instead of TPC-B-like test.
+Run built-in select-only script.
+Shorthand for -b select-only.

   
  
@@ -674,7 +689,17 @@ pgbench  options  dbname
   What is the Transaction Actually Performed in pgbench?
 
   
-   The default transaction script issues seven commands per transaction:
+   Pgbench executes test scripts chosen randomly from a specified list.
+   They include built-in scripts

Re: [HACKERS] pgbench stats per script & other stuff

2016-01-26 Thread Alvaro Herrera

Fabien COELHO wrote:

> a) add -b option for cumulating builtins and rework internal script
>management so that builtin and external scripts are managed the
>same way.

I'm uncomfortable with the prefix-matching aspect of -b.  It makes
"-b s" ambiguous -- whether it stands for select-only or simple-update
is merely a matter of what comes earlier in the table, which doesn't
seem reasonable to me.  I'd rather have a real way to reject ambiguous
cases, or simply accept only complete spellings.  This is the guilty
party:

> +static char *
> +find_builtin(const char *name, char **desc)
> +{
> + int len = strlen(name), i;
> +
> + for (i = 0; i < N_BUILTIN; i++)
> + {
> + if (strncmp(builtin_script[i].name, name, len) == 0)
> + {
> + *desc = builtin_script[i].desc;
> + return builtin_script[i].script;
> + }
> + }

I'm going to change this to use strlen(builtin_script[i].name) instead
of "len" here.

If you want to implement real non-ambiguous-prefix code (i.e. have "se"
for "select-only", but reject "s" as ambiguous) be my guest.

-- 
Álvaro Herrerahttp://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] pgbench stats per script & other stuff

2016-01-23 Thread Fabien COELHO



Hello Alvaro,


I'm looking at this part of your patch and I think it's far too big to
be a simple refactoring. Would you split it up please?


You know how delighted I am to split patches...

Here is a 5 part ordered patch serie:

a) add -b option for cumulating builtins and rework internal script
   management so that builtin and external scripts are managed the
   same way.

b) refactor statistics collections (per thread, per command, per whatever)
   so as to use the same structure everywhere, reducing the CLOC by 115.
   this enables the next small patch which can reuse the new functions.

c) add per-script statistics... because Josh asked:-)

d) add optional weight to control the relative frequency of scripts.

e) minor code cleanup :
   use bool instead of int where appropriate
   put together struct fields when they belong together
   move 2 options at their right position in the list

This patch serie conflicts slightly with the "add functions to pgbench" 
patch which is marked as ready in the CF. The first to make it will mean 
some conflict resolution for the other. Maybe I would prefer this one 
serie to go first, if I had any say...


--
Fabien.diff --git a/doc/src/sgml/ref/pgbench.sgml b/doc/src/sgml/ref/pgbench.sgml
index 541d17b..fdd3331 100644
--- a/doc/src/sgml/ref/pgbench.sgml
+++ b/doc/src/sgml/ref/pgbench.sgml
@@ -261,6 +261,23 @@ pgbench  options  dbname
 benchmarking arguments:
 
 
+ 
+  -b scriptname
+  --builtin scriptname
+  
+   
+Add the specified builtin script to the list of executed scripts.
+Available builtin scripts are: tpcb-like,
+simple-update and select-only.
+The provided scriptname needs only to be a prefix
+of the builtin name, hence simp would be enough to select
+simple-update.
+With special name list, show the list of builtin scripts
+and exit immediately.
+   
+  
+ 
+
 
  
   -c clients
@@ -307,14 +324,13 @@ pgbench  options  dbname
  
 
  
-  -f filename
-  --file=filename
+  -f filename
+  --file=filename
   

-Read transaction script from filename.
+Add a transaction script read from filename to
+the list of executed scripts.
 See below for details.
--N, -S, and -f
-are mutually exclusive.

   
  
@@ -404,10 +420,8 @@ pgbench  options  dbname
   --skip-some-updates
   

-Do not update pgbench_tellers and
-pgbench_branches.
-This will avoid update contention on these tables, but
-it makes the test case even less like TPC-B.
+Run builtin simple-update script.
+Shorthand for -b simple-update.

   
  
@@ -512,9 +526,9 @@ pgbench  options  dbname
 Report the specified scale factor in pgbench's
 output.  With the built-in tests, this is not necessary; the
 correct scale factor will be detected by counting the number of
-rows in the pgbench_branches table.  However, when testing
-custom benchmarks (-f option), the scale factor
-will be reported as 1 unless this option is used.
+rows in the pgbench_branches table.
+However, when testing only custom benchmarks (-f option),
+the scale factor will be reported as 1 unless this option is used.

   
  
@@ -524,7 +538,8 @@ pgbench  options  dbname
   --select-only
   

-Perform select-only transactions instead of TPC-B-like test.
+Run built-in select-only script.
+Shorthand for -b select-only.

   
  
@@ -674,7 +689,17 @@ pgbench  options  dbname
   What is the Transaction Actually Performed in pgbench?
 
   
-   The default transaction script issues seven commands per transaction:
+   Pgbench executes test scripts chosen randomly from a specified list.
+   They include built-in scripts with -b and
+   user-provided custom scripts with -f.
+ 
+
+  
+   The default builtin transaction script (also invoked with -b tpcb-like)
+   issues seven commands per transaction over randomly chosen aid,
+   tid, bid and balance.
+   The scenario is inspired by the TPC-B benchmark, but is not actually TPC-B,
+   hence the name.
   
 
   
@@ -688,9 +713,15 @@ pgbench  options  dbname
   
 
   
-   If you specify -N, steps 4 and 5 aren't included in the
-   transaction.  If you specify -S, only the SELECT is
-   issued.
+   If you select the simple-update builtin (also -N),
+   steps 4 and 5 aren't included in the transaction.
+   This will avoid update contention on these tables, but
+   it makes the test case even less like TPC-B.
+  
+
+  
+   If you select the select-only builtin (also -S),
+   only the SELECT is issued.
   
  
 
@@ -702,10 +733,7 @@ pgbench  options  dbname
benchmark scenarios by replacing the default transaction script
(described above) with a transaction script read from a file

Re: [HACKERS] pgbench stats per script & other stuff

2016-01-13 Thread Fabien COELHO



Hello Alvaro,


Here is a two part v12, which:

part a (refactoring of scripts and their stats):
 - fix option checks (-i alone)
 - s/repleacable/replaceable/ in doc
 - keep small description in doc and help for -S & -N
 - fix 2 comments for pg style
 - show builtin list if not found


I'm looking at this part of your patch and I think it's far too big to
be a simple refactoring.  Would you split it up please?
I think the StatsData / SimpleStat addition should be one patch;
then there's the -b changes.  Then there may (or may not) be a bunch of 
other minor cleanups, not sure.


I'm willing to commit these patches if I can easily review what they do,
which I cannot with the current state.


Hmmm. ISTM that other people already reviewed it.

I can try to separate (again) some stuff, but there will be no miracle.

The overdue refactoring is because pgbench collects statistics at various 
levels, and each time this is done in a different way. Cleaning this 
requires to touch the stuff in many places, which means a "big" patch, 
although ISTM a straightforward one, but this already the case with this 
one.



Please pgindent; make sure to add /*--- here to avoid pgindent mangling
the comment:


Ok.


part b (weight)
 - check that the weight is an int


This part looks okay to me.  Minor nitpick,

+   int i = 0, w = 0, wc = (int) getrand(thread, 0, total_weight - 1);

should be three lines, not one.


Ok.

Also the @W part in the --help output should be in brackets, as 
FILE[@W], right?


Why not.

--
Fabien.


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] pgbench stats per script & other stuff

2016-01-12 Thread Alvaro Herrera

Fabien COELHO wrote:
> 
> Here is a two part v12, which:
> 
> part a (refactoring of scripts and their stats):
>  - fix option checks (-i alone)
>  - s/repleacable/replaceable/ in doc
>  - keep small description in doc and help for -S & -N
>  - fix 2 comments for pg style
>  - show builtin list if not found

I'm looking at this part of your patch and I think it's far too big to
be a simple refactoring.  Would you split it up please?  I think the
StatsData / SimpleStat addition should be one patch; then there's the -b
changes.  Then there may (or may not) be a bunch of other minor
cleanups, not sure.

I'm willing to commit these patches if I can easily review what they do,
which I cannot with the current state.

Please pgindent; make sure to add /*--- here to avoid pgindent mangling
the comment:

if (pg_strcasecmp(my_commands->argv[0], "setrandom") == 0)
{
/*
 * parsing:

> part b (weight)
>  - check that the weight is an int

This part looks okay to me.  Minor nitpick,

+   int i = 0, w = 0, wc = (int) getrand(thread, 0, total_weight - 1);

should be three lines, not one.  Also the @W part in the --help output
should be in brackets, as FILE[@W], right?

-- 
Álvaro Herrerahttp://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] pgbench stats per script & other stuff

2015-12-28 Thread Fabien COELHO



Hello Michaël,


And then I also had a look at src/port/snprintf.c, where things get
actually weird when no transactions are run for a script (emulated
with 2 scripts, one with @1 and the second with @1):
 - 0 transactions (0.0% of total, tps = 0.00)
 - latency average = -1.#IO ms
 - latency stddev = -1.#IO ms
And it seems that this is a bug in fmtfloat() because it does not
handle nan values correctly. Others, any thoughts about that?
It is possible to address things within your patch by using isnan()
and print another message but I think that we had better patch
snprintf.c if my analysis is right.


FWIW, I just had a closer look at this portion and I arrived at the
conclusion that sprintf implementation on Windows is just broken as it
is not able to handle appropriately inf or nan as exceptions.
fmtfloat@src/port/snprintf.c relies on the system's implementation of
sprintf to handle those exceptions, however even directly calling
sprintf results in the same weird output, inf showing up as "1.#IO"
and nan as "-1.#IO". Anyone, feel free to disagree if I am missing
something.


I have no opinion any about M$ implementation of double prettyprinting, 
but I agree that "-1.#IO" looks strange. WWW seems to say that "-1.INF" 
and "-1.IND" are the "normal" way for windows to say infinity or not a 
number. Well, if someone there thought it look good, I cannot help it.



Still, it would be cool to have better error message when there is no
value to show up to the user, like "no latency average" or "undefined
latency average". That would be more elegant, and the patches proposed
still lack that.


Hmmm. I do not buy that for several reasons:

For --progress style reporting you want NaN or whatever, because the 
output could be processed further unix-style from a pipe (grep/cut/...). 
This is also true for the final report. I would not want to change the 
output organisations for some special values, I would just like to get the 
value whatever it is, "NaN" or "Infinity" or even "-1.IND", so that the 
pipe commands would work.


Also, for the final report, it seems to me overkill to try to work around 
cases when pgbench does not run any transactions, which is basically not 
often, as the point is to run many transactions.


Finally this behavior already exists, the patch does not change anything 
AFAICS, and it is not its purpose.


So I would suggest to keep it that way.

--
Fabien.
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] pgbench stats per script & other stuff

2015-12-28 Thread Michael Paquier

On Wed, Dec 16, 2015 at 4:09 PM, Michael Paquier
 wrote:
> Yeah, that's actually fine. I just had a look at Windows stuff, and
> things seem to be correct on this side for double:
> https://msdn.microsoft.com/en-us/library/aa691373%28v=vs.71%29.aspx
> And then I also had a look at src/port/snprintf.c, where things get
> actually weird when no transactions are run for a script (emulated
> with 2 scripts, one with @1 and the second with @1):
>  - 0 transactions (0.0% of total, tps = 0.00)
>  - latency average = -1.#IO ms
>  - latency stddev = -1.#IO ms
> And it seems that this is a bug in fmtfloat() because it does not
> handle nan values correctly. Others, any thoughts about that?
> It is possible to address things within your patch by using isnan()
> and print another message but I think that we had better patch
> snprintf.c if my analysis is right.

FWIW, I just had a closer look at this portion and I arrived at the
conclusion that sprintf implementation on Windows is just broken as it
is not able to handle appropriately inf or nan as exceptions.
fmtfloat@src/port/snprintf.c relies on the system's implementation of
sprintf to handle those exceptions, however even directly calling
sprintf results in the same weird output, inf showing up as "1.#IO"
and nan as "-1.#IO". Anyone, feel free to disagree if I am missing
something.

Still, it would be cool to have better error message when there is no
value to show up to the user, like "no latency average" or "undefined
latency average". That would be more elegant, and the patches proposed
still lack that.
-- 
Michael


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] pgbench stats per script & other stuff

2015-12-16 Thread Fabien COELHO



Here is a two part v12, which:

part a (refactoring of scripts and their stats):
 - fix option checks (-i alone)
 - s/repleacable/replaceable/ in doc
 - keep small description in doc and help for -S & -N
 - fix 2 comments for pg style
 - show builtin list if not found

part b (weight)
 - check that the weight is an int

--
Fabien.diff --git a/doc/src/sgml/ref/pgbench.sgml b/doc/src/sgml/ref/pgbench.sgml
index 0ac40f1..62ce496 100644
--- a/doc/src/sgml/ref/pgbench.sgml
+++ b/doc/src/sgml/ref/pgbench.sgml
@@ -261,6 +261,23 @@ pgbench  options  dbname
 benchmarking arguments:
 
 
+ 
+  -b scriptname
+  --builtin scriptname
+  
+   
+Add the specified builtin script to the list of executed scripts.
+Available builtin scripts are: tpcb-like,
+simple-update and select-only.
+The provided scriptname needs only to be a prefix
+of the builtin name, hence simp would be enough to select
+simple-update.
+With special name list, show the list of builtin scripts
+and exit immediately.
+   
+  
+ 
+
 
  
   -c clients
@@ -307,14 +324,13 @@ pgbench  options  dbname
  
 
  
-  -f filename
-  --file=filename
+  -f filename
+  --file=filename
   

-Read transaction script from filename.
+Add a transaction script read from filename to
+the list of executed scripts.
 See below for details.
--N, -S, and -f
-are mutually exclusive.

   
  
@@ -404,10 +420,8 @@ pgbench  options  dbname
   --skip-some-updates
   

-Do not update pgbench_tellers and
-pgbench_branches.
-This will avoid update contention on these tables, but
-it makes the test case even less like TPC-B.
+Run builtin simple-update script.
+Shorthand for -b simple-update.

   
  
@@ -512,9 +526,9 @@ pgbench  options  dbname
 Report the specified scale factor in pgbench's
 output.  With the built-in tests, this is not necessary; the
 correct scale factor will be detected by counting the number of
-rows in the pgbench_branches table.  However, when testing
-custom benchmarks (-f option), the scale factor
-will be reported as 1 unless this option is used.
+rows in the pgbench_branches table.
+However, when testing only custom benchmarks (-f option),
+the scale factor will be reported as 1 unless this option is used.

   
  
@@ -524,7 +538,8 @@ pgbench  options  dbname
   --select-only
   

-Perform select-only transactions instead of TPC-B-like test.
+Run built-in select-only script.
+Shorthand for -b select-only.

   
  
@@ -674,7 +689,17 @@ pgbench  options  dbname
   What is the Transaction Actually Performed in pgbench?
 
   
-   The default transaction script issues seven commands per transaction:
+   Pgbench executes test scripts chosen randomly from a specified list.
+   They include built-in scripts with -b and
+   user-provided custom scripts with -f.
+ 
+
+  
+   The default builtin transaction script (also invoked with -b tpcb-like)
+   issues seven commands per transaction over randomly chosen aid,
+   tid, bid and balance.
+   The scenario is inspired by the TPC-B benchmark, but is not actually TPC-B,
+   hence the name.
   
 
   
@@ -688,9 +713,15 @@ pgbench  options  dbname
   
 
   
-   If you specify -N, steps 4 and 5 aren't included in the
-   transaction.  If you specify -S, only the SELECT is
-   issued.
+   If you select the simple-update builtin (also -N),
+   steps 4 and 5 aren't included in the transaction.
+   This will avoid update contention on these tables, but
+   it makes the test case even less like TPC-B.
+  
+
+  
+   If you select the select-only builtin (also -S),
+   only the SELECT is issued.
   
  
 
@@ -702,10 +733,7 @@ pgbench  options  dbname
benchmark scenarios by replacing the default transaction script
(described above) with a transaction script read from a file
(-f option).  In this case a transaction
-   counts as one execution of a script file.  You can even specify
-   multiple scripts (multiple -f options), in which
-   case a random one of the scripts is chosen each time a client session
-   starts a new transaction.
+   counts as one execution of a script file.
   
 
   
diff --git a/src/bin/pgbench/pgbench.c b/src/bin/pgbench/pgbench.c
index f2d435b..1e0c7cc 100644
--- a/src/bin/pgbench/pgbench.c
+++ b/src/bin/pgbench/pgbench.c
@@ -164,12 +164,11 @@ bool		use_log;			/* log transaction latencies to a file */
 bool		use_quiet;			/* quiet logging onto stderr */
 int			agg_interval;		/* log aggregates instead of individual
  * transactions */
+boolper_script_stats = false; /* whether to collect stats per script */
 int			progress = 0;		/* thread progress report every

Re: [HACKERS] pgbench stats per script & other stuff

2015-12-16 Thread Fabien COELHO




It seems also that it would be a good idea to split the patch into two
parts:
1) Refactor the code so as the existing test scripts are put under the
same umbrella with addScript, adding at the same time the new option
-b.
2) Add the weight facility and its related statistics.


Sigh. The patch & documentation are probably not independent, so that would
make two dependent patches, probably.


I am not really saying so, it seems just that doing the refactoring
(with its related docs), and then add the extension for the weight
(with its docs) is more natural than doing both things at the same
time.


Ok. I can separate the refactoring (scripts & stats) and the weight stuff 
on top of the refactoring.


--
Fabien.


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] pgbench stats per script & other stuff

2015-12-15 Thread Fabien COELHO




"sum" is a double so count is converted to 0.0, 0.0/0.0 == NaN, hence the
comment.


PG code usually avoids that, and I recall static analyze tools type
coverity complaining that this may lead to undefined behavior. While I
agree that this would lead to NaN...


Hmmm. In this case that is what is actually wanted. If there is no 
transaction, the tps or average latency or whatever is "NaN", I cannot 
help it, and IEEE 754 allow that. So in this case the tool is wrong if it 
complains, or at least we are right to ignore the warning. Maybe there is 
some special comment to say "ignore this warning on the next line" if it 
occurs, if this is an issue.


--
Fabien.


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] pgbench stats per script & other stuff

2015-12-15 Thread Michael Paquier

On Tue, Dec 15, 2015 at 8:41 PM, Fabien COELHO  wrote:
>> PG code usually avoids that, and I recall static analyze tools type
>> coverity complaining that this may lead to undefined behavior. While I
>> agree that this would lead to NaN...
>
>
> Hmmm. In this case that is what is actually wanted. If there is no
> transaction, the tps or average latency or whatever is "NaN", I cannot help
> it, and IEEE 754 allow that. So in this case the tool is wrong if it
> complains, or at least we are right to ignore the warning. Maybe there is
> some special comment to say "ignore this warning on the next line" if it
> occurs, if this is an issue.

Yeah, that's actually fine. I just had a look at Windows stuff, and
things seem to be correct on this side for double:
https://msdn.microsoft.com/en-us/library/aa691373%28v=vs.71%29.aspx
And then I also had a look at src/port/snprintf.c, where things get
actually weird when no transactions are run for a script (emulated
with 2 scripts, one with @1 and the second with @1):
 - 0 transactions (0.0% of total, tps = 0.00)
 - latency average = -1.#IO ms
 - latency stddev = -1.#IO ms
And it seems that this is a bug in fmtfloat() because it does not
handle nan values correctly. Others, any thoughts about that?
It is possible to address things within your patch by using isnan()
and print another message but I think that we had better patch
snprintf.c if my analysis is right.

Oh, and actually when trying to compile the patch on Windows things
are failing because int64_t is undefined :) After switching to int64
things worked better.
-- 
Michael


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] pgbench stats per script & other stuff

2015-12-14 Thread Michael Paquier

On Tue, Dec 15, 2015 at 5:53 AM, Fabien COELHO  wrote:
> I wrote:
>> Why would the likelyhood of an issue be small here?
>
> The time to update one stat (<< 100 cycles ?) to the time to do a
> transaction with the database (typically Y ms), so the likelyhood of two
> thread to update the very same stat at the same time is probably under
> 1/10,000,000. Even if it occurs, then one stat is slightly false, no big
> deal. So I think the potential slowdown induced by a mutex is not worth it,
> so I a comment instead.

OK, got it and agreed.

>> +   /* print NaN if no transactions where executed */
>> +   double latency = ss->sum / ss->count;
>> This does not look like a good idea, ss->count can be 0.
>
>
> "sum" is a double so count is converted to 0.0, 0.0/0.0 == NaN, hence the
> comment.

PG code usually avoids that, and I recall static analyze tools type
coverity complaining that this may lead to undefined behavior. While I
agree that this would lead to NaN...

>> It seems also that it would be a good idea to split the patch into two
>> parts:
>> 1) Refactor the code so as the existing test scripts are put under the
>> same umbrella with addScript, adding at the same time the new option
>> -b.
>> 2) Add the weight facility and its related statistics.
>
>
> Sigh. The patch & documentation are probably not independent, so that would
> make two dependent patches, probably.

I am not really saying so, it seems just that doing the refactoring
(with its related docs), and then add the extension for the weight
(with its docs) is more natural than doing both things at the same
time.
-- 
Michael


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] pgbench stats per script & other stuff

2015-12-14 Thread Fabien COELHO




-Do not update pgbench_tellers and
-pgbench_branches.
-This will avoid update contention on these tables, but
-it makes the test case even less like TPC-B.
+Shorthand for -b simple-update@1.



I don't think it is a good idea to remove entirely the description of
what the default scenarios can do. The description would be better at
the bottom in some  with a list of each default test and what to
expect from them.


I'm trying to avoid to have the same explanation twice, otherwise someone 
is bound to complain.



+/* data structure to hold various statistics.
+ * it is used for interval statistics as well as file statistics.
 */
Nitpick: this is not a comment formatted the Postgres-way.


Indeed.


This is surprisingly broken:
$ pgbench -i
some of the specified options cannot be used in initialization (-i) mode


Hmmm.


Any file name or path including "@" will fail strangely:
$ pgbench -f "t...@1.sql"
could not open file "test": No such file or directory
empty commands for test
Perhaps instead of failing we should warn the user and enforce the
weight to be set at 1?


Yep, I can have a look at that.


$ pgbench -b foo
no builtin found for "foo"
This is not really helpful for the user, I think that the list of
potential options should be listed as an error hint.


Yep.


-  "  -S, --select-onlyperform SELECT-only
transactions\n"
+  "  -S, --select-onlysame as \"-b select-only@1\"\n"
It is good to mention that there is an equivalent, but I think that
the description should be kept.


The reason replace it is to keep the help message short column-wise.


+   /* although a mutex would make sense, the
likelyhood of an issue
+* is small and these are only stats which may
be slightly false
+*/
+   doSimpleStats(& commands[st->state]->stats,
+ INSTR_TIME_GET_DOUBLE(now) -




Why would the likelyhood of an issue be small here?


The time to update one stat (<< 100 cycles ?) to the time to do a 
transaction with the database (typically Y ms), so the likelyhood of two 
thread to update the very same stat at the same time is probably under 
1/10,000,000. Even if it occurs, then one stat is slightly false, no big 
deal. So I think the potential slowdown induced by a mutex is not worth 
it, so I a comment instead.



+   /* print NaN if no transactions where executed */
+   double latency = ss->sum / ss->count;
This does not look like a good idea, ss->count can be 0.


"sum" is a double so count is converted to 0.0, 0.0/0.0 == NaN, hence the 
comment.



It seems also that it would be a good idea to split the patch into two parts:
1) Refactor the code so as the existing test scripts are put under the
same umbrella with addScript, adding at the same time the new option
-b.
2) Add the weight facility and its related statistics.


Sigh. The patch & documentation are probably not independent, so that 
would make two dependent patches, probably.


--
Fabien.


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] pgbench stats per script & other stuff

2015-12-13 Thread Michael Paquier

On Sat, Oct 3, 2015 at 3:11 PM, Fabien COELHO  wrote:
>
>> Here is a v10, which is a rebase because of the "--progress-timestamp"
>> option addition.
>
>
> Here is a v11, which is a rebase after some recent changes committed to
> pgbench.

+The provided scriptname needs only to be a prefix
s/repleacable/replaceable, in short I think that documentation
compilation would fail.

-Do not update pgbench_tellers and
-pgbench_branches.
-This will avoid update contention on these tables, but
-it makes the test case even less like TPC-B.
+Shorthand for -b simple-update@1.
I don't think it is a good idea to remove entirely the description of
what the default scenarios can do. The description would be better at
the bottom in some  with a list of each default test and what to
expect from them.

+/* data structure to hold various statistics.
+ * it is used for interval statistics as well as file statistics.
  */
Nitpick: this is not a comment formatted the Postgres-way.

This is surprisingly broken:
$ pgbench -i
some of the specified options cannot be used in initialization (-i) mode

Any file name or path including "@" will fail strangely:
$ pgbench -f "t...@1.sql"
could not open file "test": No such file or directory
empty commands for test
Perhaps instead of failing we should warn the user and enforce the
weight to be set at 1?

$ pgbench -b foo
no builtin found for "foo"
This is not really helpful for the user, I think that the list of
potential options should be listed as an error hint.

-  "  -N, --skip-some-updates  skip updates of
pgbench_tellers and pgbench_branches\n"
+  "  -N, --skip-some-updates  same as \"-b simple-update@1\"\n"
   "  -P, --progress=NUM   show thread progress
report every NUM seconds\n"
   "  -r, --report-latencies   report average latency
per command\n"
"  -R, --rate=NUM   target rate in
transactions per second\n"
   "  -s, --scale=NUM  report this scale
factor in output\n"
-  "  -S, --select-onlyperform SELECT-only
transactions\n"
+  "  -S, --select-onlysame as \"-b select-only@1\"\n"
It is good to mention that there is an equivalent, but I think that
the description should be kept.

+   /* although a mutex would make sense, the
likelyhood of an issue
+* is small and these are only stats which may
be slightly false
+*/
+   doSimpleStats(& commands[st->state]->stats,
+ INSTR_TIME_GET_DOUBLE(now) -
+
INSTR_TIME_GET_DOUBLE(st->stmt_begin));
Why would the likelyhood of an issue be small here?

+   /* print NaN if no transactions where executed */
+   double latency = ss->sum / ss->count;
This does not look like a good idea, ss->count can be 0.

It seems also that it would be a good idea to split the patch into two parts:
1) Refactor the code so as the existing test scripts are put under the
same umbrella with addScript, adding at the same time the new option
-b.
2) Add the weight facility and its related statistics.

The patch having some issues, I am marking it as returned with
feedback. It would be nice to see a new version for next CF.
Regards,
-- 
Michael


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] pgbench stats per script & other stuff

2015-10-03 Thread Fabien COELHO



Here is a v10, which is a rebase because of the "--progress-timestamp" option 
addition.


Here is a v11, which is a rebase after some recent changes committed to 
pgbench.


--
Fabien.diff --git a/doc/src/sgml/ref/pgbench.sgml b/doc/src/sgml/ref/pgbench.sgml
index 0ac40f1..e3a90e5 100644
--- a/doc/src/sgml/ref/pgbench.sgml
+++ b/doc/src/sgml/ref/pgbench.sgml
@@ -261,6 +261,25 @@ pgbench  options  dbname
 benchmarking arguments:
 
 
+ 
+  -b scriptname[@weight]
+  --builtin scriptname[@weight]
+  
+   
+Add the specified builtin script to the list of executed scripts.
+An optional integer weight after @ allows to adjust the
+probability of drawing the test.
+Available builtin scripts are: tpcb-like,
+simple-update and select-only.
+The provided scriptname needs only to be a prefix
+of the builtin name, hence simp would be enough to select
+simple-update.
+With special name list, show the list of builtin scripts
+and exit immediately.
+   
+  
+ 
+
 
  
   -c clients
@@ -307,14 +326,15 @@ pgbench  options  dbname
  
 
  
-  -f filename
-  --file=filename
+  -f filename[@weight]
+  --file=filename[@weight]
   

-Read transaction script from filename.
+Add a transaction script read from filename to
+the list of executed scripts.
+An optional integer weight after @ allows to adjust the
+probability of drawing the test.
 See below for details.
--N, -S, and -f
-are mutually exclusive.

   
  
@@ -404,10 +424,7 @@ pgbench  options  dbname
   --skip-some-updates
   

-Do not update pgbench_tellers and
-pgbench_branches.
-This will avoid update contention on these tables, but
-it makes the test case even less like TPC-B.
+Shorthand for -b simple-update@1.

   
  
@@ -512,9 +529,9 @@ pgbench  options  dbname
 Report the specified scale factor in pgbench's
 output.  With the built-in tests, this is not necessary; the
 correct scale factor will be detected by counting the number of
-rows in the pgbench_branches table.  However, when testing
-custom benchmarks (-f option), the scale factor
-will be reported as 1 unless this option is used.
+rows in the pgbench_branches table.
+However, when testing only custom benchmarks (-f option),
+the scale factor will be reported as 1 unless this option is used.

   
  
@@ -524,7 +541,7 @@ pgbench  options  dbname
   --select-only
   

-Perform select-only transactions instead of TPC-B-like test.
+Shorthand for -b select-only@1.

   
  
@@ -674,7 +691,20 @@ pgbench  options  dbname
   What is the Transaction Actually Performed in pgbench?
 
   
-   The default transaction script issues seven commands per transaction:
+   Pgbench executes test scripts chosen randomly from a specified list.
+   They include built-in scripts with -b and
+   user-provided custom scripts with -f.
+   Each script may be given a relative weight specified after a
+   @ so as to change its drawing probability.
+   The default weight is 1.
+ 
+
+  
+   The default builtin transaction script (also invoked with -b tpcb-like)
+   issues seven commands per transaction over randomly chosen aid,
+   tid, bid and balance.
+   The scenario is inspired by the TPC-B benchmark, but is not actually TPC-B,
+   hence the name.
   
 
   
@@ -688,9 +718,15 @@ pgbench  options  dbname
   
 
   
-   If you specify -N, steps 4 and 5 aren't included in the
-   transaction.  If you specify -S, only the SELECT is
-   issued.
+   If you select the simple-update builtin (also -N),
+   steps 4 and 5 aren't included in the transaction.
+   This will avoid update contention on these tables, but
+   it makes the test case even less like TPC-B.
+  
+
+  
+   If you select the select-only builtin (also -S),
+   only the SELECT is issued.
   
  
 
@@ -702,10 +738,7 @@ pgbench  options  dbname
benchmark scenarios by replacing the default transaction script
(described above) with a transaction script read from a file
(-f option).  In this case a transaction
-   counts as one execution of a script file.  You can even specify
-   multiple scripts (multiple -f options), in which
-   case a random one of the scripts is chosen each time a client session
-   starts a new transaction.
+   counts as one execution of a script file.
   
 
   
diff --git a/src/bin/pgbench/pgbench.c b/src/bin/pgbench/pgbench.c
index f2d435b..0e56ed7 100644
--- a/src/bin/pgbench/pgbench.c
+++ b/src/bin/pgbench/pgbench.c
@@ -164,12 +164,11 @@ bool		use_log;			/* log transaction latencies to a file */
 bool		use_quiet;			/* quiet logging onto stderr */
 int			agg_interval;		/* log aggregates instead of individual

Re: [HACKERS] pgbench stats per script & other stuff

2015-09-28 Thread Robert Haas

On Sat, Sep 26, 2015 at 3:27 AM, Fabien COELHO  wrote:
> Here is a v10, which is a rebase because of the "--progress-timestamp"
> option addition.

I do not see it attached.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] pgbench stats per script & other stuff

2015-09-28 Thread Fabien COELHO




Here is a v10, which is a rebase because of the "--progress-timestamp"
option addition.


I do not see it attached.


Indeed. Here it is.

--
Fabien.diff --git a/doc/src/sgml/ref/pgbench.sgml b/doc/src/sgml/ref/pgbench.sgml
index 0ac40f1..e3a90e5 100644
--- a/doc/src/sgml/ref/pgbench.sgml
+++ b/doc/src/sgml/ref/pgbench.sgml
@@ -261,6 +261,25 @@ pgbench  options  dbname
 benchmarking arguments:
 
 
+ 
+  -b scriptname[@weight]
+  --builtin scriptname[@weight]
+  
+   
+Add the specified builtin script to the list of executed scripts.
+An optional integer weight after @ allows to adjust the
+probability of drawing the test.
+Available builtin scripts are: tpcb-like,
+simple-update and select-only.
+The provided scriptname needs only to be a prefix
+of the builtin name, hence simp would be enough to select
+simple-update.
+With special name list, show the list of builtin scripts
+and exit immediately.
+   
+  
+ 
+
 
  
   -c clients
@@ -307,14 +326,15 @@ pgbench  options  dbname
  
 
  
-  -f filename
-  --file=filename
+  -f filename[@weight]
+  --file=filename[@weight]
   

-Read transaction script from filename.
+Add a transaction script read from filename to
+the list of executed scripts.
+An optional integer weight after @ allows to adjust the
+probability of drawing the test.
 See below for details.
--N, -S, and -f
-are mutually exclusive.

   
  
@@ -404,10 +424,7 @@ pgbench  options  dbname
   --skip-some-updates
   

-Do not update pgbench_tellers and
-pgbench_branches.
-This will avoid update contention on these tables, but
-it makes the test case even less like TPC-B.
+Shorthand for -b simple-update@1.

   
  
@@ -512,9 +529,9 @@ pgbench  options  dbname
 Report the specified scale factor in pgbench's
 output.  With the built-in tests, this is not necessary; the
 correct scale factor will be detected by counting the number of
-rows in the pgbench_branches table.  However, when testing
-custom benchmarks (-f option), the scale factor
-will be reported as 1 unless this option is used.
+rows in the pgbench_branches table.
+However, when testing only custom benchmarks (-f option),
+the scale factor will be reported as 1 unless this option is used.

   
  
@@ -524,7 +541,7 @@ pgbench  options  dbname
   --select-only
   

-Perform select-only transactions instead of TPC-B-like test.
+Shorthand for -b select-only@1.

   
  
@@ -674,7 +691,20 @@ pgbench  options  dbname
   What is the Transaction Actually Performed in pgbench?
 
   
-   The default transaction script issues seven commands per transaction:
+   Pgbench executes test scripts chosen randomly from a specified list.
+   They include built-in scripts with -b and
+   user-provided custom scripts with -f.
+   Each script may be given a relative weight specified after a
+   @ so as to change its drawing probability.
+   The default weight is 1.
+ 
+
+  
+   The default builtin transaction script (also invoked with -b tpcb-like)
+   issues seven commands per transaction over randomly chosen aid,
+   tid, bid and balance.
+   The scenario is inspired by the TPC-B benchmark, but is not actually TPC-B,
+   hence the name.
   
 
   
@@ -688,9 +718,15 @@ pgbench  options  dbname
   
 
   
-   If you specify -N, steps 4 and 5 aren't included in the
-   transaction.  If you specify -S, only the SELECT is
-   issued.
+   If you select the simple-update builtin (also -N),
+   steps 4 and 5 aren't included in the transaction.
+   This will avoid update contention on these tables, but
+   it makes the test case even less like TPC-B.
+  
+
+  
+   If you select the select-only builtin (also -S),
+   only the SELECT is issued.
   
  
 
@@ -702,10 +738,7 @@ pgbench  options  dbname
benchmark scenarios by replacing the default transaction script
(described above) with a transaction script read from a file
(-f option).  In this case a transaction
-   counts as one execution of a script file.  You can even specify
-   multiple scripts (multiple -f options), in which
-   case a random one of the scripts is chosen each time a client session
-   starts a new transaction.
+   counts as one execution of a script file.
   
 
   
diff --git a/src/bin/pgbench/pgbench.c b/src/bin/pgbench/pgbench.c
index 6ae1b86..0e56ed7 100644
--- a/src/bin/pgbench/pgbench.c
+++ b/src/bin/pgbench/pgbench.c
@@ -164,12 +164,11 @@ bool		use_log;			/* log transaction latencies to a file */
 bool		use_quiet;			/* quiet logging onto stderr */
 int			agg_interval;		/* log aggregates instead of individual
  * transactions */
+bool

Re: [HACKERS] pgbench stats per script & other stuff

2015-09-26 Thread Fabien COELHO



Here is a v10, which is a rebase because of the "--progress-timestamp" 
option addition.


It also include the fix for the tps without connection computation and 
some minor code simplification, so it is redundant with this bug fix 
patch:


https://commitfest.postgresql.org/7/378/

--
Fabien.


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] pgbench stats per script & other stuff

2015-09-03 Thread Robert Haas

On Thu, Sep 3, 2015 at 3:26 AM, Fabien COELHO  wrote:
> I've left out:
>  - removing -N/-S upward compatibility shorthands
>but I will not cry if they are removed

I see no particular merit to breaking backward compatibility here.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] pgbench stats per script & other stuff

2015-09-03 Thread Fabien COELHO




I've left out:
 - removing -N/-S upward compatibility shorthands
   but I will not cry if they are removed


I see no particular merit to breaking backward compatibility here.


I agree, but I would not fight for this. I think there is a good argument 
*NOT* to add more if new builtin scripts are added later.


Currently the builtin script can be selected with "-b t" (t for tcpb-like), 
"-b s" (s for simple-update) and "-b se" (se for select-only).


I've reused their current names for the option selector, and it takes the 
first matching prefix.


--
Fabien.


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] pgbench stats per script & other stuff

2015-09-03 Thread Andres Freund

On 2015-09-02 20:20:45 +0200, Fabien COELHO wrote:
> >>+static SQLScript sql_script[MAX_SCRIPTS];
> >>
> >>+static struct {
> >>+   char *name;   /* very short name for -b ...*/
> >>+   char *desc;   /* short description */
> >>+   char *script; /* actual pgbench script */
> >>+} builtin_script[]
> >
> >Can't we put these in the same array?
> 
> I do not understand.

Right now builtins and user defined scripts are stored in different data
structures. I'd rather see them in the same.

Greetings,

Andres Freund


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] pgbench stats per script & other stuff

2015-09-03 Thread Fabien COELHO



Hello Anders,

This v9 :
 - add "-b list" to show the list of builtins
 - remove the explicit --per-scripts-stats option, which is instead
   automatically set when several scripts are run or with per-command
   latencies (-r)
 - count scripts from 1 instead of 0 in the output

I've left out:
 - removing -N/-S upward compatibility shorthands
   but I will not cry if they are removed
 - requiring percents instead of integer weights, because
   it is too constrained
 - your "array" remark as I did not understood it

Thanks to the restructuring and sharing of stats code, the patch does not 
change the loc count, although a few features are added.


--
Fabien.diff --git a/doc/src/sgml/ref/pgbench.sgml b/doc/src/sgml/ref/pgbench.sgml
index 2517a3a..3edd65a 100644
--- a/doc/src/sgml/ref/pgbench.sgml
+++ b/doc/src/sgml/ref/pgbench.sgml
@@ -261,6 +261,25 @@ pgbench  options  dbname
 benchmarking arguments:
 
 
+ 
+  -b scriptname[@weight]
+  --builtin scriptname[@weight]
+  
+   
+Add the specified builtin script to the list of executed scripts.
+An optional integer weight after @ allows to adjust the
+probability of drawing the test.
+Available builtin scripts are: tpcb-like,
+simple-update and select-only.
+The provided scriptname needs only to be a prefix
+of the builtin name, hence simp would be enough to select
+simple-update.
+With special name list, show the list of builtin scripts
+and exit immediately.
+   
+  
+ 
+
 
  
   -c clients
@@ -307,14 +326,15 @@ pgbench  options  dbname
  
 
  
-  -f filename
-  --file=filename
+  -f filename[@weight]
+  --file=filename[@weight]
   

-Read transaction script from filename.
+Add a transaction script read from filename to
+the list of executed scripts.
+An optional integer weight after @ allows to adjust the
+probability of drawing the test.
 See below for details.
--N, -S, and -f
-are mutually exclusive.

   
  
@@ -404,10 +424,7 @@ pgbench  options  dbname
   --skip-some-updates
   

-Do not update pgbench_tellers and
-pgbench_branches.
-This will avoid update contention on these tables, but
-it makes the test case even less like TPC-B.
+Shorthand for -b simple-update@1.

   
  
@@ -499,9 +516,9 @@ pgbench  options  dbname
 Report the specified scale factor in pgbench's
 output.  With the built-in tests, this is not necessary; the
 correct scale factor will be detected by counting the number of
-rows in the pgbench_branches table.  However, when testing
-custom benchmarks (-f option), the scale factor
-will be reported as 1 unless this option is used.
+rows in the pgbench_branches table.
+However, when testing only custom benchmarks (-f option),
+the scale factor will be reported as 1 unless this option is used.

   
  
@@ -511,7 +528,7 @@ pgbench  options  dbname
   --select-only
   

-Perform select-only transactions instead of TPC-B-like test.
+Shorthand for -b select-only@1.

   
  
@@ -661,7 +678,20 @@ pgbench  options  dbname
   What is the Transaction Actually Performed in pgbench?
 
   
-   The default transaction script issues seven commands per transaction:
+   Pgbench executes test scripts chosen randomly from a specified list.
+   They include built-in scripts with -b and
+   user-provided custom scripts with -f.
+   Each script may be given a relative weight specified after a
+   @ so as to change its drawing probability.
+   The default weight is 1.
+ 
+
+  
+   The default builtin transaction script (also invoked with -b tpcb-like)
+   issues seven commands per transaction over randomly chosen aid,
+   tid, bid and balance.
+   The scenario is inspired by the TPC-B benchmark, but is not actually TPC-B,
+   hence the name.
   
 
   
@@ -675,9 +705,15 @@ pgbench  options  dbname
   
 
   
-   If you specify -N, steps 4 and 5 aren't included in the
-   transaction.  If you specify -S, only the SELECT is
-   issued.
+   If you select the simple-update builtin (also -N),
+   steps 4 and 5 aren't included in the transaction.
+   This will avoid update contention on these tables, but
+   it makes the test case even less like TPC-B.
+  
+
+  
+   If you select the select-only builtin (also -S),
+   only the SELECT is issued.
   
  
 
@@ -689,10 +725,7 @@ pgbench  options  dbname
benchmark scenarios by replacing the default transaction script
(described above) with a transaction script read from a file
(-f option).  In this case a transaction
-   counts as one execution of a script file.  You can even specify
-   multiple scripts (multiple -f options), in which
-   case a random one of the scripts

Re: [HACKERS] pgbench stats per script & other stuff

2015-09-03 Thread Fabien COELHO





Right now builtins and user defined scripts are stored in different data
structures. I'd rather see them in the same.


They already are in the same array (sql_script) when pre-processed and 
executed, there is no distinction beyond initialization.


The builtin_script array contains the equivalent of the external custom 
file (name, lines of code), so that they can be processed by 
process_builtin and addScript to build the SQLScript ready for execution, 
while for external files it relies on process_file and addScript.


--
Fabien.


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] pgbench stats per script & other stuff

2015-09-02 Thread Andres Freund

On 2015-07-30 18:03:56 +0200, Fabien COELHO wrote:
> 
> >v6 is just a rebase after a bug fix by Andres Freund.
> >
> >Also a small question: The patch currently displays pgbench scripts
> >starting numbering at 0. Probably a little too geek... should start at 1?
> 
> v7 is a rebase after another small bug fix in pgbench.
> 
> -- 
> Fabien.

> diff --git a/doc/src/sgml/ref/pgbench.sgml b/doc/src/sgml/ref/pgbench.sgml
> index 2517a3a..99670d4 100644
> --- a/doc/src/sgml/ref/pgbench.sgml
> +++ b/doc/src/sgml/ref/pgbench.sgml
> @@ -261,6 +261,23 @@ pgbench  options  
> dbname
>  benchmarking arguments:
>  
>  
> + 
> +  -b scriptname[@weight]
> +  --builtin scriptname[@weight]
> +  
> +   
> +Add the specified builtin script to the list of executed scripts.
> +An optional integer weight after @ allows to adjust the
> +probability of drawing the test.
> +Available builtin scripts are: tpcb-like,
> +simple-update and select-only.
> +The provided scriptname needs only to be a prefix
> +of the builtin name, hence simp would be enough to select
> +simple-update.
> +   
> +  
> + 

Maybe add --builtin list to show them?

> @@ -404,10 +422,7 @@ pgbench  options  
> dbname
>--skip-some-updates
>
> 
> -Do not update pgbench_tellers and
> -pgbench_branches.
> -This will avoid update contention on these tables, but
> -it makes the test case even less like TPC-B.
> +Shorthand for -b simple-update@1.
> 
>
>   

> @@ -511,7 +526,7 @@ pgbench  options  
> dbname
>--select-only
>
> 
> -Perform select-only transactions instead of TPC-B-like test.
> +Shorthand for -b select-only@1.
> 
>
>   

I'm a bit inclined to remove these options.

>
> -   The default transaction script issues seven commands per transaction:
> +   Pgbench executes test scripts chosen randomly from a specified list.
> +   They include built-in scripts with -b and
> +   user-provided custom scripts with -f.
> +   Each script may be given a relative weight specified after a
> +   @ so as to change its drawing probability.
> +   The default weight is 1.
> + 

I'm wondering if percentages instead of weights would be a better
idea. That'd mean you'd be forced to be more careful when adding another
script (having to adjust the percentages of other scripts) but arguably
that's a good thing?

> +static SQLScript sql_script[MAX_SCRIPTS];
> +static struct {
> + char *name;   /* very short name for -b ...*/
> + char *desc;   /* short description */
> + char *script; /* actual pgbench script */
> +} builtin_script[]

Can't we put these in the same array?

> + printf("transaction type: %s\n",
> +num_scripts == 1? sql_script[0].name: "multiple scripts");

Seems like it'd be more useful to simply always list the scripts +
weights here.

Greetings,

Andres Freund


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] pgbench stats per script & other stuff

2015-09-02 Thread Robert Haas

On Wed, Sep 2, 2015 at 5:55 PM, Andres Freund  wrote:
> On 2015-09-02 14:36:51 -0400, Robert Haas wrote:
>> On Wed, Sep 2, 2015 at 2:20 PM, Fabien COELHO  wrote:
>> >> I'm wondering if percentages instead of weights would be a better
>> >> idea. That'd mean you'd be forced to be more careful when adding another
>> >> script (having to adjust the percentages of other scripts) but arguably
>> >> that's a good thing?
>> >
>> > If you use only percent, then you have to check that the total is 100,
>> > probably you have to use floats, to do something when the total is not 100,
>> > checking would complicate the code and test people mental calculus
>> > abilities. Not sure this is a good idea:-)
>>
>> I agree.  I don't see a reason to enforce that the total of the
>> weights must be 100.
>
> I'm slightly worried that using weights will be a bit confusing because
> adding another script will obviously reduce the frequency of already
> defined scripts. But it's probably not worth worrying.

That sounds like a feature to me, not a bug.  I wouldn't worry.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] pgbench stats per script & other stuff

2015-09-02 Thread Fabien COELHO



Hello Andres,


Maybe add --builtin list to show them?


Yep, easy enough.


[...]
+Shorthand for -b simple-update@1.
+Shorthand for -b select-only@1.


I'm a bit inclined to remove these options.


Hm...

This is really backward compatibility, and people may find reference to 
these in blogs or elswhere, so I think that it would make sense to

be upward compatible.

I would certainly be against adding any other of these options, though.


+   Each script may be given a relative weight specified after a
+   @ so as to change its drawing probability.
+   The default weight is 1.


I'm wondering if percentages instead of weights would be a better
idea. That'd mean you'd be forced to be more careful when adding another
script (having to adjust the percentages of other scripts) but arguably
that's a good thing?


If you use only percent, then you have to check that the total is 100, 
probably you have to use floats, to do something when the total is not 
100, checking would complicate the code and test people mental calculus 
abilities. Not sure this is a good idea:-)


In the use case you outline, when adding a script, maybe you know that it 
runs "as much as" this other script, so you can pick up the same weight 
without bothering.


Also, when testing, there is an issue when you want to remove one script 
for a quick test, and that would mean changing all percentages on the 
command line...


So I would advise not to put such a constraint.


+static SQLScript sql_script[MAX_SCRIPTS];

+static struct {
+   char *name;   /* very short name for -b ...*/
+   char *desc;   /* short description */
+   char *script; /* actual pgbench script */
+} builtin_script[]


Can't we put these in the same array?


I do not understand.


+   printf("transaction type: %s\n",
+  num_scripts == 1? sql_script[0].name: "multiple scripts");


Seems like it'd be more useful to simply always list the scripts +
weights here.


The detailed list is shown later, with the summary performance figure for 
each scripts, so ISTM that it would be redundant? Maybe the transaction 
type could be moved downwards just before said list?


--
Fabien.


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] pgbench stats per script & other stuff

2015-09-02 Thread Robert Haas

On Wed, Sep 2, 2015 at 2:20 PM, Fabien COELHO  wrote:
>> I'm wondering if percentages instead of weights would be a better
>> idea. That'd mean you'd be forced to be more careful when adding another
>> script (having to adjust the percentages of other scripts) but arguably
>> that's a good thing?
>
> If you use only percent, then you have to check that the total is 100,
> probably you have to use floats, to do something when the total is not 100,
> checking would complicate the code and test people mental calculus
> abilities. Not sure this is a good idea:-)

I agree.  I don't see a reason to enforce that the total of the
weights must be 100.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] pgbench stats per script & other stuff

2015-09-02 Thread Andres Freund

On 2015-09-02 14:36:51 -0400, Robert Haas wrote:
> On Wed, Sep 2, 2015 at 2:20 PM, Fabien COELHO  wrote:
> >> I'm wondering if percentages instead of weights would be a better
> >> idea. That'd mean you'd be forced to be more careful when adding another
> >> script (having to adjust the percentages of other scripts) but arguably
> >> that's a good thing?
> >
> > If you use only percent, then you have to check that the total is 100,
> > probably you have to use floats, to do something when the total is not 100,
> > checking would complicate the code and test people mental calculus
> > abilities. Not sure this is a good idea:-)
> 
> I agree.  I don't see a reason to enforce that the total of the
> weights must be 100.

I'm slightly worried that using weights will be a bit confusing because
adding another script will obviously reduce the frequency of already
defined scripts. But it's probably not worth worrying.

Andres


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] pgbench stats per script other stuff

2015-08-17 Thread Fabien COELHO




v7 is a rebase after another small bug fix in pgbench.


v8 is a rebase after yet another small bug fix in pgbench.

--
Fabien.diff --git a/doc/src/sgml/ref/pgbench.sgml b/doc/src/sgml/ref/pgbench.sgml
index 2517a3a..99670d4 100644
--- a/doc/src/sgml/ref/pgbench.sgml
+++ b/doc/src/sgml/ref/pgbench.sgml
@@ -261,6 +261,23 @@ pgbench optional replaceableoptions/ /optional replaceabledbname/
 benchmarking arguments:
 
 variablelist
+ varlistentry
+  termoption-b/ replaceablescriptname[@weight]//term
+  termoption--builtin/ replaceablescriptname[@weight]//term
+  listitem
+   para
+Add the specified builtin script to the list of executed scripts.
+An optional integer weight after literal@/ allows to adjust the
+probability of drawing the test.
+Available builtin scripts are: literaltpcb-like/,
+literalsimple-update/ and literalselect-only/.
+The provided repleacablescriptname/ needs only to be a prefix
+of the builtin name, hence literalsimp/ would be enough to select
+literalsimple-update/.
+   /para
+  /listitem
+ /varlistentry
+
 
  varlistentry
   termoption-c/option replaceableclients//term
@@ -307,14 +324,15 @@ pgbench optional replaceableoptions/ /optional replaceabledbname/
  /varlistentry
 
  varlistentry
-  termoption-f/option replaceablefilename//term
-  termoption--file=/optionreplaceablefilename//term
+  termoption-f/ replaceablefilename[@weight]//term
+  termoption--file=/replaceablefilename[@weight]//term
   listitem
para
-Read transaction script from replaceablefilename/.
+Add a transaction script read from replaceablefilename/ to
+the list of executed scripts.
+An optional integer weight after literal@/ allows to adjust the
+probability of drawing the test.
 See below for details.
-option-N/option, option-S/option, and option-f/option
-are mutually exclusive.
/para
   /listitem
  /varlistentry
@@ -404,10 +422,7 @@ pgbench optional replaceableoptions/ /optional replaceabledbname/
   termoption--skip-some-updates/option/term
   listitem
para
-Do not update structnamepgbench_tellers/ and
-structnamepgbench_branches/.
-This will avoid update contention on these tables, but
-it makes the test case even less like TPC-B.
+Shorthand for option-b simple-update@1/.
/para
   /listitem
  /varlistentry
@@ -499,9 +514,9 @@ pgbench optional replaceableoptions/ /optional replaceabledbname/
 Report the specified scale factor in applicationpgbench/'s
 output.  With the built-in tests, this is not necessary; the
 correct scale factor will be detected by counting the number of
-rows in the structnamepgbench_branches/ table.  However, when testing
-custom benchmarks (option-f/ option), the scale factor
-will be reported as 1 unless this option is used.
+rows in the structnamepgbench_branches/ table.
+However, when testing only custom benchmarks (option-f/ option),
+the scale factor will be reported as 1 unless this option is used.
/para
   /listitem
  /varlistentry
@@ -511,7 +526,7 @@ pgbench optional replaceableoptions/ /optional replaceabledbname/
   termoption--select-only/option/term
   listitem
para
-Perform select-only transactions instead of TPC-B-like test.
+Shorthand for option-b select-only@1/.
/para
   /listitem
  /varlistentry
@@ -567,6 +582,16 @@ pgbench optional replaceableoptions/ /optional replaceabledbname/
  /varlistentry
 
  varlistentry
+  termoption--per-script-stats/option/term
+  listitem
+   para
+Report some statistics per script run by pgbench.
+   /para
+  /listitem
+ /varlistentry
+
+
+ varlistentry
   termoption--sampling-rate=replaceablerate//option/term
   listitem
para
@@ -661,7 +686,20 @@ pgbench optional replaceableoptions/ /optional replaceabledbname/
   titleWhat is the quoteTransaction/ Actually Performed in pgbench?/title
 
   para
-   The default transaction script issues seven commands per transaction:
+   Pgbench executes test scripts chosen randomly from a specified list.
+   They include built-in scripts with option-b/ and
+   user-provided custom scripts with option-f/.
+   Each script may be given a relative weight specified after a
+   literal@/ so as to change its drawing probability.
+   The default weight is literal1/.
+ /para
+
+  para
+   The default builtin transaction script (also invoked with option-b tpcb-like/)
+   issues seven commands per transaction over randomly chosen literalaid/,
+   literaltid/, literalbid/ and literalbalance/.
+   The scenario is inspired by the TPC-B benchmark, but is not actually TPC-B,
+   hence the name.
   /para
 
   orderedlist
@@ -675,9

Re: [HACKERS] pgbench stats per script other stuff

2015-07-30 Thread Fabien COELHO




v6 is just a rebase after a bug fix by Andres Freund.

Also a small question: The patch currently displays pgbench scripts 
starting numbering at 0. Probably a little too geek... should start at 
1?


v7 is a rebase after another small bug fix in pgbench.

--
Fabien.diff --git a/doc/src/sgml/ref/pgbench.sgml b/doc/src/sgml/ref/pgbench.sgml
index 2517a3a..99670d4 100644
--- a/doc/src/sgml/ref/pgbench.sgml
+++ b/doc/src/sgml/ref/pgbench.sgml
@@ -261,6 +261,23 @@ pgbench optional replaceableoptions/ /optional replaceabledbname/
 benchmarking arguments:
 
 variablelist
+ varlistentry
+  termoption-b/ replaceablescriptname[@weight]//term
+  termoption--builtin/ replaceablescriptname[@weight]//term
+  listitem
+   para
+Add the specified builtin script to the list of executed scripts.
+An optional integer weight after literal@/ allows to adjust the
+probability of drawing the test.
+Available builtin scripts are: literaltpcb-like/,
+literalsimple-update/ and literalselect-only/.
+The provided repleacablescriptname/ needs only to be a prefix
+of the builtin name, hence literalsimp/ would be enough to select
+literalsimple-update/.
+   /para
+  /listitem
+ /varlistentry
+
 
  varlistentry
   termoption-c/option replaceableclients//term
@@ -307,14 +324,15 @@ pgbench optional replaceableoptions/ /optional replaceabledbname/
  /varlistentry
 
  varlistentry
-  termoption-f/option replaceablefilename//term
-  termoption--file=/optionreplaceablefilename//term
+  termoption-f/ replaceablefilename[@weight]//term
+  termoption--file=/replaceablefilename[@weight]//term
   listitem
para
-Read transaction script from replaceablefilename/.
+Add a transaction script read from replaceablefilename/ to
+the list of executed scripts.
+An optional integer weight after literal@/ allows to adjust the
+probability of drawing the test.
 See below for details.
-option-N/option, option-S/option, and option-f/option
-are mutually exclusive.
/para
   /listitem
  /varlistentry
@@ -404,10 +422,7 @@ pgbench optional replaceableoptions/ /optional replaceabledbname/
   termoption--skip-some-updates/option/term
   listitem
para
-Do not update structnamepgbench_tellers/ and
-structnamepgbench_branches/.
-This will avoid update contention on these tables, but
-it makes the test case even less like TPC-B.
+Shorthand for option-b simple-update@1/.
/para
   /listitem
  /varlistentry
@@ -499,9 +514,9 @@ pgbench optional replaceableoptions/ /optional replaceabledbname/
 Report the specified scale factor in applicationpgbench/'s
 output.  With the built-in tests, this is not necessary; the
 correct scale factor will be detected by counting the number of
-rows in the structnamepgbench_branches/ table.  However, when testing
-custom benchmarks (option-f/ option), the scale factor
-will be reported as 1 unless this option is used.
+rows in the structnamepgbench_branches/ table.
+However, when testing only custom benchmarks (option-f/ option),
+the scale factor will be reported as 1 unless this option is used.
/para
   /listitem
  /varlistentry
@@ -511,7 +526,7 @@ pgbench optional replaceableoptions/ /optional replaceabledbname/
   termoption--select-only/option/term
   listitem
para
-Perform select-only transactions instead of TPC-B-like test.
+Shorthand for option-b select-only@1/.
/para
   /listitem
  /varlistentry
@@ -567,6 +582,16 @@ pgbench optional replaceableoptions/ /optional replaceabledbname/
  /varlistentry
 
  varlistentry
+  termoption--per-script-stats/option/term
+  listitem
+   para
+Report some statistics per script run by pgbench.
+   /para
+  /listitem
+ /varlistentry
+
+
+ varlistentry
   termoption--sampling-rate=replaceablerate//option/term
   listitem
para
@@ -661,7 +686,20 @@ pgbench optional replaceableoptions/ /optional replaceabledbname/
   titleWhat is the quoteTransaction/ Actually Performed in pgbench?/title
 
   para
-   The default transaction script issues seven commands per transaction:
+   Pgbench executes test scripts chosen randomly from a specified list.
+   They include built-in scripts with option-b/ and
+   user-provided custom scripts with option-f/.
+   Each script may be given a relative weight specified after a
+   literal@/ so as to change its drawing probability.
+   The default weight is literal1/.
+ /para
+
+  para
+   The default builtin transaction script (also invoked with option-b tpcb-like/)
+   issues seven commands per transaction over randomly chosen literalaid/,
+   literaltid/, literalbid/ and

Re: [HACKERS] pgbench stats per script other stuff

2015-07-28 Thread Fabien COELHO



v6 is just a rebase after a bug fix by Andres Freund.

Also a small question: The patch currently displays pgbench scripts 
starting numbering at 0. Probably a little too geek... should start at 1?


--
Fabien.diff --git a/doc/src/sgml/ref/pgbench.sgml b/doc/src/sgml/ref/pgbench.sgml
index 2517a3a..99670d4 100644
--- a/doc/src/sgml/ref/pgbench.sgml
+++ b/doc/src/sgml/ref/pgbench.sgml
@@ -261,6 +261,23 @@ pgbench optional replaceableoptions/ /optional replaceabledbname/
 benchmarking arguments:
 
 variablelist
+ varlistentry
+  termoption-b/ replaceablescriptname[@weight]//term
+  termoption--builtin/ replaceablescriptname[@weight]//term
+  listitem
+   para
+Add the specified builtin script to the list of executed scripts.
+An optional integer weight after literal@/ allows to adjust the
+probability of drawing the test.
+Available builtin scripts are: literaltpcb-like/,
+literalsimple-update/ and literalselect-only/.
+The provided repleacablescriptname/ needs only to be a prefix
+of the builtin name, hence literalsimp/ would be enough to select
+literalsimple-update/.
+   /para
+  /listitem
+ /varlistentry
+
 
  varlistentry
   termoption-c/option replaceableclients//term
@@ -307,14 +324,15 @@ pgbench optional replaceableoptions/ /optional replaceabledbname/
  /varlistentry
 
  varlistentry
-  termoption-f/option replaceablefilename//term
-  termoption--file=/optionreplaceablefilename//term
+  termoption-f/ replaceablefilename[@weight]//term
+  termoption--file=/replaceablefilename[@weight]//term
   listitem
para
-Read transaction script from replaceablefilename/.
+Add a transaction script read from replaceablefilename/ to
+the list of executed scripts.
+An optional integer weight after literal@/ allows to adjust the
+probability of drawing the test.
 See below for details.
-option-N/option, option-S/option, and option-f/option
-are mutually exclusive.
/para
   /listitem
  /varlistentry
@@ -404,10 +422,7 @@ pgbench optional replaceableoptions/ /optional replaceabledbname/
   termoption--skip-some-updates/option/term
   listitem
para
-Do not update structnamepgbench_tellers/ and
-structnamepgbench_branches/.
-This will avoid update contention on these tables, but
-it makes the test case even less like TPC-B.
+Shorthand for option-b simple-update@1/.
/para
   /listitem
  /varlistentry
@@ -499,9 +514,9 @@ pgbench optional replaceableoptions/ /optional replaceabledbname/
 Report the specified scale factor in applicationpgbench/'s
 output.  With the built-in tests, this is not necessary; the
 correct scale factor will be detected by counting the number of
-rows in the structnamepgbench_branches/ table.  However, when testing
-custom benchmarks (option-f/ option), the scale factor
-will be reported as 1 unless this option is used.
+rows in the structnamepgbench_branches/ table.
+However, when testing only custom benchmarks (option-f/ option),
+the scale factor will be reported as 1 unless this option is used.
/para
   /listitem
  /varlistentry
@@ -511,7 +526,7 @@ pgbench optional replaceableoptions/ /optional replaceabledbname/
   termoption--select-only/option/term
   listitem
para
-Perform select-only transactions instead of TPC-B-like test.
+Shorthand for option-b select-only@1/.
/para
   /listitem
  /varlistentry
@@ -567,6 +582,16 @@ pgbench optional replaceableoptions/ /optional replaceabledbname/
  /varlistentry
 
  varlistentry
+  termoption--per-script-stats/option/term
+  listitem
+   para
+Report some statistics per script run by pgbench.
+   /para
+  /listitem
+ /varlistentry
+
+
+ varlistentry
   termoption--sampling-rate=replaceablerate//option/term
   listitem
para
@@ -661,7 +686,20 @@ pgbench optional replaceableoptions/ /optional replaceabledbname/
   titleWhat is the quoteTransaction/ Actually Performed in pgbench?/title
 
   para
-   The default transaction script issues seven commands per transaction:
+   Pgbench executes test scripts chosen randomly from a specified list.
+   They include built-in scripts with option-b/ and
+   user-provided custom scripts with option-f/.
+   Each script may be given a relative weight specified after a
+   literal@/ so as to change its drawing probability.
+   The default weight is literal1/.
+ /para
+
+  para
+   The default builtin transaction script (also invoked with option-b tpcb-like/)
+   issues seven commands per transaction over randomly chosen literalaid/,
+   literaltid/, literalbid/ and literalbalance/.
+   The scenario is inspired by the TPC-B benchmark,

Re: [HACKERS] pgbench stats per script other stuff

2015-07-25 Thread Fabien COELHO




I liked @ because it makes sense to read it as the word at.


Yep, why not.

Prepending classic to the names does not look necessary. I would 
suggest tpcb-like, simple-update  select-only, or even maybe any 
prefix. If the bench scripts could be read from some pg directory 
instead of being actually inlined, even more code could be dropped from 
pgbench.


I think including classic would be a very good idea.


Hmm. This is the command line, you have to type them! With a prefix-based
approach this suggests that the builtin names must start differently so as 
to be easily selected.


We might want to add a TPC-C like workload in the future, or any number 
of other things.  Naming things in a good way from the outset can only 
make that easier.


Here is a v4 which:

 - removes -w stuff

 - enhance -f with @weight

 - adds -b/--builtin name@weight, based on prefix

   builtin names are: tpcb-like, simple-update  select-only,
   which matches their more or less historical names
   (although I wasn't sure of tpcb-sort-of, so I put tpcb-like)

 - removes -B (now can be accessed with -b tpcb-like)

Pgbench builtin scripts are still inlined in the code, not in a separate 
directory, which might be an option to simplify the code and allow easy 
extensions.


I still think that the --per-script-stats option is useless and per 
script stats should always be on as soon as several scripts are running.


Even more, I think that stats (maybe no per-command stat though) should
always be collected. The point of pgbench is to collect data, and the
basic end-of-run tps summary is very terse and does not reflect much
of what happened during the run.

Also, maybe per-command detailed stats should use the same common struct 
to hold data as all other stats. I did not change it because it is 
maintained in a different part of the code.


--
Fabien.diff --git a/doc/src/sgml/ref/pgbench.sgml b/doc/src/sgml/ref/pgbench.sgml
index 2517a3a..dc36f5d 100644
--- a/doc/src/sgml/ref/pgbench.sgml
+++ b/doc/src/sgml/ref/pgbench.sgml
@@ -260,7 +260,22 @@ pgbench optional replaceableoptions/ /optional replaceabledbname/
 applicationpgbench/application accepts the following command-line
 benchmarking arguments:
 
-variablelist
+  termoption-b/ replaceablescriptname[@weight]//term
+  termoption--builtin/ replaceablescriptname[@weight]//term
+  listitem
+   para
+Add the specified builtin script to the list of executed scripts.
+An optional integer weight after literal@/ allows to adjust the
+probability of drawing the test.
+Available builtin scripts are: literaltpcb-like/,
+literalsimple-update/ and literalselect-only/.
+The provided repleacablescriptname/ needs only to be a prefix
+of the builtin name, hence literalsimp/ would be enough to select
+literalsimple-update/.
+   /para
+  /listitem
+ /varlistentry
+
 
  varlistentry
   termoption-c/option replaceableclients//term
@@ -307,14 +322,15 @@ pgbench optional replaceableoptions/ /optional replaceabledbname/
  /varlistentry
 
  varlistentry
-  termoption-f/option replaceablefilename//term
-  termoption--file=/optionreplaceablefilename//term
+  termoption-f/ replaceablefilename[@weight]//term
+  termoption--file=/replaceablefilename[@weight]//term
   listitem
para
-Read transaction script from replaceablefilename/.
+Add a transaction script read from replaceablefilename/ to
+the list of executed scripts.
+An optional integer weight after literal@/ allows to adjust the
+probability of drawing the test.
 See below for details.
-option-N/option, option-S/option, and option-f/option
-are mutually exclusive.
/para
   /listitem
  /varlistentry
@@ -404,10 +420,7 @@ pgbench optional replaceableoptions/ /optional replaceabledbname/
   termoption--skip-some-updates/option/term
   listitem
para
-Do not update structnamepgbench_tellers/ and
-structnamepgbench_branches/.
-This will avoid update contention on these tables, but
-it makes the test case even less like TPC-B.
+Shorthand for option-b simple-update@1/.
/para
   /listitem
  /varlistentry
@@ -499,9 +512,9 @@ pgbench optional replaceableoptions/ /optional replaceabledbname/
 Report the specified scale factor in applicationpgbench/'s
 output.  With the built-in tests, this is not necessary; the
 correct scale factor will be detected by counting the number of
-rows in the structnamepgbench_branches/ table.  However, when testing
-custom benchmarks (option-f/ option), the scale factor
-will be reported as 1 unless this option is used.
+rows in the structnamepgbench_branches/ table.
+However, when testing only custom benchmarks (option-f/ option),
+the scale factor will be reported

Re: [HACKERS] pgbench stats per script other stuff

2015-07-25 Thread Fabien COELHO



Also, maybe per-command detailed stats should use the same common struct 
to hold data as all other stats. I did not change it because it is 
maintained in a different part of the code.


I played just once with the --report-latencies option and was astonished 
that meta commands showed negative latencies...


This v5 also fixes this bug (on meta commands there is a goto loop in 
doCustom, but as now was not reset the stmt_begin ended up being after 
now, hence accumulating increasing negative times) and in passing uses the 
same stats structure as the rest, which result in removing some more code. 
The report-latencies option is made to imply per script stats, which 
simplifies the final output code, and if you want per-command per-script 
stats, probably providing the per-script stats, i.e. the sum of the 
commands, make sense.


--
Fabien.diff --git a/doc/src/sgml/ref/pgbench.sgml b/doc/src/sgml/ref/pgbench.sgml
index 2517a3a..dc36f5d 100644
--- a/doc/src/sgml/ref/pgbench.sgml
+++ b/doc/src/sgml/ref/pgbench.sgml
@@ -260,7 +260,22 @@ pgbench optional replaceableoptions/ /optional replaceabledbname/
 applicationpgbench/application accepts the following command-line
 benchmarking arguments:
 
-variablelist
+  termoption-b/ replaceablescriptname[@weight]//term
+  termoption--builtin/ replaceablescriptname[@weight]//term
+  listitem
+   para
+Add the specified builtin script to the list of executed scripts.
+An optional integer weight after literal@/ allows to adjust the
+probability of drawing the test.
+Available builtin scripts are: literaltpcb-like/,
+literalsimple-update/ and literalselect-only/.
+The provided repleacablescriptname/ needs only to be a prefix
+of the builtin name, hence literalsimp/ would be enough to select
+literalsimple-update/.
+   /para
+  /listitem
+ /varlistentry
+
 
  varlistentry
   termoption-c/option replaceableclients//term
@@ -307,14 +322,15 @@ pgbench optional replaceableoptions/ /optional replaceabledbname/
  /varlistentry
 
  varlistentry
-  termoption-f/option replaceablefilename//term
-  termoption--file=/optionreplaceablefilename//term
+  termoption-f/ replaceablefilename[@weight]//term
+  termoption--file=/replaceablefilename[@weight]//term
   listitem
para
-Read transaction script from replaceablefilename/.
+Add a transaction script read from replaceablefilename/ to
+the list of executed scripts.
+An optional integer weight after literal@/ allows to adjust the
+probability of drawing the test.
 See below for details.
-option-N/option, option-S/option, and option-f/option
-are mutually exclusive.
/para
   /listitem
  /varlistentry
@@ -404,10 +420,7 @@ pgbench optional replaceableoptions/ /optional replaceabledbname/
   termoption--skip-some-updates/option/term
   listitem
para
-Do not update structnamepgbench_tellers/ and
-structnamepgbench_branches/.
-This will avoid update contention on these tables, but
-it makes the test case even less like TPC-B.
+Shorthand for option-b simple-update@1/.
/para
   /listitem
  /varlistentry
@@ -499,9 +512,9 @@ pgbench optional replaceableoptions/ /optional replaceabledbname/
 Report the specified scale factor in applicationpgbench/'s
 output.  With the built-in tests, this is not necessary; the
 correct scale factor will be detected by counting the number of
-rows in the structnamepgbench_branches/ table.  However, when testing
-custom benchmarks (option-f/ option), the scale factor
-will be reported as 1 unless this option is used.
+rows in the structnamepgbench_branches/ table.
+However, when testing only custom benchmarks (option-f/ option),
+the scale factor will be reported as 1 unless this option is used.
/para
   /listitem
  /varlistentry
@@ -511,7 +524,7 @@ pgbench optional replaceableoptions/ /optional replaceabledbname/
   termoption--select-only/option/term
   listitem
para
-Perform select-only transactions instead of TPC-B-like test.
+Shorthand for option-b select-only@1/.
/para
   /listitem
  /varlistentry
@@ -567,6 +580,16 @@ pgbench optional replaceableoptions/ /optional replaceabledbname/
  /varlistentry
 
  varlistentry
+  termoption--per-script-stats/option/term
+  listitem
+   para
+Report some statistics per script run by pgbench.
+   /para
+  /listitem
+ /varlistentry
+
+
+ varlistentry
   termoption--sampling-rate=replaceablerate//option/term
   listitem
para
@@ -661,7 +684,20 @@ pgbench optional replaceableoptions/ /optional replaceabledbname/
   titleWhat is the quoteTransaction/ Actually Performed in pgbench?/title

Re: [HACKERS] pgbench stats per script other stuff

2015-07-23 Thread Robert Haas

On Wed, Jul 22, 2015 at 1:54 PM, Josh Berkus j...@agliodbs.com wrote:
 If so, I would vote for:

 -f script1.bench:3 -f script2.bench:1

 over:

 -f script1.bench -w 3 -f script2.bench -w 1

 Making command-line options order-dependant breaks a lot of system call
 libraries in various languages, as well as being easy to mess up.

Yes, I think that's a good idea.  I don't know whether : is the right
separator; I kind of line @.  But that's bikeshedding.

As Fabien mentions further downthread, it would be nice to set weights
for the built-ins.  I'd actually like to introduce a new pgbench
option that selects a builtin script by name, so that we can have more
than three of them without running out of option names (or going
insane).  So suppose we introduce pgbench -b BUILTIN_NAME, where
BUILTIN_NAME is initially one of these:

classic
classic-simple-update
classic-select-only

Then you can do pgbench -b classic@1 -b classic-select-only@9 or
similar to get 10% write, 90% read.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] pgbench stats per script other stuff

2015-07-23 Thread Fabien COELHO




Yes, I think that's a good idea.  I don't know whether : is the right
separator; I kind of line @.  But that's bikeshedding.


Possible ASCII contenders should avoid shell and filename interaction, 
which exclude * ? !  /   [ ] . - $ and so on: those that seem to 
remain are @ , = : % # +. I like % because this is about sharing, 
although this is not a percentage.


I'd actually like to introduce a new pgbench option that selects a 
builtin script by name, so that we can have more than three of them 
without running out of option names (or going insane).  So suppose we 
introduce pgbench -b BUILTIN_NAME, where BUILTIN_NAME is initially one 
of these:

classic, classic-simple-update, classic-select-only

Then you can do pgbench -b classic@1 -b classic-select-only@9 or
similar to get 10% write, 90% read.


I like this idea, as -b/-f would be symmetric. Prepending classic to the 
names does not look necessary. I would suggest tpcb-like, 
simple-update  select-only, or even maybe any prefix. If the bench 
scripts could be read from some pg directory instead of being actually 
inlined, even more code could be dropped from pgbench.


--
Fabien.


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] pgbench stats per script other stuff

2015-07-23 Thread Alvaro Herrera

Fabien COELHO wrote:
 
 [...] and that a subsequent -w modifies the meaning of the
 script-specifiying argument already read. That strikes me as a very
 unintuitive interface.
 
 Ok, I understand this afterward modification objection.
 
 What if the -w would be required *before*, and supply a weight for (the
 first/maybe all) script(s) specified *afterwards*, so it does not modify
 something already provided? I think it would be more intuitive, or at
 least less surprising.
 
 Here is a v3 which does that. If there is a better idea, do not hesitate!

This seems a moderately reasonable interface to me.  There are other
programs that behave in that way, and once you get used to the idea, it
makes sense.

I think for complete consistency we would have to require that -w is
specified for all scripts or none of them.  I am not sure if this means
that it's okay to have later scripts use a weight specified for a
previous one (i.e. it's only an error to fail to specify a weight for
options before the first -w), or each -f must have always its own -w
explicitely.  In other words,
pg_bench -w2 -f script1.sql -f script2.sql
either script2 has weight 2, or it's an error, depending on what we
decide; but
pg_bench -f script1.sql -w 2 -fscript2.sql
is always an error.

-- 
Álvaro Herrerahttp://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training  Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] pgbench stats per script other stuff

2015-07-22 Thread Josh Berkus

On 07/21/2015 10:25 PM, Fabien COELHO wrote:
 
 Hello Josh,
 
 Maybe -f file.sql:weight (yuk from my point of view, but it can be done
 easily).

 Maybe it's past time for pgbench to have a config file?
 
 That is an idea.  For simple usage, for backward compatibility and for
 people like me who like them, ISTM that options are fine too:-)
 
 Also this may mean adding a dependency to some YAML library, configure
 issues (I'm not sure whether pg currently uses YAML, and JSON is quite
 verbose), maybe conditionals around the feature to compile without the
 dependency, more documentation...
 
 I'm not sure all that is desirable just for weighting scripts.

Maybe not.

If so, I would vote for:

-f script1.bench:3 -f script2.bench:1

over:

-f script1.bench -w 3 -f script2.bench -w 1

Making command-line options order-dependant breaks a lot of system call
libraries in various languages, as well as being easy to mess up.

 Given that we want to define some per-workload options, the config file
 would probably need to be YAML or JSON, e.g.:

 [...]

 script1:
 file: script1.bench
 weight: 3
 script2:
 file: script2.bench
 weight: 1

 the above would execute a pgbench with 16 clients, 4 threads, script1
 three times as often as script2, and report stats at the script (rather
 than SQL statement) level.
 
 Yep. Probably numbering within field names should be avoided, so a list
 of records that could look like:

Oh, you misunderstand.  script1 and script2 are meant to be
user-supplied names which then get reported in things like response time
output.  They're labels. Better example:

deposit:
file: deposit.bench
weight: 3
withdrawal:
file: withdrawal.bench
weight: 3
reporting:
file: summary_report.bench
weigh: 1

-- 
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] pgbench stats per script other stuff

2015-07-22 Thread Fabien COELHO




If so, I would vote for:
  -f script1.bench:3 -f script2.bench:1
over:
  -f script1.bench -w 3 -f script2.bench -w 1


Ok, I'll take that into consideration. Any other opinion out there? The 
current v3 version is:


  -w 3 -f script1.bench -w 1 -f script2.bench

With provision to generate errors if a -w is set but not used,
in two case.

 - in the middle ... -w 4 no script option... -w 1 ...
 - in the end ... -w 1 no script option...

I can provide -f x:weight easilly, but this mean that there will be no way 
to associate weight for internal scripts. Not orthogonal, not very 
elegant, but no big deal.


Oh, you misunderstand.  script1 and script2 are meant to be 
user-supplied names which then get reported in things like response time 
output.  They're labels.


Ok, that is much better. This means that labels should not choose names 
which may interact with other commands, so maybe a list would have been 
nice as well. Anyway, I do not think it is the way to go just for this 
feature.


--
Fabien.


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] pgbench stats per script other stuff

2015-07-22 Thread Andres Freund

On 2015-07-22 10:54:14 -0700, Josh Berkus wrote:
 Making command-line options order-dependant breaks a lot of system call
 libraries in various languages, as well as being easy to mess up.

What?


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] pgbench stats per script other stuff

2015-07-21 Thread Robert Haas

On Tue, Jul 21, 2015 at 10:42 AM, Fabien COELHO coe...@cri.ensmp.fr wrote:
   sh ./pgbench -T 3 -B -N -w 2 -S -w 7 --per-script-stats

 That is a truly horrifying abuse of command-line arguments.  -1 from
 me, or minus more than one if I've got that many chits to burn.

 Are you against the -w, or against saying that pgbench execute scripts,
 whether internal or from files?

I'm against the idea that we accept multiple arguments for scripts,
and that a subsequent -w modifies the meaning of the
script-specifiying argument already read.  That strikes me as a very
unintuitive interface.  I'm not sure exactly what would be better at
the moment, but I think we need something better.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] pgbench stats per script other stuff

2015-07-21 Thread Fabien COELHO


5~5~5~

That is a truly horrifying abuse of command-line arguments.  -1 from
me, or minus more than one if I've got that many chits to burn.


Are you against the -w, or against saying that pgbench execute scripts,
whether internal or from files?


I'm against the idea that we accept multiple arguments for scripts,


Pgbench *currently* already accept multiple -f ... options, and this is 
a good thing to test realistic loads which may intermix several kind of 
transactions, say a lot of readonly and some update or insert, and very 
rare deletes...


Now if you do not need it you do not use it, and all is fine. Once you 
have several scripts, being able to weight them becomes useful for 
realism.


and that a subsequent -w modifies the meaning of the script-specifiying 
argument already read. That strikes me as a very unintuitive interface.


Ok, I understand this afterward modification objection.

What if the -w would be required *before*, and supply a weight for (the 
first/maybe all) script(s) specified *afterwards*, so it does not modify 
something already provided? I think it would be more intuitive, or at 
least less surprising.


I'm not sure exactly what would be better at the moment, but I think we 
need something better.


Maybe -f file.sql:weight (yuk from my point of view, but it can be 
done easily).


--
Fabien.


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] pgbench stats per script other stuff

2015-07-21 Thread Fabien COELHO



Hello Robert,


Pgbench *currently* already accept multiple -f ... options, and this is a
good thing to test realistic loads which may intermix several kind of
transactions, say a lot of readonly and some update or insert, and very rare
deletes...


Hmm, I didn't realize that.  The code looks a bit inconsistent right
now - e.g. we do support multiple files, but pgbench's options-parsing
loop sets ttype to a value that depends only on the last of -f, -N,
and -S encountered.


Indeed. However as with current pgbench nothing/-N/-S and -f are 
mutually exclusive it is ok to have ttype set as it is.


With the patch pgbench just executes scripts and the options are not 
mutually exclusive: some scripts are internal and others are not, but they 
are treated the same beyond initialization, which helps removing some code 
including the ttype variable you mention. The name of the script is kept 
in an SQLScript struct along with its commands, weight and stats.


--
Fabien.


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] pgbench stats per script other stuff

2015-07-21 Thread Fabien COELHO



[...] and that a subsequent -w modifies the meaning of the 
script-specifiying argument already read. That strikes me as a very 
unintuitive interface.


Ok, I understand this afterward modification objection.

What if the -w would be required *before*, and supply a weight for (the 
first/maybe all) script(s) specified *afterwards*, so it does not modify 
something already provided? I think it would be more intuitive, or at least 
less surprising.


Here is a v3 which does that. If there is a better idea, do not hesitate!

 sh ./pgbench -w 9 -f one.sql -f now.sql -T 2 -P 1 --per-script-stats
 starting vacuum...end.
 progress: 1.0 s, 24536.0 tps, lat 0.039 ms stddev 0.024
 progress: 2.0 s, 25963.8 tps, lat 0.038 ms stddev 0.015
 transaction type: multiple scripts
 scaling factor: 1
 query mode: simple
 number of clients: 1
 number of threads: 1
 duration: 2 s
 number of transactions actually processed: 50501
 latency average = 0.039 ms
 latency stddev = 0.020 ms
 tps = 25249.464772 (including connections establishing)
 tps = 25339.454154 (excluding connections establishing)
 SQL script 0, weight 9: one.sql
  - 45366 transactions (89.8% of total, tps = 22682.070035)
  - latency average = 0.038 ms
  - latency stddev = 0.016 ms
 SQL script 1, weight 1: now.sql
  - 5135 transactions (10.2% of total, tps = 2567.394737)
  - latency average = 0.044 ms
  - latency stddev = 0.041 ms

--
Fabien.diff --git a/doc/src/sgml/ref/pgbench.sgml b/doc/src/sgml/ref/pgbench.sgml
index 2517a3a..6bac511 100644
--- a/doc/src/sgml/ref/pgbench.sgml
+++ b/doc/src/sgml/ref/pgbench.sgml
@@ -261,6 +261,18 @@ pgbench optional replaceableoptions/ /optional replaceabledbname/
 benchmarking arguments:
 
 variablelist
+ varlistentry
+  termoption-B/option/term
+  termoption--tpc-b/option/term
+  listitem
+   para
+Built-in TPC-B like test which updates three tables and performs
+an insert on a fourth. See below for details.
+This is the default if no bench is explicitely specified.
+   /para
+  /listitem
+ /varlistentry
+
 
  varlistentry
   termoption-c/option replaceableclients//term
@@ -313,8 +325,7 @@ pgbench optional replaceableoptions/ /optional replaceabledbname/
para
 Read transaction script from replaceablefilename/.
 See below for details.
-option-N/option, option-S/option, and option-f/option
-are mutually exclusive.
+Multiple option-f/ options are allowed.
/para
   /listitem
  /varlistentry
@@ -404,10 +415,10 @@ pgbench optional replaceableoptions/ /optional replaceabledbname/
   termoption--skip-some-updates/option/term
   listitem
para
-Do not update structnamepgbench_tellers/ and
-structnamepgbench_branches/.
-This will avoid update contention on these tables, but
-it makes the test case even less like TPC-B.
+Built-in test which updates only one table compared to option-B/.
+This avoids update contention on structnamepgbench_tellers/
+and structnamepgbench_branches/, but it makes the test case
+even less like TPC-B.
/para
   /listitem
  /varlistentry
@@ -499,9 +510,9 @@ pgbench optional replaceableoptions/ /optional replaceabledbname/
 Report the specified scale factor in applicationpgbench/'s
 output.  With the built-in tests, this is not necessary; the
 correct scale factor will be detected by counting the number of
-rows in the structnamepgbench_branches/ table.  However, when testing
-custom benchmarks (option-f/ option), the scale factor
-will be reported as 1 unless this option is used.
+rows in the structnamepgbench_branches/ table.
+However, when testing only custom benchmarks (option-f/ option),
+the scale factor will be reported as 1 unless this option is used.
/para
   /listitem
  /varlistentry
@@ -511,7 +522,7 @@ pgbench optional replaceableoptions/ /optional replaceabledbname/
   termoption--select-only/option/term
   listitem
para
-Perform select-only transactions instead of TPC-B-like test.
+Built-in test with select-only transactions.
/para
   /listitem
  /varlistentry
@@ -552,6 +563,20 @@ pgbench optional replaceableoptions/ /optional replaceabledbname/
  /varlistentry
 
  varlistentry
+  termoption-w/option/term
+  termoption--weight/option/term
+  listitem
+   para
+Set the integer weight for the next test script
+(option-B/, option-N/, option-S/ or option-f/).
+When several test scripts are used, the relative probability of
+drawing each test is proportional to these weights.
+The default weight is literal1/.
+   /para
+  /listitem
+ /varlistentry
+
+ varlistentry
   termoption--aggregate-interval=replaceableseconds//option/term
   listitem
para
@@

Re: [HACKERS] pgbench stats per script other stuff

2015-07-21 Thread Robert Haas

On Tue, Jul 21, 2015 at 12:29 PM, Fabien COELHO coe...@cri.ensmp.fr wrote:
 Pgbench *currently* already accept multiple -f ... options, and this is a
 good thing to test realistic loads which may intermix several kind of
 transactions, say a lot of readonly and some update or insert, and very rare
 deletes...

Hmm, I didn't realize that.  The code looks a bit inconsistent right
now - e.g. we do support multiple files, but pgbench's options-parsing
loop sets ttype to a value that depends only on the last of -f, -N,
and -S encountered.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] pgbench stats per script other stuff

2015-07-21 Thread Josh Berkus

On 07/21/2015 09:29 AM, Fabien COELHO wrote:
 Maybe -f file.sql:weight (yuk from my point of view, but it can be done
 easily).

Maybe it's past time for pgbench to have a config file?

Given that we want to define some per-workload options, the config file
would probably need to be YAML or JSON, e.g.:

pgbench --config=workload1.pgb

workload1.pgb
-

database: bench
port: 5432
host: localhost
user: josh
clients : 16
threads : 4
response-times : on
stats-level: script
script1:
file: script1.bench
weight: 3
script2:
file: script2.bench
weight: 1

the above would execute a pgbench with 16 clients, 4 threads, script1
three times as often as script2, and report stats at the script (rather
than SQL statement) level.




-- 
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] pgbench stats per script other stuff

2015-07-21 Thread Robert Haas

On Fri, Jul 17, 2015 at 9:50 AM, Fabien coe...@cri.ensmp.fr wrote:
   sh ./pgbench -T 3 -B -N -w 2 -S -w 7 --per-script-stats

That is a truly horrifying abuse of command-line arguments.  -1 from
me, or minus more than one if I've got that many chits to burn.

I have been thinking that the way to do this is to push more into the
script file itself, e.g. allow:

\if random()  0.1
stuff
\else
other stuff
\endif

Maybe that's overkill and there's some way of specifying multiple
scripts on the command line, but IMO what you've got here is not it.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] pgbench stats per script other stuff

2015-07-21 Thread Fabien COELHO



Hello Robert,


On Fri, Jul 17, 2015 at 9:50 AM, Fabien coe...@cri.ensmp.fr wrote:

  sh ./pgbench -T 3 -B -N -w 2 -S -w 7 --per-script-stats


That is a truly horrifying abuse of command-line arguments.  -1 from
me, or minus more than one if I've got that many chits to burn.


Are you against the -w, or against saying that pgbench execute scripts, 
whether internal or from files?


The former is obviously a matter of taste and I can remove -w if nobody 
wants it, too bad because the feature seems useful to me from a testing 
point of view, this is a choice between aesthetic and feature. Note that 
you do not have to use it if you do not like it.


The later really homogeneise the code internally and allows to factor out 
things, to have orthogonal features (internal scripts are treated the same 
way as external files, this requires less lines of code because it is 
simpler), and does not harm anyone IMO, so it would be sad to let it go.



I have been thinking that the way to do this is to push more into the
script file itself, e.g. allow:

\if random()  0.1
stuff
\else
other stuff
\endif

Maybe that's overkill and there's some way of specifying multiple
scripts on the command line, but IMO what you've got here is not it.


I think that is overkill, and moreover it is not useful: the point is to 
collect statistics *per scripts*, with an random if you would not know 
which part was executed, so you would loose the whole point of having per 
script stats.


If you have another suggestion about how to provide weights, which does 
not rely on ifs nor on options? Maybe a special comment in the script (yuk 
from my point of view because the script would carry its weight whereas I 
think this should be orthogonal to the script contents, but it would be 
better than nothing..).


--
Fabien.


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] pgbench stats per script other stuff

2015-07-18 Thread Fabien COELHO



Oops, as usual I forgot something...

This v2 removes old stats code that was put in comment and simplify the 
logic when counting lag times, as they are now taken into account at the 
end of the transaction instead of at the beginning.



This patch adds per-script statistics  other improvements to pgbench

Rationale: Josh asked for the per-script stats:-)

Some restructuring is done so that all stats (-l --aggregate-interval 
--progress --per-script-stats, latency  lag...) share the same structures 
and functions to accumulate data. This limits a lot the growth of pgbench 
from this patch (+17 lines).


In passing, remove the distinction between internal and external scripts.
Pgbench just execute scripts, some of them may be internal...

As a side effect, all scripts can be accumulated pgbench -B -N -S -f ... 
would execute 4 scripts, 3 of which internal (tpc-b, simple-update, 
select-only and another externally supplied one).


Also add a weight option to change the probability of choosing some scripts
when several are available.

Hmmm... Not sure that the --per-script-stats option is really useful. The 
stats could always be shown when several scripts are executed?


 sh ./pgbench -T 3 -B -N -w 2 -S -w 7 --per-script-stats
 starting vacuum...end.
 transaction type: multiple scripts
 scaling factor: 1
 query mode: simple
 number of clients: 1
 number of threads: 1
 duration: 3 s
 number of transactions actually processed: 3192
 latency average: 0.940 ms
 tps = 1063.756045 (including connections establishing)
 tps = 1065.412737 (excluding connections establishing)
 SQL script 0: builtin: TPC-B (sort of)
  - weight is 1
  - 297 transactions (tps = 98.977301)
  - latency average = 3.001 ms
  - latency stddev = 1.320 ms
 SQL script 1: builtin: simple update
  - weight is 2
  - 621 transactions (tps = 206.952539)
  - latency average = 2.506 ms
  - latency stddev = 1.194 ms
 SQL script 2: builtin: select only
  - weight is 7
  - 2274 transactions (tps = 757.826205)
  - latency average = 0.236 ms
  - latency stddev = 0.083 ms




--
Fabien.diff --git a/doc/src/sgml/ref/pgbench.sgml b/doc/src/sgml/ref/pgbench.sgml
index 2517a3a..eb571a8 100644
--- a/doc/src/sgml/ref/pgbench.sgml
+++ b/doc/src/sgml/ref/pgbench.sgml
@@ -261,6 +261,18 @@ pgbench optional replaceableoptions/ /optional replaceabledbname/
 benchmarking arguments:
 
 variablelist
+ varlistentry
+  termoption-B/option/term
+  termoption--tpc-b/option/term
+  listitem
+   para
+Built-in TPC-B like test which updates three tables and performs
+an insert on a fourth. See below for details.
+This is the default if no bench is explicitely specified.
+   /para
+  /listitem
+ /varlistentry
+
 
  varlistentry
   termoption-c/option replaceableclients//term
@@ -313,8 +325,7 @@ pgbench optional replaceableoptions/ /optional replaceabledbname/
para
 Read transaction script from replaceablefilename/.
 See below for details.
-option-N/option, option-S/option, and option-f/option
-are mutually exclusive.
+Multiple option-f/ options are allowed.
/para
   /listitem
  /varlistentry
@@ -404,10 +415,10 @@ pgbench optional replaceableoptions/ /optional replaceabledbname/
   termoption--skip-some-updates/option/term
   listitem
para
-Do not update structnamepgbench_tellers/ and
-structnamepgbench_branches/.
-This will avoid update contention on these tables, but
-it makes the test case even less like TPC-B.
+Built-in test which updates only one table compared to option-B/.
+This avoids update contention on structnamepgbench_tellers/
+and structnamepgbench_branches/, but it makes the test case
+even less like TPC-B.
/para
   /listitem
  /varlistentry
@@ -499,9 +510,9 @@ pgbench optional replaceableoptions/ /optional replaceabledbname/
 Report the specified scale factor in applicationpgbench/'s
 output.  With the built-in tests, this is not necessary; the
 correct scale factor will be detected by counting the number of
-rows in the structnamepgbench_branches/ table.  However, when testing
-custom benchmarks (option-f/ option), the scale factor
-will be reported as 1 unless this option is used.
+rows in the structnamepgbench_branches/ table.
+However, when testing only custom benchmarks (option-f/ option),
+the scale factor will be reported as 1 unless this option is used.
/para
   /listitem
  /varlistentry
@@ -511,7 +522,7 @@ pgbench optional replaceableoptions/ /optional replaceabledbname/
   termoption--select-only/option/term
   listitem
para
-Perform select-only transactions instead of TPC-B-like test.
+Built-in test with select-only transactions.
/para
   /listitem
  /varlistentry
@@ -552,6 +563,20 @@

86 matches

Mail list logo