### Re: [HACKERS] extend pgbench expressions with functions

On Wed, Mar 30, 2016 at 1:23 AM, Fabien COELHO wrote:
>
> Hello Robert,
>
>>> If we don't nuke it, it'll never die.
>>
>> Hearing no objections, BOOM.
>
> FIZZ! :-)
>
> Thanks for the commits, and apology for the portability bugs.

Thanks for the additions, Fabien. This has been a long trip.

--
Michael

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

### Re: [HACKERS] extend pgbench expressions with functions

Hello Robert,

>> If we don't nuke it, it'll never die.
>
> Hearing no objections, BOOM.

FIZZ! :-)

Thanks for the commits, and apology for the portability bugs.

--
Fabien.

### Re: [HACKERS] extend pgbench expressions with functions

On Mon, Mar 28, 2016 at 9:36 PM, Stephen Frost wrote:
> * Michael Paquier (michael.paqu...@gmail.com) wrote:
>> On Tue, Mar 29, 2016 at 9:54 AM, Robert Haas wrote:
>>> On Sun, Mar 27, 2016 at 5:01 AM, Fabien COELHO wrote:
>>>> v40 is yet another rebase.
>>>
>>> Thanks. Committed after removing an unnecessary parameter from the
>>> coerceToXXX() functions.
>>>
>>> I guess the question here is whether we want the part-c patch, which
>>> removes the historical \setrandom syntax. That's technically no longer
>>> needed since we now can use random() as a function directly inside
>>> \set commands, but we might want it anyway for backward compatibility.
>>>
>>> Anybody have an opinion on that?
>>
>> +1 for nuking it. That's not worth the trouble maintaining it.
>
> If we don't nuke it, it'll never die.
>
> See also: pg_shadow

Hearing no objections, BOOM.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

### Re: [HACKERS] extend pgbench expressions with functions

> I guess the question here is whether we want the part-c patch, which
> removes the historical \setrandom syntax. That's technically no longer
> needed since we now can use random() as a function directly inside \set
> commands, but we might want it anyway for backward compatibility.

This patch is indeed a proposal.

> Anybody have an opinion on that?

>> +1 for nuking it. That's not worth the trouble maintaining it.

I share Michaël's opinion. Some arguments for removing it:

- \setrandom is syntactically inhomogeneous in the overall syntax, and is
  now redundant
- switching to the \set syntax is pretty easy, see attached script
- custom scripts are short, they are used by few but advanced users, for
  whom updating would not be an issue
- the parsing & execution code is lengthy, repetitive...

--
Fabien.

Attached script:

```perl
#! /usr/bin/perl -wp
s/^\\setrandom\s+(\S+)\s+(\S+)\s+(\S+)\s+(exponential|gaussian)\s+(\S+)/\\set $1 random_$4($2, $3, $5)/;
s/^\\setrandom\s+(\S+)\s+(\S+)\s+(\S+)(\s+uniform)?/\\set $1 random($2, $3)/;
```
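For illustration, the same rewrite can be sketched in Python; `convert_setrandom` is a hypothetical helper name, not part of pgbench, and the two substitutions mirror the attached perl one-liner:

```python
import re

def convert_setrandom(line):
    """Rewrite a historical \\setrandom line to the \\set function syntax."""
    # \setrandom var min max (exponential|gaussian) param
    #     -> \set var random_<dist>(min, max, param)
    line = re.sub(r'^\\setrandom\s+(\S+)\s+(\S+)\s+(\S+)\s+(exponential|gaussian)\s+(\S+)',
                  r'\\set \1 random_\4(\2, \3, \5)', line)
    # \setrandom var min max [uniform] -> \set var random(min, max)
    line = re.sub(r'^\\setrandom\s+(\S+)\s+(\S+)\s+(\S+)(\s+uniform)?',
                  r'\\set \1 random(\2, \3)', line)
    return line

print(convert_setrandom(r'\setrandom delta -5000 5000'))
# -> \set delta random(-5000, 5000)
```

Ordering matters: the exponential/gaussian pattern must be tried first, since the plain pattern would otherwise match the first three arguments of a five-argument line.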

### Re: [HACKERS] extend pgbench expressions with functions

* Michael Paquier (michael.paqu...@gmail.com) wrote:
> On Tue, Mar 29, 2016 at 9:54 AM, Robert Haas wrote:
>> On Sun, Mar 27, 2016 at 5:01 AM, Fabien COELHO wrote:
>>> v40 is yet another rebase.
>>
>> Thanks. Committed after removing an unnecessary parameter from the
>> coerceToXXX() functions.
>>
>> I guess the question here is whether we want the part-c patch, which
>> removes the historical \setrandom syntax. That's technically no longer
>> needed since we now can use random() as a function directly inside \set
>> commands, but we might want it anyway for backward compatibility.
>>
>> Anybody have an opinion on that?
>
> +1 for nuking it. That's not worth the trouble maintaining it.

If we don't nuke it, it'll never die.

See also: pg_shadow

Thanks!

Stephen

### Re: [HACKERS] extend pgbench expressions with functions

On Tue, Mar 29, 2016 at 9:54 AM, Robert Haas wrote:
> On Sun, Mar 27, 2016 at 5:01 AM, Fabien COELHO wrote:
>> v40 is yet another rebase.
>
> Thanks. Committed after removing an unnecessary parameter from the
> coerceToXXX() functions.
>
> I guess the question here is whether we want the part-c patch, which
> removes the historical \setrandom syntax. That's technically no longer
> needed since we now can use random() as a function directly inside \set
> commands, but we might want it anyway for backward compatibility.
>
> Anybody have an opinion on that?

+1 for nuking it. That's not worth the trouble maintaining it.

--
Michael

### Re: [HACKERS] extend pgbench expressions with functions

On Sun, Mar 27, 2016 at 5:01 AM, Fabien COELHO wrote:
> v40 is yet another rebase.

Thanks. Committed after removing an unnecessary parameter from the
coerceToXXX() functions.

I guess the question here is whether we want the part-c patch, which
removes the historical \setrandom syntax. That's technically no longer
needed since we now can use random() as a function directly inside \set
commands, but we might want it anyway for backward compatibility.

Anybody have an opinion on that?

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

### Re: [HACKERS] extend pgbench expressions with functions

v40 is yet another rebase.

--
Fabien.

[Attachment: patch against doc/src/sgml/ref/pgbench.sgml, truncated in the
archive. The documentation changes let \set compute integer or double
values, update the example to
`\set aid (1021 * random(1, 10 * :scale)) % (10 * :scale) + 1`, and replace
the inline \setrandom distribution descriptions with equivalences to the
new functions:

    \setrandom n 1 10 [uniform]        =>  \set n random(1, 10)
    \setrandom n 1 10 exponential 3.0  =>  \set n random_exponential(1, 10, 3.0)
    \setrandom n 1 10 gaussian 2.0     =>  \set n random_gaussian(1, 10, 2.0)

The detailed formulas for the gaussian case (Box-Muller based, truncated at
±parameter, minimum parameter 2.0) and the exponential case
(f(x) = exp(-parameter * (x - min) / (max - min + 1)) / (1 - exp(-parameter)),
parameter strictly positive) are deferred to the per-function documentation,
and the \setrandom-based TPC-B-like example script is removed from this
section.]
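The exponential formula cited in the documentation above can be exercised with a small sketch. This is an illustrative Python rendering of a truncated exponential draw via inverse-transform sampling, not the actual pgbench C implementation:

```python
import math
import random

def random_exponential(lo, hi, parameter):
    """Draw an integer in [lo, hi]; values near lo are drawn most often.

    Inverse-transform sampling of the truncated exponential described by
    f(x) = exp(-parameter * (x - lo) / (hi - lo + 1)) / (1 - exp(-parameter)).
    """
    assert parameter > 0.0, "parameter value must be strictly positive"
    cut = math.exp(-parameter)
    # draw u uniformly in (cut, 1] so that -log(u)/parameter stays in [0, 1)
    u = cut + (1.0 - cut) * (1.0 - random.random())
    return lo + int((hi - lo + 1) * (-math.log(u) / parameter))
```

Truncating the tail at `cut = exp(-parameter)` is what keeps the projected integer inside the bounds; the larger the parameter, the more the draws pile up near `lo`.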

### Re: [HACKERS] extend pgbench expressions with functions

Bonjour Michaël,

>> v39 is yet another rebase: 42 is in sight!
>
> What's up with v42? Is that your personal record?

It is just a (geek) joke, see:
https://en.wikipedia.org/wiki/42_%28number%29#Hitchhiker.27s_Guide_to_the_Galaxy

It is the answer to the "Ultimate Question of Life, The Universe and
Everything", computed by the supercomputer "Deep Thought" in 7.5 million
years.

--
Fabien.

### Re: [HACKERS] extend pgbench expressions with functions

On Mon, Mar 21, 2016 at 6:56 AM, Fabien COELHO wrote:
>> v38 is a simple rebase, trying to keep up-to-date with Tom's work.
>
> v39 is yet another rebase: 42 is in sight!

What's up with v42? Is that your personal record?

--
Michael

### Re: [HACKERS] extend pgbench expressions with functions

> v38 is a simple rebase, trying to keep up-to-date with Tom's work.

v39 is yet another rebase: 42 is in sight!

--
Fabien.

[Attachment: the rebased doc/src/sgml/ref/pgbench.sgml patch, essentially
identical to the one attached to the v40 message above; truncated in the
archive.]

### Re: [HACKERS] extend pgbench expressions with functions

v38 is a simple rebase, trying to keep up-to-date with Tom's work.

--
Fabien.

[Attachment: the rebased doc/src/sgml/ref/pgbench.sgml patch, essentially
identical to the one attached to the v40 message above; truncated in the
archive.]

### Re: [HACKERS] extend pgbench expressions with functions

> Here is a v36 which inspects very carefully the string to decide whether
> it is an int or a double. You may, or may not, find it to your taste, I
> can't say.

Here is a v37, which is mostly a rebase after recent changes. Also, I
noticed that I was double-counting errors in the previous version, so I
fixed it.

--
Fabien.

[Attachment: the doc/src/sgml/ref/pgbench.sgml patch, same series as the
v40 attachment above; truncated in the archive.]
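The gaussian description in the patched documentation (a Box-Muller draw truncated at ±parameter, with a 2.0 minimum for rejection performance) can likewise be sketched. This is an illustrative Python rendering, not the pgbench C implementation:

```python
import math
import random

def random_gaussian(lo, hi, parameter):
    """Draw an integer in [lo, hi]; values near the middle are most frequent.

    Box-Muller transform, re-drawn until the deviate falls within
    [-parameter, parameter), then mapped onto the integer interval.
    """
    assert parameter >= 2.0, "minimum parameter is 2.0 for rejection performance"
    while True:
        r1 = 1.0 - random.random()   # in (0, 1], safe argument for log()
        r2 = 1.0 - random.random()
        stdev = math.sqrt(-2.0 * math.log(r1)) * math.sin(2.0 * math.pi * r2)
        if -parameter <= stdev < parameter:
            break
    # shift/scale the truncated deviate into [0, 1), then project onto [lo, hi]
    return lo + int((hi - lo + 1) * (stdev + parameter) / (2.0 * parameter))
```

The documentation's 2.0 minimum makes sense here: the smaller the parameter, the more Box-Muller deviates fall outside the truncation window and get rejected.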

### Re: [HACKERS] extend pgbench expressions with functions

Hello Robert,

> [...] With your patch, you get different behavior depending on exactly
> how the input is malformed.

I understand that you require only one possible error message on malformed
input, instead of failing when converting to double if the input looked
like a double (there was a '.' clue in it, but that is not proof), or
something else if it was assumed to be an int. So I'm going to assume that
you do not like the type guessing.

> I'm not going to commit it this way, and frankly, neither is anyone else.

Hmmm. Probably my unconscious self is trying to reach 42.

Here is a v36 which inspects very carefully the string to decide whether
it is an int or a double. You may, or may not, find it to your taste, I
can't say.

--
Fabien.

[Attachment: the doc/src/sgml/ref/pgbench.sgml patch, same series as the
v40 attachment above; truncated in the archive.]

### Re: [HACKERS] extend pgbench expressions with functions

On Wed, Mar 9, 2016 at 3:09 AM, Fabien COELHO wrote:
> I'm not sure what is "not acceptable" as it "totally breaks the error
> handling" in the above code.
>
> I assumed that you want to check that sscanf can read what sprintf
> generated when handling "\set". I'd guess that libc would be broken if
> it was the case. I've made pgbench check that it is not.

If the variable contains something that is not an integer, the existing
code will end up here:

    fprintf(stderr, "invalid input syntax for integer: \"%s\"\n", str);

With your patch, you get different behavior depending on exactly how the
input is malformed. I have a strong suspicion you're going to tell me that
this is another one of those cases where it's OK to make the error handling
worse than it is today, but I'm tired of arguing that point. I'm not going
to commit it this way, and frankly, neither is anyone else.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

### Re: [HACKERS] extend pgbench expressions with functions

Hello Robert, Here is a v35 b & c. This is not acceptable: + /* guess double type (n for "inf", "-inf" and "nan") */ + if (strchr(var, '.') != NULL || strchr(var, 'n') != NULL) + { + double dv; + sscanf(var, "%lf", ); + setDoubleValue(retval, dv); + } + else + setIntValue(retval, strtoint64(var)); That totally breaks the error handling which someone carefully established here. I'm not sure what is "not acceptable" as it "totally breaks the error handling" in the above code. I assumed that you want to check that sscanf can read what sprintf generated when handling "\set". I'd guess that libc would be broken if it was the case. I've made pgbench check that it is not. If it was something else, you have to spell it out for me. Extra space. [...] Another extra space. Indeed. - int nargs = 0; - int64 iargs[MAX_FARGS]; - PgBenchExprLink *l = args; + int nargs = 0; + PgBenchValuevargs[MAX_FARGS]; + PgBenchExprLink*l = args; Completely unnecessary reindentation of the first and third lines. It just looked better to me. + setIntValue(retval, i < 0? -i: i); Not project style. Indeed. Please fix the whitespace damage git diff --check complains about, "git diff -check" does not seem to complain on the full v35-b. and check for other places where you haven't followed project style. I've found some more instances of "& ref". -- Fabien.diff --git a/doc/src/sgml/ref/pgbench.sgml b/doc/src/sgml/ref/pgbench.sgml index cc80b3f..2133bf7 100644 --- a/doc/src/sgml/ref/pgbench.sgml +++ b/doc/src/sgml/ref/pgbench.sgml @@ -794,9 +794,10 @@ pgbench options dbname - Sets variable varname to an integer value calculated + Sets variable varname to a value calculated from expression. 
The expression may contain integer constants such as 5432, + double constants such as 3.14159, references to variables :variablename, and expressions composed of unary (-) or binary operators (+, -, *, /, @@ -809,7 +810,7 @@ pgbench options dbname Examples: \set ntellers 10 * :scale -\set aid (1021 * :aid) % (10 * :scale) + 1 +\set aid (1021 * random(1, 10 * :scale)) % (10 * :scale) + 1 @@ -829,66 +830,35 @@ pgbench options dbname - By default, or when uniform is specified, all values in the - range are drawn with equal probability. Specifying gaussian - or exponential options modifies this behavior; each - requires a mandatory parameter which determines the precise shape of the - distribution. - + + + + \setrandom n 1 10 or \setrandom n 1 10 uniform + is equivalent to \set n random(1, 10) and uses a uniform + distribution. + + - - For a Gaussian distribution, the interval is mapped onto a standard - normal distribution (the classical bell-shaped Gaussian curve) truncated - at -parameter on the left and +parameter - on the right. - Values in the middle of the interval are more likely to be drawn. - To be precise, if PHI(x) is the cumulative distribution - function of the standard normal distribution, with mean mu - defined as (max + min) / 2.0, with - - f(x) = PHI(2.0 * parameter * (x - mu) / (max - min + 1)) / -(2.0 * PHI(parameter) - 1.0) - - then value i between min and - max inclusive is drawn with probability: - f(i + 0.5) - f(i - 0.5). - Intuitively, the larger parameter, the more - frequently values close to the middle of the interval are drawn, and the - less frequently values close to the min and - max bounds. About 67% of values are drawn from the - middle 1.0 / parameter, that is a relative - 0.5 / parameter around the mean, and 95% in the middle - 2.0 / parameter, that is a relative - 1.0 / parameter around the mean; for instance, if - parameter is 4.0, 67% of values are drawn from the - middle quarter (1.0 / 4.0) of the interval (i.e. 
from - 3.0 / 8.0 to 5.0 / 8.0) and 95% from - the middle half (2.0 / 4.0) of the interval (second and - third quartiles). The minimum parameter is 2.0 for - performance of the Box-Muller transform. - + + +\setrandom n 1 10 exponential 3.0 is equivalent to +\set n random_exponential(1, 10, 3.0) and uses an +exponential distribution. + + - - For an exponential distribution, parameter - controls the distribution
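The type-guessing hunk quoted in this exchange can be written with full-string validation so that malformed input is reported instead of silently half-parsed. A hedged sketch of that idea (a hypothetical helper, not the committed pgbench code):

```c
#include <stdlib.h>

typedef enum { GUESS_INT, GUESS_DOUBLE, GUESS_ERROR } GuessedType;

/* Guess a variable's type with explicit error reporting instead of an
 * unchecked sscanf: try a full-string integer parse first, then fall
 * back to a full-string double parse. */
static GuessedType
guess_variable_type(const char *var, long long *ival, double *dval)
{
    char       *end;

    *ival = strtoll(var, &end, 10);
    if (end != var && *end == '\0')
        return GUESS_INT;

    *dval = strtod(var, &end);
    if (end != var && *end == '\0')
        return GUESS_DOUBLE;    /* strtod also accepts "inf", "-inf", "nan" */

    return GUESS_ERROR;         /* caller can emit a proper syntax error */
}
```

Because both parses insist on consuming the whole string, every malformed input falls through to a single, uniform error path.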

### Re: [HACKERS] extend pgbench expressions with functions

On Tue, Mar 8, 2016 at 3:52 PM, Fabien COELHO wrote:
> [ new patch and assorted color commentary ]

This is not acceptable:

    + /* guess double type (n for "inf", "-inf" and "nan") */
    + if (strchr(var, '.') != NULL || strchr(var, 'n') != NULL)
    + {
    + double dv;
    + sscanf(var, "%lf", &dv);
    + setDoubleValue(retval, dv);
    + }
    + else
    + setIntValue(retval, strtoint64(var));

That totally breaks the error handling which someone carefully established here.

    + PgBenchValue *varg = & vargs[0];

Extra space.

    + if (!coerceToDouble(st, & vargs[0], ))
    + return false;

Another extra space.

    - int nargs = 0;
    - int64 iargs[MAX_FARGS];
    - PgBenchExprLink *l = args;
    + int nargs = 0;
    + PgBenchValue vargs[MAX_FARGS];
    + PgBenchExprLink *l = args;

Completely unnecessary reindentation of the first and third lines.

    + setIntValue(retval, i < 0? -i: i);

Not project style.

Please fix the whitespace damage git diff --check complains about, and check for other places where you haven't followed project style.

-- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company

### Re: [HACKERS] extend pgbench expressions with functions

Hello Robert. Here is a v34 b & c.

> // comments are not allowed. I'd just remove the two you have.

Back to the eighties!

> It makes no sense to exit(1) and then return 0, so don't do that. I might write this code as: [...] This would get rid of the internal-error case here altogether in favor of testing it via an assertion.

I've put assertions instead of exit in some places.

> I think that coerceToInt() should not exit(1) when an overflow occurs;

I think that it should, because the only sane option for the user is to fix the script and relaunch the bench: counting errors has no added value for the user. The attached version does some error handling instead, too bad.

> Now, if rval is out of range of an integer, that is going to overflow while trying to see whether it should divide by zero.

I could not find a place where there was such a potential issue. If the value is zero, it cannot overflow when cast to int. If it is not zero but it overflows, then it is an overflow, so it should overflow. Maybe I misunderstood your point.

-- Fabien.

diff --git a/doc/src/sgml/ref/pgbench.sgml b/doc/src/sgml/ref/pgbench.sgml index cc80b3f..2133bf7 100644 --- a/doc/src/sgml/ref/pgbench.sgml +++ b/doc/src/sgml/ref/pgbench.sgml @@ -794,9 +794,10 @@ pgbench options dbname - Sets variable varname to an integer value calculated + Sets variable varname to a value calculated from expression. The expression may contain integer constants such as 5432, + double constants such as 3.14159, references to variables :variablename, and expressions composed of unary (-) or binary operators (+, -, *, /, @@ -809,7 +810,7 @@ pgbench options dbname Examples: \set ntellers 10 * :scale -\set aid (1021 * :aid) % (10 * :scale) + 1 +\set aid (1021 * random(1, 10 * :scale)) % (10 * :scale) + 1 @@ -829,66 +830,35 @@ pgbench options dbname - By default, or when uniform is specified, all values in the - range are drawn with equal probability.
Specifying gaussian - or exponential options modifies this behavior; each - requires a mandatory parameter which determines the precise shape of the - distribution. - + + + + \setrandom n 1 10 or \setrandom n 1 10 uniform + is equivalent to \set n random(1, 10) and uses a uniform + distribution. + + - - For a Gaussian distribution, the interval is mapped onto a standard - normal distribution (the classical bell-shaped Gaussian curve) truncated - at -parameter on the left and +parameter - on the right. - Values in the middle of the interval are more likely to be drawn. - To be precise, if PHI(x) is the cumulative distribution - function of the standard normal distribution, with mean mu - defined as (max + min) / 2.0, with - - f(x) = PHI(2.0 * parameter * (x - mu) / (max - min + 1)) / -(2.0 * PHI(parameter) - 1.0) - - then value i between min and - max inclusive is drawn with probability: - f(i + 0.5) - f(i - 0.5). - Intuitively, the larger parameter, the more - frequently values close to the middle of the interval are drawn, and the - less frequently values close to the min and - max bounds. About 67% of values are drawn from the - middle 1.0 / parameter, that is a relative - 0.5 / parameter around the mean, and 95% in the middle - 2.0 / parameter, that is a relative - 1.0 / parameter around the mean; for instance, if - parameter is 4.0, 67% of values are drawn from the - middle quarter (1.0 / 4.0) of the interval (i.e. from - 3.0 / 8.0 to 5.0 / 8.0) and 95% from - the middle half (2.0 / 4.0) of the interval (second and - third quartiles). The minimum parameter is 2.0 for - performance of the Box-Muller transform. - + + +\setrandom n 1 10 exponential 3.0 is equivalent to +\set n random_exponential(1, 10, 3.0) and uses an +exponential distribution. + + - - For an exponential distribution, parameter - controls the distribution by truncating a quickly-decreasing - exponential distribution at parameter, and then - projecting onto integers between the bounds. 
- To be precise, with - -f(x) = exp(-parameter * (x - min) / (max - min + 1)) / (1.0 - exp(-parameter)) - - Then value i between min and - max inclusive is drawn with probability: - f(x) - f(x + 1). - Intuitively, the larger parameter, the more - frequently values close to min are accessed, and the - less frequently values close to max are accessed. - The closer to 0 parameter, the flatter (more uniform) - the access distribution. - A crude approximation of the distribution is that the most frequent 1% - values in the range, close to min, are drawn - parameter%

### Re: [HACKERS] extend pgbench expressions with functions

On Tue, Mar 8, 2016 at 11:48 AM, Fabien COELHO wrote:
> If this is a blocker I'll do them, but I'm convinced it is not what should
> be done.

Well, I think it's a blocker. Exiting within deeply-nested code instead of propagating the error upward does not strike me as a good practice.

-- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
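The practice Robert is asking for — propagating errors upward rather than calling exit() deep in the evaluator — can be sketched as follows. The names are illustrative only, not the actual pgbench evaluator:

```c
#include <stdbool.h>
#include <stdio.h>

/* Each level returns false on error; only the top level decides
 * whether to abort the client, count the failure, etc. */
static bool
eval_leaf(int denom, int *result)
{
    if (denom == 0)
    {
        fprintf(stderr, "division by zero\n");
        return false;           /* report, don't exit() */
    }
    *result = 100 / denom;
    return true;
}

static bool
eval_expression(int denom, int *result)
{
    /* the error bubbles up through every nesting level unchanged */
    return eval_leaf(denom, result);
}
```

The cost Fabien objects to is visible here too: every call site has to check and forward the boolean, instead of writing a plain expression.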

### Re: [HACKERS] extend pgbench expressions with functions

Hello Robert,

> Having a look at 33-b, this looks pretty good now, but:
> // comments are not allowed. I'd just remove the two you have.

Oops, C89 did not make it everywhere yet!

> It makes no sense to exit(1) and then return 0, so don't do that. [...] This would get rid of the internal-error case here altogether in favor of testing it via an assertion.

Ok. Fine with me.

> I think that coerceToInt() should not exit(1) when an overflow occurs; instead, I think the function should be declared as bool coerceToInt(PgBenchValue *pval, int64 *result), and the error return should be propagated back to the caller, which can then return false as we do for other error cases.

This is the pain (ugly repeated code) that I wish to avoid, because then we cannot write a simple addition for doing an addition, but have to do several ugly tests instead. Ok, elegance is probably not a sufficient argument against doing that. Moreover, I do not see any condition in which doing what you suggest makes much sense from the user perspective: if there is an internal error in the bench code it seems much more logical to ask the user for a sensible bench script, because I would not know how to interpret tps on a bench with internal failures in the client code anyway. For me, exiting immediately on internal script errors is the right action. If this is a blocker I'll do them, but I'm convinced it is not what should be done.

> I think that some of the places you've used coerceToInt() are not appropriate. Like, for example:
>
>     + if (coerceToInt(rval) == 0)
>     + {
>     + fprintf(stderr, "division by zero\n");
>     + return false;
>     + }
>
> Now, if rval is out of range of an integer, that is going to overflow while trying to see whether it should divide by zero. Please work a little harder here and in similar cases.

Ok.

> Maybe add a helper function checkIntegerEquality(PgBenchValue *, int64).

Maybe.

-- Fabien.

### Re: [HACKERS] extend pgbench expressions with functions

On Mon, Mar 7, 2016 at 4:58 AM, Fabien COELHO wrote:
>> - 32-b: add double functions, including double variables
>> - 32-c: remove \setrandom support (advice to use \set + random instead)
>
> Here is a rebased version after Tom's updates, 33-b & 33-c. I also extended
> the floating point syntax to accept signed exponents, and changed the
> regexpr style to match Tom's changes.

Having a look at 33-b, this looks pretty good now, but:

// comments are not allowed. I'd just remove the two you have.

It makes no sense to exit(1) and then return 0, so don't do that. I might write this code as:

    if (pval->type == PGBT_INT)
        return pval->u.ival;
    Assert(pval->type == PGBT_DOUBLE);
    /* do double stuff */

This would get rid of the internal-error case here altogether in favor of testing it via an assertion.

I think that coerceToInt() should not exit(1) when an overflow occurs; instead, I think the function should be declared as bool coerceToInt(PgBenchValue *pval, int64 *result), and the error return should be propagated back to the caller, which can then return false as we do for other error cases.

I think that some of the places you've used coerceToInt() are not appropriate. Like, for example:

    + if (coerceToInt(rval) == 0)
    + {
    + fprintf(stderr, "division by zero\n");
    + return false;
    + }

Now, if rval is out of range of an integer, that is going to overflow while trying to see whether it should divide by zero. Please work a little harder here and in similar cases. Maybe add a helper function checkIntegerEquality(PgBenchValue *, int64).

-- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
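The API shape Robert suggests — coercion that reports failure to the caller instead of exiting — can be sketched like this. PgBenchValue is reduced to the two cases under discussion; this is a hedged illustration, not the struct or function as committed:

```c
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>

typedef enum { PGBT_INT, PGBT_DOUBLE } PgBenchValueType;

typedef struct
{
    PgBenchValueType type;
    union
    {
        int64_t     ival;
        double      dval;
    }           u;
} PgBenchValue;

/* Return false on overflow instead of exit(1), so the error can be
 * propagated up through the expression evaluator. */
static bool
coerceToInt(const PgBenchValue *pval, int64_t *result)
{
    if (pval->type == PGBT_INT)
    {
        *result = pval->u.ival;
        return true;
    }
    /* reject doubles outside the int64 range instead of overflowing */
    if (pval->u.dval < (double) INT64_MIN || pval->u.dval >= (double) INT64_MAX)
    {
        fprintf(stderr, "double value out of int64 range\n");
        return false;
    }
    *result = (int64_t) pval->u.dval;
    return true;
}
```

A divisor check can then test the value before coercing it, so an out-of-range double is reported as such rather than overflowing while testing for zero.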

### Re: [HACKERS] extend pgbench expressions with functions

> - 32-b: add double functions, including double variables
> - 32-c: remove \setrandom support (advice to use \set + random instead)

Here is a rebased version after Tom's updates, 33-b & 33-c. I also extended the floating point syntax to accept signed exponents, and changed the regexpr style to match Tom's changes.

-- Fabien.

diff --git a/doc/src/sgml/ref/pgbench.sgml b/doc/src/sgml/ref/pgbench.sgml index cc80b3f..2133bf7 100644 --- a/doc/src/sgml/ref/pgbench.sgml +++ b/doc/src/sgml/ref/pgbench.sgml @@ -794,9 +794,10 @@ pgbench options dbname - Sets variable varname to an integer value calculated + Sets variable varname to a value calculated from expression. The expression may contain integer constants such as 5432, + double constants such as 3.14159, references to variables :variablename, and expressions composed of unary (-) or binary operators (+, -, *, /, @@ -809,7 +810,7 @@ pgbench options dbname Examples: \set ntellers 10 * :scale -\set aid (1021 * :aid) % (10 * :scale) + 1 +\set aid (1021 * random(1, 10 * :scale)) % (10 * :scale) + 1 @@ -829,66 +830,35 @@ pgbench options dbname - By default, or when uniform is specified, all values in the - range are drawn with equal probability. Specifying gaussian - or exponential options modifies this behavior; each - requires a mandatory parameter which determines the precise shape of the - distribution. - + + + + \setrandom n 1 10 or \setrandom n 1 10 uniform + is equivalent to \set n random(1, 10) and uses a uniform + distribution. + + - - For a Gaussian distribution, the interval is mapped onto a standard - normal distribution (the classical bell-shaped Gaussian curve) truncated - at -parameter on the left and +parameter - on the right. - Values in the middle of the interval are more likely to be drawn.
- To be precise, if PHI(x) is the cumulative distribution - function of the standard normal distribution, with mean mu - defined as (max + min) / 2.0, with - - f(x) = PHI(2.0 * parameter * (x - mu) / (max - min + 1)) / -(2.0 * PHI(parameter) - 1.0) - - then value i between min and - max inclusive is drawn with probability: - f(i + 0.5) - f(i - 0.5). - Intuitively, the larger parameter, the more - frequently values close to the middle of the interval are drawn, and the - less frequently values close to the min and - max bounds. About 67% of values are drawn from the - middle 1.0 / parameter, that is a relative - 0.5 / parameter around the mean, and 95% in the middle - 2.0 / parameter, that is a relative - 1.0 / parameter around the mean; for instance, if - parameter is 4.0, 67% of values are drawn from the - middle quarter (1.0 / 4.0) of the interval (i.e. from - 3.0 / 8.0 to 5.0 / 8.0) and 95% from - the middle half (2.0 / 4.0) of the interval (second and - third quartiles). The minimum parameter is 2.0 for - performance of the Box-Muller transform. - + + +\setrandom n 1 10 exponential 3.0 is equivalent to +\set n random_exponential(1, 10, 3.0) and uses an +exponential distribution. + + - - For an exponential distribution, parameter - controls the distribution by truncating a quickly-decreasing - exponential distribution at parameter, and then - projecting onto integers between the bounds. - To be precise, with - -f(x) = exp(-parameter * (x - min) / (max - min + 1)) / (1.0 - exp(-parameter)) - - Then value i between min and - max inclusive is drawn with probability: - f(x) - f(x + 1). - Intuitively, the larger parameter, the more - frequently values close to min are accessed, and the - less frequently values close to max are accessed. - The closer to 0 parameter, the flatter (more uniform) - the access distribution. - A crude approximation of the distribution is that the most frequent 1% - values in the range, close to min, are drawn - parameter% of the time. 
- parameter value must be strictly positive. + + +\setrandom n 1 10 gaussian 2.0 is equivalent to +\set n random_gaussian(1, 10, 2.0), and uses a gaussian +distribution. + + + + + See the documentation of these functions below for further information + about the precise shape of these distributions, depending on the value + of the parameter. @@ -967,34 +937,6 @@ f(x) = exp(-parameter * (x - min) / (max - min + 1)) / (1.0 - exp(-parameter)) - - - As an example, the full definition of the built-in TPC-B-like - transaction is: - - -\set nbranches :scale -\set ntellers 10 * :scale -\set naccounts

### Re: [HACKERS] extend pgbench expressions with functions

>>> Attached is the fixed patch for the array method.
>>
>> Committed with a few tweaks, including running pgindent over some of it.
>
> Thanks. So the first set of functions is in, and the operators are executed
> as functions as well. Fabien, are you planning to send rebased versions of
> the rest? By that I mean the switch of the existing subcommands into
> equivalent functions and the handling of double values as parameters for
> those functions.

Here they are:

- 32-b: add double functions, including double variables
- 32-c: remove \setrandom support (advice to use \set + random instead)

-- Fabien.

diff --git a/doc/src/sgml/ref/pgbench.sgml b/doc/src/sgml/ref/pgbench.sgml index cc80b3f..2133bf7 100644 --- a/doc/src/sgml/ref/pgbench.sgml +++ b/doc/src/sgml/ref/pgbench.sgml @@ -794,9 +794,10 @@ pgbench options dbname - Sets variable varname to an integer value calculated + Sets variable varname to a value calculated from expression. The expression may contain integer constants such as 5432, + double constants such as 3.14159, references to variables :variablename, and expressions composed of unary (-) or binary operators (+, -, *, /, @@ -809,7 +810,7 @@ pgbench options dbname Examples: \set ntellers 10 * :scale -\set aid (1021 * :aid) % (10 * :scale) + 1 +\set aid (1021 * random(1, 10 * :scale)) % (10 * :scale) + 1 @@ -829,66 +830,35 @@ pgbench options dbname - By default, or when uniform is specified, all values in the - range are drawn with equal probability. Specifying gaussian - or exponential options modifies this behavior; each - requires a mandatory parameter which determines the precise shape of the - distribution. - + + + + \setrandom n 1 10 or \setrandom n 1 10 uniform + is equivalent to \set n random(1, 10) and uses a uniform + distribution. + + - - For a Gaussian distribution, the interval is mapped onto a standard - normal distribution (the classical bell-shaped Gaussian curve) truncated - at -parameter on the left and +parameter - on the right.
- Values in the middle of the interval are more likely to be drawn. - To be precise, if PHI(x) is the cumulative distribution - function of the standard normal distribution, with mean mu - defined as (max + min) / 2.0, with - - f(x) = PHI(2.0 * parameter * (x - mu) / (max - min + 1)) / -(2.0 * PHI(parameter) - 1.0) - - then value i between min and - max inclusive is drawn with probability: - f(i + 0.5) - f(i - 0.5). - Intuitively, the larger parameter, the more - frequently values close to the middle of the interval are drawn, and the - less frequently values close to the min and - max bounds. About 67% of values are drawn from the - middle 1.0 / parameter, that is a relative - 0.5 / parameter around the mean, and 95% in the middle - 2.0 / parameter, that is a relative - 1.0 / parameter around the mean; for instance, if - parameter is 4.0, 67% of values are drawn from the - middle quarter (1.0 / 4.0) of the interval (i.e. from - 3.0 / 8.0 to 5.0 / 8.0) and 95% from - the middle half (2.0 / 4.0) of the interval (second and - third quartiles). The minimum parameter is 2.0 for - performance of the Box-Muller transform. - + + +\setrandom n 1 10 exponential 3.0 is equivalent to +\set n random_exponential(1, 10, 3.0) and uses an +exponential distribution. + + - - For an exponential distribution, parameter - controls the distribution by truncating a quickly-decreasing - exponential distribution at parameter, and then - projecting onto integers between the bounds. - To be precise, with - -f(x) = exp(-parameter * (x - min) / (max - min + 1)) / (1.0 - exp(-parameter)) - - Then value i between min and - max inclusive is drawn with probability: - f(x) - f(x + 1). - Intuitively, the larger parameter, the more - frequently values close to min are accessed, and the - less frequently values close to max are accessed. - The closer to 0 parameter, the flatter (more uniform) - the access distribution. 
- A crude approximation of the distribution is that the most frequent 1% - values in the range, close to min, are drawn - parameter% of the time. - parameter value must be strictly positive. + + +\setrandom n 1 10 gaussian 2.0 is equivalent to +\set n random_gaussian(1, 10, 2.0), and uses a gaussian +distribution. + + + + + See the documentation of these functions below for further information + about the precise shape of these distributions, depending on the value + of the parameter. @@ -967,34 +937,6

### Re: [HACKERS] extend pgbench expressions with functions

> Committed with a few tweaks, including running pgindent over some of it.

Thanks!

-- Fabien.

### Re: [HACKERS] extend pgbench expressions with functions

On Wed, Mar 2, 2016 at 3:10 AM, Robert Haas wrote:
> On Wed, Feb 17, 2016 at 3:22 AM, Fabien COELHO wrote:
>> Indeed. My gcc 4.8.4 with -Wall does not show the warning, too bad.
>>
>> Attached is the fixed patch for the array method.
>
> Committed with a few tweaks, including running pgindent over some of it.

Thanks. So the first set of functions is in, and the operators are executed as functions as well. Fabien, are you planning to send rebased versions of the rest? By that I mean the switch of the existing subcommands into equivalent functions and the handling of double values as parameters for those functions.

-- Michael

### Re: [HACKERS] extend pgbench expressions with functions

On Wed, Feb 17, 2016 at 3:22 AM, Fabien COELHO wrote:
> Indeed. My gcc 4.8.4 with -Wall does not show the warning, too bad.
>
> Attached is the fixed patch for the array method.

Committed with a few tweaks, including running pgindent over some of it.

-- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company

### Re: [HACKERS] extend pgbench expressions with functions

On Wed, Feb 17, 2016 at 5:22 PM, Fabien COELHO wrote:
>> \set aid 1 + 1
>> pgbench -f addition.sql -t 5000
>>
>> I have the following:
>> HEAD: 3.5~3.7M TPS
>> list method: 3.6~3.7M TPS
>> array method: 3.4~3.5M TPS
>> So all approaches have a comparable performance.
>
> Yep, the execution trace is pretty similar in all cases, maybe with a little
> more work for the array method, although I'm surprised that the difference
> is discernible.

Nah, that's mostly noise I think. My conclusion is that all approaches are rather equivalent for the operators.

-- Michael

### Re: [HACKERS] extend pgbench expressions with functions

Hello Michaël,

> \set aid 1 + 1
> pgbench -f addition.sql -t 5000
>
> I have the following:
> HEAD: 3.5~3.7M TPS
> list method: 3.6~3.7M TPS
> array method: 3.4~3.5M TPS
> So all approaches have a comparable performance.

Yep, the execution trace is pretty similar in all cases, maybe with a little more work for the array method, although I'm surprised that the difference is discernible.

> Btw, patch 2 is returning a warning for me: [...] It is trying to compare a 32b integer with an int64 value, evalFunc needed an int64.

Indeed. My gcc 4.8.4 with -Wall does not show the warning, too bad.

Attached is the fixed patch for the array method.

-- Fabien

diff --git a/doc/src/sgml/ref/pgbench.sgml b/doc/src/sgml/ref/pgbench.sgml index ade1b53..f39f341 100644 --- a/doc/src/sgml/ref/pgbench.sgml +++ b/doc/src/sgml/ref/pgbench.sgml @@ -786,7 +786,7 @@ pgbench options dbname - + \set varname expression @@ -798,8 +798,10 @@ pgbench options dbname The expression may contain integer constants such as 5432, references to variables :variablename, and expressions composed of unary (-) or binary operators - (+, -, *, /, %) - with their usual associativity, and parentheses. + (+, -, *, /, + %) with their usual associativity, + function calls, and + parentheses. @@ -994,6 +996,62 @@ END; + + Built-In Functions + + + The following functions are built into pgbench and + may be used in conjunction with + \set. + + + + +pgbench Functions + + + + Function + Return Type + Description + Example + Result + + + + + abs(a) + same as a + integer value + abs(-17) + 17 + + + debug(a) + same as a + print to stderr the given argument + debug(5432) + 5432 + + + max(i [, ... ] ) + integer + maximum value + max(5, 4, 3, 2) + 5 + + + min(i [, ...
] ) + integer + minimum value + min(5, 4, 3, 2) + 2 + + + + + + Per-Transaction Logging diff --git a/src/bin/pgbench/exprparse.y b/src/bin/pgbench/exprparse.y index 06ee04b..cac4d5e 100644 --- a/src/bin/pgbench/exprparse.y +++ b/src/bin/pgbench/exprparse.y @@ -16,10 +16,13 @@ PgBenchExpr *expr_parse_result; +static PgBenchExprList *make_elist(PgBenchExpr *exp, PgBenchExprList *list); static PgBenchExpr *make_integer_constant(int64 ival); static PgBenchExpr *make_variable(char *varname); -static PgBenchExpr *make_op(char operator, PgBenchExpr *lexpr, +static PgBenchExpr *make_op(const char *operator, PgBenchExpr *lexpr, PgBenchExpr *rexpr); +static int find_func(const char *fname); +static PgBenchExpr *make_func(const int fnumber, PgBenchExprList *args); %} @@ -31,13 +34,15 @@ static PgBenchExpr *make_op(char operator, PgBenchExpr *lexpr, int64 ival; char *str; PgBenchExpr *expr; + PgBenchExprList *elist; } +%type elist %type expr -%type INTEGER -%type VARIABLE +%type INTEGER function +%type VARIABLE FUNCTION -%token INTEGER VARIABLE +%token INTEGER VARIABLE FUNCTION %token CHAR_ERROR /* never used, will raise a syntax error */ /* Precedence: lowest to highest */ @@ -49,16 +54,25 @@ static PgBenchExpr *make_op(char operator, PgBenchExpr *lexpr, result: expr{ expr_parse_result = $1; } +elist: { $$ = NULL; } + | expr { $$ = make_elist($1, NULL); } + | elist ',' expr { $$ = make_elist($3, $1); } + ; + expr: '(' expr ')' { $$ = $2; } | '+' expr %prec UMINUS { $$ = $2; } - | '-' expr %prec UMINUS { $$ = make_op('-', make_integer_constant(0), $2); } - | expr '+' expr { $$ = make_op('+', $1, $3); } - | expr '-' expr { $$ = make_op('-', $1, $3); } - | expr '*' expr { $$ = make_op('*', $1, $3); } - | expr '/' expr { $$ = make_op('/', $1, $3); } - | expr '%' expr { $$ = make_op('%', $1, $3); } + | '-' expr %prec UMINUS { $$ = make_op("-", make_integer_constant(0), $2); } + | expr '+' expr { $$ = make_op("+", $1, $3); } + | expr '-' expr { $$ = make_op("-", $1, $3); } + | expr 
'*' expr { $$ = make_op("*", $1, $3); } + | expr '/' expr { $$ = make_op("/", $1, $3); } + | expr '%' expr { $$ = make_op("%", $1, $3); } | INTEGER{ $$ = make_integer_constant($1); } | VARIABLE { $$ = make_variable($1); } + | function '(' elist ')'{ $$ = make_func($1, $3); } + ; + +function: FUNCTION { $$ = find_func($1); pg_free($1); } ; %% @@ -84,14 +98,131 @@ make_variable(char *varname) } static PgBenchExpr * -make_op(char operator, PgBenchExpr *lexpr, PgBenchExpr *rexpr) +make_op(const char *operator, PgBenchExpr *lexpr, PgBenchExpr *rexpr) +{ + return make_func(find_func(operator), + make_elist(rexpr, make_elist(lexpr, NULL))); +} + +/* + * List of available functions: + * - fname: function name + * - nargs: number of arguments + *
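The exprparse.y hunk above routes both operators and named functions through one table consulted by find_func(). A minimal sketch of that dispatch-table idea — the entries and layout here are illustrative, loosely modeled on the patch, not the committed table:

```c
#include <string.h>

typedef struct
{
    const char *fname;          /* function (or operator) name */
    int         nargs;          /* expected argument count, -1 = variadic */
} PgBenchFunction;

/* operators are just functions with symbolic names */
static const PgBenchFunction PGBENCH_FUNCTIONS[] = {
    {"+", 2}, {"-", 2}, {"*", 2}, {"/", 2}, {"%", 2},
    {"abs", 1}, {"debug", 1}, {"min", -1}, {"max", -1},
};

/* Return the table index for a name, or -1 so the parser can raise
 * a syntax error for an unknown function. */
static int
find_func(const char *fname)
{
    int         i;
    int         n = (int) (sizeof(PGBENCH_FUNCTIONS) / sizeof(PGBENCH_FUNCTIONS[0]));

    for (i = 0; i < n; i++)
        if (strcmp(PGBENCH_FUNCTIONS[i].fname, fname) == 0)
            return i;
    return -1;
}
```

This is the "array method" side of the list-vs-array discussion: the grammar's make_op() simply becomes make_func(find_func(operator), args).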

### Re: [HACKERS] extend pgbench expressions with functions

On Tue, Feb 16, 2016 at 11:12 PM, Fabien COELHO wrote:
> Hello Robert,
>
>>> However, for obvious reasons the committer opinion prevails:-)
>>
>> You're welcome to solicit other opinions. I'm not unwilling to give
>> way if there's a chorus of support for the way you've done it.
>
> Hmmm. I do not expect much chorus on such a minor subject:-)
>
>> But to me it sounds like you're saying that it doesn't really matter
>> whether the system is scalable and maintainable because we only have 5
>> functions right now, which is a design philosophy with which I just don't
>> agree.
>
> The design does not aim at scalability but at simplicity, otherwise I would
> have done things quite differently: with many functions the "switch" based
> selection does not scale anyway.
>
> Anyway, attached are two patches, one for each approach.
>
> The array (second patch) is not too bad if one agrees with a maximum number
> of arguments, and also as I did not change the list structure coming from
> the parser, so it does not need to manage the number of arguments in the
> function structure. The code for min/max is also simpler when dealing with
> an array instead of a linked list. I do not like much array references in
> the code, so I tried to avoid them by using lval/rval scalars for operator
> operands.
>
> Please choose the one you prefer, and I'll adapt the remaining stuff to
> your choice.

For those two patches and HEAD, it is possible to do a direct performance comparison for simple operator functions, as those are being switched to behave as functions. So doing the following for 50M "transactions" on my laptop:

\set aid 1 + 1
pgbench -f addition.sql -t 5000

I have the following:
HEAD: 3.5~3.7M TPS
list method: 3.6~3.7M TPS
array method: 3.4~3.5M TPS

So all approaches have a comparable performance.
Btw, patch 2 is returning a warning for me:

pgbench.c:1060:16: warning: comparison of constant -9223372036854775808 with expression of type 'int' is always false [-Wtautological-constant-out-of-range-compare] if (lval == PG_INT64_MIN)

It is trying to compare a 32b integer with an int64 value, evalFunc needed an int64.

-- Michael
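The PG_INT64_MIN check that Clang flagged guards the one int64 division that overflows: INT64_MIN / -1 is not representable. With the operand correctly declared as int64, the guard works; a hedged sketch of the pattern (an illustrative helper that defines the overflowing case explicitly, not the committed pgbench code):

```c
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>

static bool
int64_div(int64_t lval, int64_t rval, int64_t *result)
{
    if (rval == 0)
    {
        fprintf(stderr, "division by zero\n");
        return false;
    }
    /* -INT64_MIN does not fit in int64; evaluating lval / rval here
     * would be undefined behavior, so handle the case explicitly */
    if (lval == INT64_MIN && rval == -1)
    {
        *result = INT64_MIN;
        return true;
    }
    *result = lval / rval;
    return true;
}
```

Had lval been a plain int, as in the warned-about code, the INT64_MIN comparison would be constant-false and the guard dead — exactly what the tautological-comparison warning was pointing at.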

### Re: [HACKERS] extend pgbench expressions with functions

Hello Robert,

>> However, for obvious reasons the committer opinion prevails:-)
>
> You're welcome to solicit other opinions. I'm not unwilling to give
> way if there's a chorus of support for the way you've done it.

Hmmm. I do not expect much chorus on such a minor subject:-)

> But to me it sounds like you're saying that it doesn't really matter
> whether the system is scalable and maintainable because we only have 5
> functions right now, which is a design philosophy with which I just don't
> agree.

The design does not aim at scalability but at simplicity, otherwise I would have done things quite differently: with many functions the "switch" based selection does not scale anyway.

Anyway, attached are two patches, one for each approach.

The array (second patch) is not too bad if one agrees with a maximum number of arguments, and also as I did not change the list structure coming from the parser, so it does not need to manage the number of arguments in the function structure. The code for min/max is also simpler when dealing with an array instead of a linked list. I do not like much array references in the code, so I tried to avoid them by using lval/rval scalars for operator operands.

Please choose the one you prefer, and I'll adapt the remaining stuff to your choice.

-- Fabien.

diff --git a/doc/src/sgml/ref/pgbench.sgml b/doc/src/sgml/ref/pgbench.sgml index ade1b53..f39f341 100644 --- a/doc/src/sgml/ref/pgbench.sgml +++ b/doc/src/sgml/ref/pgbench.sgml @@ -786,7 +786,7 @@ pgbench options dbname - + \set varname expression @@ -798,8 +798,10 @@ pgbench options dbname The expression may contain integer constants such as 5432, references to variables :variablename, and expressions composed of unary (-) or binary operators - (+, -, *, /, %) - with their usual associativity, and parentheses. + (+, -, *, /, + %) with their usual associativity, + function calls, and + parentheses.
@@ -994,6 +996,62 @@ END; + + Built-In Functions + + + The following functions are built into pgbench and + may be used in conjunction with + \set. + + + + +pgbench Functions + + + + Function + Return Type + Description + Example + Result + + + + + abs(a) + same as a + integer value + abs(-17) + 17 + + + debug(a) + same as a + print to stderr the given argument + debug(5432) + 5432 + + + max(i [, ... ] ) + integer + maximum value + max(5, 4, 3, 2) + 5 + + + min(i [, ... ] ) + integer + minimum value + min(5, 4, 3, 2) + 2 + + + + + + Per-Transaction Logging diff --git a/src/bin/pgbench/exprparse.y b/src/bin/pgbench/exprparse.y index 06ee04b..cac4d5e 100644 --- a/src/bin/pgbench/exprparse.y +++ b/src/bin/pgbench/exprparse.y @@ -16,10 +16,13 @@ PgBenchExpr *expr_parse_result; +static PgBenchExprList *make_elist(PgBenchExpr *exp, PgBenchExprList *list); static PgBenchExpr *make_integer_constant(int64 ival); static PgBenchExpr *make_variable(char *varname); -static PgBenchExpr *make_op(char operator, PgBenchExpr *lexpr, +static PgBenchExpr *make_op(const char *operator, PgBenchExpr *lexpr, PgBenchExpr *rexpr); +static int find_func(const char *fname); +static PgBenchExpr *make_func(const int fnumber, PgBenchExprList *args); %} @@ -31,13 +34,15 @@ static PgBenchExpr *make_op(char operator, PgBenchExpr *lexpr, int64 ival; char *str; PgBenchExpr *expr; + PgBenchExprList *elist; } +%type elist %type expr -%type INTEGER -%type VARIABLE +%type INTEGER function +%type VARIABLE FUNCTION -%token INTEGER VARIABLE +%token INTEGER VARIABLE FUNCTION %token CHAR_ERROR /* never used, will raise a syntax error */ /* Precedence: lowest to highest */ @@ -49,16 +54,25 @@ static PgBenchExpr *make_op(char operator, PgBenchExpr *lexpr, result: expr{ expr_parse_result = $1; } +elist: { $$ = NULL; } + | expr { $$ = make_elist($1, NULL); } + | elist ',' expr { $$ = make_elist($3, $1); } + ; + expr: '(' expr ')' { $$ = $2; } | '+' expr %prec UMINUS { $$ = $2; } - | '-' expr %prec UMINUS { $$ = 
make_op('-', make_integer_constant(0), $2); } - | expr '+' expr { $$ = make_op('+', $1, $3); } - | expr '-' expr { $$ = make_op('-', $1, $3); } - | expr '*' expr { $$ = make_op('*', $1, $3); } - | expr '/' expr { $$ = make_op('/', $1, $3); } - | expr '%' expr { $$ = make_op('%', $1, $3); } + | '-' expr %prec UMINUS { $$ = make_op("-", make_integer_constant(0), $2); } + | expr '+' expr { $$ = make_op("+", $1, $3); } + | expr '-' expr { $$ = make_op("-", $1, $3); } + | expr '*' expr { $$ = make_op("*", $1, $3); } + | expr '/' expr { $$ = make_op("/", $1, $3); } + | expr '%' expr {

### Re: [HACKERS] extend pgbench expressions with functions

On Tue, Feb 16, 2016 at 5:18 AM, Fabien COELHO wrote:
>>> Good point. One simple idea here would be to use a custom pgbench
>>> script that has no SQL commands and just calculates the values of some
>>> parameters to measure the impact without depending on the backend,
>>> with a fixed number of transactions.
>>
>> Sure, we could do that. But whether it materially changes pgbench -S
>> results, say, is a lot more important.
>
> Indeed. Several runs on my laptop:
>
> ~ 40-54 tps with master using:
>   \set naccounts 10 * :scale
>   \setrandom aid 1 :naccounts
>
> ~ 43-53 tps with full function patch using:
>   \set naccounts 10 * :scale
>   \setrandom aid 1 :naccounts
>
> ~ 73-89 tps with full function patch using:
>   \set aid random(1, 10 * :scale)
>
> The performance is pretty similar on the same script. The real pain is
> variable management, avoiding some is a win.

Wow, that's pretty nice.

-- Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

### Re: [HACKERS] extend pgbench expressions with functions

On Tue, Feb 16, 2016 at 3:53 AM, Fabien COELHO wrote:
> My opinion is that the submitted version is simple and fine for the purpose,
> and the plan you suggest replaces 5*2 repetitions by a lot of code and
> complexity, which is not desirable and should be avoided.
>
> However, for obvious reasons the committer opinion prevails:-)

You're welcome to solicit other opinions. I'm not unwilling to give way if there's a chorus of support for the way you've done it. But to me it sounds like you're saying that it doesn't really matter whether the system is scalable and maintainable because we only have 5 functions right now, which is a design philosophy with which I just don't agree.

-- Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

### Re: [HACKERS] extend pgbench expressions with functions

Hello Robert,

>> Good point. One simple idea here would be to use a custom pgbench
>> script that has no SQL commands and just calculates the values of some
>> parameters to measure the impact without depending on the backend,
>> with a fixed number of transactions.
>
> Sure, we could do that. But whether it materially changes pgbench -S
> results, say, is a lot more important.

Indeed. Several runs on my laptop:

~ 40-54 tps with master using:
  \set naccounts 10 * :scale
  \setrandom aid 1 :naccounts

~ 43-53 tps with full function patch using:
  \set naccounts 10 * :scale
  \setrandom aid 1 :naccounts

~ 73-89 tps with full function patch using:
  \set aid random(1, 10 * :scale)

The performance is pretty similar on the same script. The real pain is variable management; avoiding some is a win. However, as you suggest, the tps impact even with -M prepared -S is nought, because the internal scripting time in pgbench is much smaller than the time to do actual connecting and querying.

-- Fabien.

### Re: [HACKERS] extend pgbench expressions with functions

Hello Robert,

> [...] But we can't have things that are logically part of patch #2 just
> tossed in with patch #1.

So you want integer functions without type in patch #1.

> I was in the middle of ripping that out of the patch when I realized
> that evalFunc() is pretty badly designed.

Probably, we are just at v30:-)

> What you've done is made it the job of each particular function to
> evaluate its arguments.

Yep. I did that for the multiple issues you raise below, and some you did not yet foresee: handling a variable number of arguments (0, 1, 2, 3, n), avoiding dynamic structures or arbitrary limitations, checking for a valid number of arguments, and also the fact that the evaluation call was not too horrible (2 lines per evaluation, factored out by groups of functions [operators, min/max, randoms, ...], it is not fully repeated). There are 5 sub-expression evaluations in the function, totalling 10 LOCs. TEN.

> I don't think that's a good plan.

The one you suggest does not strike me as intrinsically better: it is a trade of some code ugliness (5 repeated evals = 10 lines, a little more with the double functions, probably 20 lines) for other uglinesses (a number-of-args limit or dynamic allocation, an array length to manage and test somehow, list-to-array conversion code — things that will mean far more than the few lines of repeated code under discussion). So I think that it is just a choice between two plans, really, the better of which is debatable.

> I experimented with trying to do this and ran into a problem:

Yep. There are other problems, all of which are solvable obviously, but which means that a lot more than 10 lines will be added to avoid the 5*2 lines of repetition. My opinion is that the submitted version is simple and fine for the purpose, and the plan you suggest replaces 5*2 repetitions by a lot of code and complexity, which is not desirable and should be avoided. However, for obvious reasons the committer opinion prevails:-) After considering the various points I raised above, could you confirm that you do still require the implementation of this array approach before I spend time doing such a thing?

> I also went over your documentation changes.

Thanks, this looks better.

-- Fabien.

### Re: [HACKERS] extend pgbench expressions with functions

On Tue, Feb 16, 2016 at 1:55 AM, Michael Paquier wrote:
> On Tue, Feb 16, 2016 at 9:18 AM, Robert Haas wrote:
>> I experimented with trying to do this and ran into a problem: where
>> exactly would you store the evaluated arguments when you don't know
>> how many of them there will be? And even if you did know how many of
>> them there will be, wouldn't that mean that evalFunc or evaluateExpr
>> would have to palloc a buffer of the correct size for each invocation?
>> That's far more heavyweight than the current implementation, and
>> minimizing CPU usage inside pgbench is a concern. It would be
>> interesting to do some pgbench runs with this patch, or the final
>> patch, and see what effect it has on the TPS numbers, if any, and I
>> think we should. But the first concern is to minimize any negative
>> impact, so let's talk about how to do that.
>
> Good point. One simple idea here would be to use a custom pgbench
> script that has no SQL commands and just calculates the values of some
> parameters to measure the impact without depending on the backend,
> with a fixed number of transactions.

Sure, we could do that. But whether it materially changes pgbench -S results, say, is a lot more important.

-- Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

### Re: [HACKERS] extend pgbench expressions with functions

On Tue, Feb 16, 2016 at 9:18 AM, Robert Haas wrote:
> I experimented with trying to do this and ran into a problem: where
> exactly would you store the evaluated arguments when you don't know
> how many of them there will be? And even if you did know how many of
> them there will be, wouldn't that mean that evalFunc or evaluateExpr
> would have to palloc a buffer of the correct size for each invocation?
> That's far more heavyweight than the current implementation, and
> minimizing CPU usage inside pgbench is a concern. It would be
> interesting to do some pgbench runs with this patch, or the final
> patch, and see what effect it has on the TPS numbers, if any, and I
> think we should. But the first concern is to minimize any negative
> impact, so let's talk about how to do that.

Good point. One simple idea here would be to use a custom pgbench script that has no SQL commands and just calculates the values of some parameters to measure the impact without depending on the backend, with a fixed number of transactions.

-- Michael

### Re: [HACKERS] extend pgbench expressions with functions

On Mon, Feb 15, 2016 at 4:58 AM, Fabien COELHO wrote:
> Indeed!
>
>> Taking patch 1 as a completely independent thing, there is no need to
>> introduce PgBenchValueType yet. Similar remark for setIntValue and
>> coerceToInt. They are useful afterwards when introducing double types to be
>> able to handle double input parameters for the gaussian and other functions.
>
> Yes. This is exactly the pain I'm trying to avoid, creating a different
> implementation for the first patch, which is just overriden when the second
> part is applied...

Splitting a patch means breaking it into independently committable sub-patches, not just throwing each line of the diff into a different pile as best you can. I'm with Michael: that part doesn't belong in this patch. If we want to have an infrastructure refactoring patch that just replaces every relevant use of int64 with PgBenchValue, a union supporting only integer values, then we can do that first and have a later patch introduce double as a separate change. But we can't have things that are logically part of patch #2 just tossed in with patch #1.

I was in the middle of ripping that out of the patch when I realized that evalFunc() is pretty badly designed. What you've done is made it the job of each particular function to evaluate its arguments. I don't think that's a good plan. I think that when we discover that we've got a function, we should evaluate all of the arguments that were passed to it using common code that is shared across all types of functions and operators. Then, the evaluated arguments should be passed to the function-specific code, which can do as it likes with them. This way, you have less code that is specific to particular operations and more common code, which is generally a good thing. Every expression evaluation engine that I've ever heard of works this way - see, for example, the PostgreSQL backend.
I experimented with trying to do this and ran into a problem: where exactly would you store the evaluated arguments when you don't know how many of them there will be? And even if you did know how many of them there will be, wouldn't that mean that evalFunc or evaluateExpr would have to palloc a buffer of the correct size for each invocation? That's far more heavyweight than the current implementation, and minimizing CPU usage inside pgbench is a concern. It would be interesting to do some pgbench runs with this patch, or the final patch, and see what effect it has on the TPS numbers, if any, and I think we should. But the first concern is to minimize any negative impact, so let's talk about how to do that.

I think we should limit the number of arguments to a function to, say, 16, so that an array of int64s or PgBenchValues long enough to hold an entire argument list can be stack-allocated. The backend's limit is higher, but the only reason we need a value higher than 2 here is because you've chosen to introduce variable-argument functions; I think 16 is enough for any practical purpose. Then, I think we should also change the parse tree representation so that it transforms the linked list into an array stored within the PgBenchExpr, so that you can access the expression for argument i as expr->u.arg[i]. Then we can write this as a loop that evaluates each expression in an array of expressions and stores the result in an array of values. That seems like it would be both cleaner and faster than what you've got here right now. It's also more similar to what you did with the function name itself: the most trivial thing the parser could do is store a pointer to the function or operator name, but that would be slow, so instead it looks up the function and stores a constant.

I also went over your documentation changes. I think you inserted the new table smack dab in the middle of a section in a place that it doesn't really belong.
The version attached makes it into its own , puts it in a new section a bit further down so that it doesn't break up the flow, and has a few other tweaks that I think are improvements. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company diff --git a/doc/src/sgml/ref/pgbench.sgml b/doc/src/sgml/ref/pgbench.sgml index ade1b53..f39f341 100644 --- a/doc/src/sgml/ref/pgbench.sgml +++ b/doc/src/sgml/ref/pgbench.sgml @@ -786,7 +786,7 @@ pgbench options dbname - + \set varname expression @@ -798,8 +798,10 @@ pgbench options dbname The expression may contain integer constants such as 5432, references to variables :variablename, and expressions composed of unary (-) or binary operators - (+, -, *, /, %) - with their usual associativity, and parentheses. + (+, -, *, /, + %) with their usual associativity, + function calls, and + parentheses. @@ -994,6 +996,62 @@ END; + + Built-In Functions + + + The following functions are built into

### Re: [HACKERS] extend pgbench expressions with functions

Hello Michaël,

>> + * Recursive evaluation of int or double expressions
>> + *
>> + * Note that currently only integer variables are available, with values
>> + * stored as text.
>
> This comment is incorrect, we only care about integers in this patch.

Indeed!

> Taking patch 1 as a completely independent thing, there is no need to
> introduce PgBenchValueType yet. Similar remark for setIntValue and
> coerceToInt. They are useful afterwards when introducing double types to be
> able to handle double input parameters for the gaussian and other functions.

Yes. This is exactly the pain I'm trying to avoid: creating a different implementation for the first patch, which is just overridden when the second part is applied... So I'm trying to compromise, having a several-part patch *but* having the infrastructure ready for the second patch which adds the double type. Note that the first patch without the second is a loss of time for everyone, as nearly the only useful functions are the randoms, which require a double argument, so it does not make sense to apply the first one if the second one is not to be applied, I think.

> [...] (INT64_MIN / -1) should error. (INT64_MIN % -1) should result in 0.
> This is missing the division handling.

Oops, indeed I totally messed up when merging the handling of / and %:-(

I have found another issue in the (a) patch: the internal scripts were using the future random functions, which do not yet exist, as they are in patch (b).

Here is a three-part v30, which still includes the infrastructure for future types in the a patch, see my argumentation above.
-- Fabien.diff --git a/doc/src/sgml/ref/pgbench.sgml b/doc/src/sgml/ref/pgbench.sgml index ade1b53..9d5eb32 100644 --- a/doc/src/sgml/ref/pgbench.sgml +++ b/doc/src/sgml/ref/pgbench.sgml @@ -798,8 +798,11 @@ pgbench options dbname The expression may contain integer constants such as 5432, references to variables :variablename, and expressions composed of unary (-) or binary operators - (+, -, *, /, %) - with their usual associativity, and parentheses. + (+, -, *, /, + %) with their usual associativity, function calls and + parentheses. + shows the available + functions. @@ -965,6 +968,52 @@ f(x) = exp(-parameter * (x - min) / (max - min + 1)) / (1.0 - exp(-parameter)) + + +PgBench Functions + + + + Function + Return Type + Description + Example + Result + + + + + abs(a) + same as a + integer value + abs(-17) + 17 + + + debug(a) + same asa + print to stderr the given argument + debug(5432) + 5432 + + + max(i [, ... ] ) + integer + maximum value + max(5, 4, 3, 2) + 5 + + + min(i [, ... ] ) + integer + minimum value + min(5, 4, 3, 2) + 2 + + + + + As an example, the full definition of the built-in TPC-B-like transaction is: diff --git a/src/bin/pgbench/exprparse.y b/src/bin/pgbench/exprparse.y index 06ee04b..93c6173 100644 --- a/src/bin/pgbench/exprparse.y +++ b/src/bin/pgbench/exprparse.y @@ -16,10 +16,13 @@ PgBenchExpr *expr_parse_result; +static PgBenchExprList *make_elist(PgBenchExpr *exp, PgBenchExprList *list); static PgBenchExpr *make_integer_constant(int64 ival); static PgBenchExpr *make_variable(char *varname); -static PgBenchExpr *make_op(char operator, PgBenchExpr *lexpr, +static PgBenchExpr *make_op(const char *operator, PgBenchExpr *lexpr, PgBenchExpr *rexpr); +static int find_func(const char *fname); +static PgBenchExpr *make_func(const int fnumber, PgBenchExprList *args); %} @@ -31,13 +34,15 @@ static PgBenchExpr *make_op(char operator, PgBenchExpr *lexpr, int64 ival; char *str; PgBenchExpr *expr; + PgBenchExprList *elist; } +%type elist %type expr 
-%type INTEGER -%type VARIABLE +%type INTEGER function +%type VARIABLE FUNCTION -%token INTEGER VARIABLE +%token INTEGER VARIABLE FUNCTION %token CHAR_ERROR /* never used, will raise a syntax error */ /* Precedence: lowest to highest */ @@ -49,16 +54,25 @@ static PgBenchExpr *make_op(char operator, PgBenchExpr *lexpr, result: expr{ expr_parse_result = $1; } +elist: { $$ = NULL; } + | expr { $$ = make_elist($1, NULL); } + | elist ',' expr { $$ = make_elist($3, $1); } + ; + expr: '(' expr ')' { $$ = $2; } | '+' expr %prec UMINUS { $$ = $2; } - | '-' expr %prec UMINUS { $$ = make_op('-', make_integer_constant(0), $2); } - | expr '+' expr { $$ = make_op('+', $1, $3); } - | expr '-' expr { $$ = make_op('-', $1, $3); } - | expr '*' expr { $$ = make_op('*', $1, $3); } - | expr '/' expr { $$ = make_op('/', $1, $3); } - | expr '%' expr { $$ = make_op('%', $1, $3); } + | '-' expr %prec UMINUS { $$ = make_op("-", make_integer_constant(0), $2); } + |

### Re: [HACKERS] extend pgbench expressions with functions

On Mon, Feb 15, 2016 at 5:21 AM, Robert Haas wrote:
> On Sun, Feb 14, 2016 at 11:28 AM, Fabien COELHO wrote:
>> Here is a 3 part v29:
>>
>> a) add support for integer functions in pgbench, including turning
>>    operators as functions, as well as some minimal infrastructure for
>>    additional types (this allows to minimize the diffs with the next
>>    patch which adds double).
>>
>> b) add support for doubles, including setting double variables.
>>    Note that variable are not explicitely typed. Add some
>>    double-related functions, most interesting of them for pgbench
>>    are the randoms.
>>
>> c) remove \setrandom support (as thanks to part b \set x random(...) does
>>    the same). Prints an error pointing to the replacement if \setrandom is
>>    used in a pgbench script.
>
> Thanks, I'll review these as soon as I can get to it.

I got around to looking at (a) in this set.

+ if ((PGBENCH_FUNCTIONS[fnumber].nargs >= 0 &&
+      PGBENCH_FUNCTIONS[fnumber].nargs != elist_length(args)) ||
+     /* check at least one arg for min & max */
+     (PGBENCH_FUNCTIONS[fnumber].nargs == -1 &&
+      elist_length(args) == 0))
+     expr_yyerror_more("unexpected number of arguments",
+                       PGBENCH_FUNCTIONS[fnumber].fname);

We could split that into two parts, each one with a more precise error message:
- For functions that expect at least one argument: "at least one argument was expected, none found".
- For functions that expect N arguments: "N arguments expected, but M found".

+ "\\set aid random(1, " CppAsString2(naccounts) " * :scale)\n"
  "SELECT abalance FROM pgbench_accounts WHERE aid = :aid;\n" } };
- /* Function prototypes */

Noise.

+ * Recursive evaluation of int or double expressions
+ *
+ * Note that currently only integer variables are available, with values
+ * stored as text.

This comment is incorrect, we only care about integers in this patch. Taking patch 1 as a completely independent thing, there is no need to introduce PgBenchValueType yet. Similar remark for setIntValue and coerceToInt. They are useful afterwards when introducing double types to be able to handle double input parameters for the gaussian and other functions.

- /*
-  * INT64_MIN / -1 is problematic, since the result
-  * can't be represented on a two's-complement machine.
-  * Some machines produce INT64_MIN, some produce zero,
-  * some throw an exception. We can dodge the problem
-  * by recognizing that division by -1 is the same as
-  * negation.
-  */
- if (rval == -1)
+ if (coerceToInt() == -1)
  {
-     *retval = -lval;
-
-     /* overflow check (needed for INT64_MIN) */
-     if (lval == PG_INT64_MIN)
-     {
-         fprintf(stderr, "bigint out of range\n");
-         return false;
-     }
+     setIntValue(retval, 0);
+     return true;
  }

(INT64_MIN / -1) should error. (INT64_MIN % -1) should result in 0. This is missing the division handling.

-- Michael

### Re: [HACKERS] extend pgbench expressions with functions

On Sun, Feb 14, 2016 at 11:28 AM, Fabien COELHO wrote:
> Here is a 3 part v29:
>
> a) add support for integer functions in pgbench, including turning
>    operators as functions, as well as some minimal infrastructure for
>    additional types (this allows to minimize the diffs with the next
>    patch which adds double).
>
> b) add support for doubles, including setting double variables.
>    Note that variable are not explicitely typed. Add some
>    double-related functions, most interesting of them for pgbench
>    are the randoms.
>
> c) remove \setrandom support (as thanks to part b \set x random(...) does
>    the same). Prints an error pointing to the replacement if \setrandom is
>    used in a pgbench script.

Thanks, I'll review these as soon as I can get to it.

-- Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

### Re: [HACKERS] extend pgbench expressions with functions

Hello Michaël, I'll be happy if you do the review of the resulting split. OK, I am fine with this scenario as well. I have luckily done nothing yet. Here is a 3 part v29: a) add support for integer functions in pgbench, including turning operators as functions, as well as some minimal infrastructure for additional types (this allows to minimize the diffs with the next patch which adds double). b) add support for doubles, including setting double variables. Note that variable are not explicitely typed. Add some double-related functions, most interesting of them for pgbench are the randoms. c) remove \setrandom support (as thanks to part b \set x random(...) does the same). Prints an error pointing to the replacement if \setrandom is used in a pgbench script. -- Fabien.diff --git a/doc/src/sgml/ref/pgbench.sgml b/doc/src/sgml/ref/pgbench.sgml index ade1b53..9d5eb32 100644 --- a/doc/src/sgml/ref/pgbench.sgml +++ b/doc/src/sgml/ref/pgbench.sgml @@ -798,8 +798,11 @@ pgbench options dbname The expression may contain integer constants such as 5432, references to variables :variablename, and expressions composed of unary (-) or binary operators - (+, -, *, /, %) - with their usual associativity, and parentheses. + (+, -, *, /, + %) with their usual associativity, function calls and + parentheses. + shows the available + functions. @@ -965,6 +968,52 @@ f(x) = exp(-parameter * (x - min) / (max - min + 1)) / (1.0 - exp(-parameter)) + + +PgBench Functions + + + + Function + Return Type + Description + Example + Result + + + + + abs(a) + same as a + integer value + abs(-17) + 17 + + + debug(a) + same asa + print to stderr the given argument + debug(5432) + 5432 + + + max(i [, ... ] ) + integer + maximum value + max(5, 4, 3, 2) + 5 + + + min(i [, ... 
] ) + integer + minimum value + min(5, 4, 3, 2) + 2 + + + + + As an example, the full definition of the built-in TPC-B-like transaction is: diff --git a/src/bin/pgbench/exprparse.y b/src/bin/pgbench/exprparse.y index 06ee04b..93c6173 100644 --- a/src/bin/pgbench/exprparse.y +++ b/src/bin/pgbench/exprparse.y @@ -16,10 +16,13 @@ PgBenchExpr *expr_parse_result; +static PgBenchExprList *make_elist(PgBenchExpr *exp, PgBenchExprList *list); static PgBenchExpr *make_integer_constant(int64 ival); static PgBenchExpr *make_variable(char *varname); -static PgBenchExpr *make_op(char operator, PgBenchExpr *lexpr, +static PgBenchExpr *make_op(const char *operator, PgBenchExpr *lexpr, PgBenchExpr *rexpr); +static int find_func(const char *fname); +static PgBenchExpr *make_func(const int fnumber, PgBenchExprList *args); %} @@ -31,13 +34,15 @@ static PgBenchExpr *make_op(char operator, PgBenchExpr *lexpr, int64 ival; char *str; PgBenchExpr *expr; + PgBenchExprList *elist; } +%type elist %type expr -%type INTEGER -%type VARIABLE +%type INTEGER function +%type VARIABLE FUNCTION -%token INTEGER VARIABLE +%token INTEGER VARIABLE FUNCTION %token CHAR_ERROR /* never used, will raise a syntax error */ /* Precedence: lowest to highest */ @@ -49,16 +54,25 @@ static PgBenchExpr *make_op(char operator, PgBenchExpr *lexpr, result: expr{ expr_parse_result = $1; } +elist: { $$ = NULL; } + | expr { $$ = make_elist($1, NULL); } + | elist ',' expr { $$ = make_elist($3, $1); } + ; + expr: '(' expr ')' { $$ = $2; } | '+' expr %prec UMINUS { $$ = $2; } - | '-' expr %prec UMINUS { $$ = make_op('-', make_integer_constant(0), $2); } - | expr '+' expr { $$ = make_op('+', $1, $3); } - | expr '-' expr { $$ = make_op('-', $1, $3); } - | expr '*' expr { $$ = make_op('*', $1, $3); } - | expr '/' expr { $$ = make_op('/', $1, $3); } - | expr '%' expr { $$ = make_op('%', $1, $3); } + | '-' expr %prec UMINUS { $$ = make_op("-", make_integer_constant(0), $2); } + | expr '+' expr { $$ = make_op("+", $1, $3); } + | 
expr '-' expr { $$ = make_op("-", $1, $3); } + | expr '*' expr { $$ = make_op("*", $1, $3); } + | expr '/' expr { $$ = make_op("/", $1, $3); } + | expr '%' expr { $$ = make_op("%", $1, $3); } | INTEGER{ $$ = make_integer_constant($1); } | VARIABLE { $$ = make_variable($1); } + | function '(' elist ')'{ $$ = make_func($1, $3); } + ; + +function: FUNCTION { $$ = find_func($1); pg_free($1); } ; %% @@ -68,8 +82,9 @@ make_integer_constant(int64 ival) { PgBenchExpr *expr = pg_malloc(sizeof(PgBenchExpr)); - expr->etype = ENODE_INTEGER_CONSTANT; - expr->u.integer_constant.ival = ival; + expr->etype = ENODE_CONSTANT; + expr->u.constant.type = PGBT_INT; + expr->u.constant.u.ival = ival; return expr; } @@ -84,14 +99,128

### Re: [HACKERS] extend pgbench expressions with functions

On Sun, Feb 14, 2016 at 4:42 PM, Fabien COELHO wrote:
>> So I would be fine to do a portion of the legwork and extract from this
>> patch something smaller that adds only functions as a first step, with the
>> minimum set of functions I mentioned upthread. Robert, Alvaro, Fabien, does
>> that sound fine to you?
>
> Thanks, but this is my c*, I have a few hours of travelling this evening,
> I'll do it then.
>
> I'll be happy if you do the review of the resulting split.

OK, I am fine with this scenario as well. I have luckily done nothing yet.

-- Michael

### Re: [HACKERS] extend pgbench expressions with functions

On Sat, Feb 13, 2016 at 4:37 PM, Fabien COELHO wrote:
>> For example, I just realized that this patch allows values to be
>> either a double or an integer and extends the operators to handle
>> double values. But variables can still only be integers.
>
> Indeed.

That's exactly the first impression I got about this patch when sending [1].
[1]: http://www.postgresql.org/message-id/cab7npqrhvfvnrahamaroniedec90pf3-wn+u5ccb4red-ht...@mail.gmail.com
By focusing only on integers the patch would largely gain in simplicity. Being able to have double values is actually useful only for nested function calls, nothing else.

>> I don't think variables should be explicitly typed but they should be able
>> to store a value of any type that expression evaluation can generate.
>
> Doubles are not really needed that much, it is just to provide something to
> random_* functions parameter, otherwise it is useless as far as pgbench is
> really concerned.

+1.

>> Also, as I said back in November, there's really two completely
>> separate enhancements in here. One of them is to support a new data
>> type (doubles) and the other is to support functions.
>
> Yep. The first part is precisely the patch I initially submitted 5 CF ago.
> Then I'm asked to put more things in it to show that it can indeed handle
> another type. Then I'm told "you should not have done that". What can I say?

I think that you did your job here by answering the comments of all the reviewers that had a look at this patch. That's not an easy task.

>> [...] I find implementing operators as functions in disguise not to be one
>> of PostgreSQL's most awesome design decisions, and here we are copying that
>> into pgbench for, well, no benefit that I can see, really.
>
> Well, I did that initially, then I was asked to implements operators as
> functions. It probably saves some lines, so it is not too bad, even if the
> benefit is limited.

Put the blame on me for this one then. I suggested this idea because Postgres is doing the same, and because it slightly simplifies the union structure in charge of holding the parsed structures, making the code a bit more readable IMO.

>> [...] If neither of you are willing to split this patch, I'm not willing
>> to commit it.
>
> If I'm reading you correctly, you would consider committing it:
>
> - if the function & double stuff are separated ?
> - for the double part, if variables can be double ?

I just double-checked and could not see a clear use case mentioned in this thread for double return types, so I would suggest focusing on the integer portion with min(), max(), abs(), debug() and the existing functions refactored. That's what your first versions did. If someone is wishing to implement double types, this someone could do it; the infrastructure that this patch puts in place has already proved that it can be easily extended.

-- Michael

### Re: [HACKERS] extend pgbench expressions with functions

Hello Michaël,

>> - if the function & double stuff are separated?
>> - for the double part, if variables can be double?
>
> I just double-checked and could not see a clear use case mentioned in this
> thread for double return types,

Alas, there is one: the non-uniform random functions, which take a double parameter. Once the random functions are there, the \setrandom horror code could be removed, which would be a real benefit, IMO :-) So I see a good case to have some support for doubles.

> so I would suggest focusing on the integer portion with min(), max(),
> abs(), debug() and the existing functions refactored. That's what your
> first versions did. If someone wishes to implement double types, they
> could do it; the infrastructure that this patch puts in place has already
> proved that it can be easily extended.

Adding doubles is not too big a deal. I just stopped short of double variables because I could not see any realistic use for them. My idea was to postpone that until it is actually needed, "never" being the most probable outcome. Now if this is a blocker for the committer, then I will make the effort, whatever I think of the usefulness of the feature.

--
Fabien.

### Re: [HACKERS] extend pgbench expressions with functions

> But variables can still only be integers.

Version 28 attached has double variables, although typing is based on guessing because a variable is just a string.

Any comment?

> [...] two separate enhancements in here. One of them is to support a new
> data type (doubles) and the other is to support functions.

The two features are highly intermixed, so the result can only be dependent patches: first add the function infrastructure, with probably some support for doubles even though it would not be used yet, then add doubles & their functions. A real pain is the documentation, because it means writing documentation with only integer functions, then overwriting it with doubles. This is dumb work, really, for the sake of "a cleaner git history", the beauty of which no one will ever contemplate...

--
Fabien.

[Attachment: a diff to doc/src/sgml/ref/pgbench.sgml (garbled in the archive) updating the \set description to cover double constants such as 3.14159, function calls and a table of available functions, and replacing the detailed \setrandom uniform/gaussian/exponential prose — the PHI-based gaussian formula with its Box-Muller note and the exponential formula f(x) = exp(-parameter * (x - min) / (max - min + 1)) / (1.0 - exp(-parameter)) — with equivalences such as "\setrandom n 1 10 is equivalent to \set n random(1, 10)" and "\setrandom n 1 10 exponential 3.0 is equivalent to \set n random_exponential(1, 10, 3.0)".]

### Re: [HACKERS] extend pgbench expressions with functions

On Sun, Feb 14, 2016 at 12:37 AM, Fabien COELHO wrote:
> The two features are highly intermixed, so the result can only be
> dependent patches: first add the function infrastructure, with probably
> some support for doubles even though it would not be used yet, then add
> doubles & their functions.
>
> A real pain is the documentation, because it means writing documentation
> with only integer functions, then overwriting it with doubles. This is
> dumb work, really, for the sake of "a cleaner git history", the beauty of
> which no one will ever contemplate...

FWIW, I care a lot about splitting patches wherever that gives a clean history. So I would be fine to do a portion of the legwork and extract from this patch something smaller that adds only functions as a first step, with the minimum set of functions I mentioned upthread. Robert, Alvaro, Fabien, does that sound fine to you?
--
Michael

### Re: [HACKERS] extend pgbench expressions with functions

On Sat, Feb 13, 2016 at 6:19 PM, Michael Paquier wrote:
> On Sun, Feb 14, 2016 at 12:37 AM, Fabien COELHO wrote:
>> The two features are highly intermixed, so the result can only be
>> dependent patches: first add the function infrastructure, with probably
>> some support for doubles even though it would not be used yet, then add
>> doubles & their functions.
>>
>> A real pain is the documentation, because it means writing documentation
>> with only integer functions, then overwriting it with doubles. This is
>> dumb work, really, for the sake of "a cleaner git history", the beauty of
>> which no one will ever contemplate...
>
> FWIW, I care a lot about splitting patches wherever that gives a clean
> history. So I would be fine to do a portion of the legwork and extract
> from this patch something smaller that adds only functions as a first
> step, with the minimum set of functions I mentioned upthread. Robert,
> Alvaro, Fabien, does that sound fine to you?

I'd be delighted. I would really like to get this feature in, but I'm not going to do it if it requires an unreasonable amount of work on my part, and what you propose would help a lot.
--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

### Re: [HACKERS] extend pgbench expressions with functions

On Sat, Feb 13, 2016 at 10:37 AM, Fabien COELHO wrote:
> A real pain is the documentation, because it means writing documentation
> with only integer functions, then overwriting it with doubles. This is
> dumb work, really, for the sake of "a cleaner git history", the beauty of
> which no one will ever contemplate...

You know, you make comments like this pretty much every time anybody suggests that you should change anything in any patch you write. It doesn't matter whether the change is suggested by Heikki, or by Michael, or by Andres, or by me. You pretty much always come back and say something that amounts to "changing the patch I already wrote is a waste of time". That gets a little disheartening after a while.

This community's procedure is that patches have to be reviewed and reach consensus in order to get committed, and in my opinion that is generally not "dumb" but rather something that enhances the end result. I value your contributions to the community and I hope you will continue making them, but I don't like it when people call my ideas dumb.
--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

### Re: [HACKERS] extend pgbench expressions with functions

Hello Robert,

> You know, you make comments like this pretty much every time anybody
> suggests that you should change anything in any patch you write.

Well, not every time... I'm tired of doing and undoing things: I do a limited A because I'm cautious not to spend too much time that would go down the drain. Then I'm told "do A+B" to show that A is worthwhile. Then I'm told "you should not have done B with A, but submit A for the infrastructure and then B for the additional features", which was precisely my initial intent... So it is really this going forward and backward in the process that makes me write these remarks.

> [...] I don't like it when people call my ideas dumb.

Your idea is not dumb. I'm sorry if I implied that, and I apologise. I like a clean history as much as everyone else. Having to unmix features and documentation is still "dumb work", even if the purpose is not dumb; I stand by that word for that part.

--
Fabien.

### Re: [HACKERS] extend pgbench expressions with functions

> So I would be fine to do a portion of the legwork and extract from this
> patch something smaller that adds only functions as a first step, with the
> minimum set of functions I mentioned upthread. Robert, Alvaro, Fabien,
> does that sound fine to you?

Thanks, but this is my c*, I have a few hours of travelling this evening, so I'll do it then. I'll be happy if you review the resulting split.

--
Fabien.

### Re: [HACKERS] extend pgbench expressions with functions

On Sun, Feb 14, 2016 at 10:05 AM, Robert Haas wrote:
> On Sat, Feb 13, 2016 at 6:19 PM, Michael Paquier wrote:
>> On Sun, Feb 14, 2016 at 12:37 AM, Fabien COELHO wrote:
>>> The two features are highly intermixed, so the result can only be
>>> dependent patches: first add the function infrastructure, with probably
>>> some support for doubles even though it would not be used yet, then add
>>> doubles & their functions.
>>>
>>> A real pain is the documentation, because it means writing documentation
>>> with only integer functions, then overwriting it with doubles. This is
>>> dumb work, really, for the sake of "a cleaner git history", the beauty
>>> of which no one will ever contemplate...
>>
>> FWIW, I care a lot about splitting patches wherever that gives a clean
>> history. So I would be fine to do a portion of the legwork and extract
>> from this patch something smaller that adds only functions as a first
>> step, with the minimum set of functions I mentioned upthread. Robert,
>> Alvaro, Fabien, does that sound fine to you?
>
> I'd be delighted. I would really like to get this feature in, but I'm
> not going to do it if it requires an unreasonable amount of work on my
> part, and what you propose would help a lot.

OK, I'll see about producing a patch for this basic infrastructure then, with the rest built on top of it as a secondary patch.
--
Michael

### Re: [HACKERS] extend pgbench expressions with functions

On Fri, Feb 12, 2016 at 5:06 AM, Fabien COELHO wrote:
> I think that this option is too much bother for the small internal purpose
> at hand.

Yeah, I'm really frustrated with this whole thread. There's been a huge amount of discussion on this patch, but I don't really feel like it's converging towards something I could actually commit.

For example, I just realized that this patch allows values to be either a double or an integer and extends the operators to handle double values. But variables can still only be integers. Please name any programming language that has such a restriction. The only ones I can think of are command shells, and those at least flatten everything to a string rather than an integer, so that you can store the value without loss of precision, just with loss of type-safety. I think designing this in this way is quite short-sighted. I don't think variables should be explicitly typed, but they should be able to store a value of any type that expression evaluation can generate.

Also, as I said back in November, there are really two completely separate enhancements in here. One of them is to support a new data type (doubles) and the other is to support functions. Those should really be separate patches. If neither of you is willing to split this patch, I'm not willing to commit it. Going over half of this patch at a time and getting it committed is going to be a lot of work for me, but I'm willing to do it. I'm not willing to go over the whole thing at once; that's going to take more time than I have, and produce what will in my opinion be an inferior commit history. If somebody else is willing to commit the whole thing as one patch, I'm not going to object, but I won't do it myself.

I would not worry too much about the list thing at this point. I'm sure something simple is fine for that. I actually think the patch is now in decent shape as far as code cleanliness is concerned, but I'm not sure we've really looked hard enough at the design. For example, I find implementing operators as functions in disguise not to be one of PostgreSQL's most awesome design decisions, and here we are copying that into pgbench for, well, no benefit that I can see, really. Maybe it's a good idea and maybe it's a bad idea, but how do we know? That stuff deserves more discussion. Code cleanup is good, so it's good that a lot of effort has been put in there, but I don't think more code cleanup is what's going to get this patch over the finish line.
--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

### Re: [HACKERS] extend pgbench expressions with functions

Hello Michaël,

> Using a pointer to the tail of the list would make the code simple, and
> save a couple of lines.

I did that, see v27 attached. Note that it does not save any lines: the reverse function is removed, but then you need another type to keep the head & tail (the link type is not enough), and you have to manage that extra structure in the code. Whether it is "simpler" is debatable, and it probably costs more tests when executed. However, it really saves having to answer the question "why is the list reversed?", which is definitely a win from my point of view :-)

> Another thing that could be considered is also to move list.c [...]

I think that this option is too much bother for the small internal purpose at hand.

--
Fabien.

[Attachment: a diff to doc/src/sgml/ref/pgbench.sgml (v27; garbled in the archive) updating the \set description for double constants, function calls and the functions table, and replacing the detailed \setrandom uniform/gaussian/exponential documentation with equivalences to random(), random_gaussian() and random_exponential().]

### Re: [HACKERS] extend pgbench expressions with functions

Hello Robert,

> For example, I just realized that this patch allows values to be either a
> double or an integer and extends the operators to handle double values.
> But variables can still only be integers.

Indeed.

> [...] at least flatten everything to string rather than integer so that
> you can store the value without loss of precision - just with loss of
> type-safety. I think designing this in this way is quite short-sighted.

Note that I'm not responsible for this design, which is preexisting. Extending variables to be able to store doubles could also be done in another patch.

> I don't think variables should be explicitly typed but they should be able
> to store a value of any type that expression evaluation can generate.

Doubles are not really needed that much; they are just there to provide something for the random_* functions' parameters, otherwise they are useless as far as pgbench is concerned.

> Also, as I said back in November, there are really two completely separate
> enhancements in here. One of them is to support a new data type (doubles)
> and the other is to support functions.

Yep. The first part is precisely the patch I initially submitted five CFs ago. Then I was asked to put more things in it to show that it could indeed handle another type. Then I'm told "you should not have done that". What can I say?

> Those should really be separate patches.

They could.

> [...] I find implementing operators as functions in disguise not to be one
> of PostgreSQL's most awesome design decisions, and here we are copying
> that into pgbench for, well, no benefit that I can see, really.

Well, I did that initially, then I was asked to implement operators as functions. It probably saves some lines, so it is not too bad, even if the benefit is limited.

> Maybe it's a good idea and maybe it's a bad idea, but how do we know?

This is just pgbench, a tool for testing performance by running dummy transactions, not a production thing, so I think that it really does not matter. There are no user-visible changes wrt operators.

> [...] If neither of you are willing to split this patch, I'm not willing
> to commit it.

If I'm reading you correctly, you would consider committing it:

- if the function & double stuff are separated?
- for the double part, if variables can be double?

--
Fabien.

### Re: [HACKERS] extend pgbench expressions with functions

On Thu, Feb 11, 2016 at 12:37 AM, Fabien COELHO wrote:
> v26 attached implements these changes.

+/* the argument list has been built in reverse order, it is fixed here */
+expr->u.function.args = reverse_elist(args);

Hm. I may be missing something, but why is that necessary? This is basically doing a double reversal to put all the arguments in the correct order when parsing the function arguments.
--
Michael

### Re: [HACKERS] extend pgbench expressions with functions

On Fri, Feb 12, 2016 at 2:41 AM, Fabien COELHO wrote:
> Hello Michaël,
>
>> +/* the argument list has been built in reverse order, it is fixed here */
>> +expr->u.function.args = reverse_elist(args);
>>
>> Hm. I may be missing something, but why is that necessary? This is
>> basically doing a double reversal to put all the arguments in the correct
>> order when parsing the function arguments.
>
> This is because the expression list is parsed left to right and the list
> is built as a stack, to avoid looking for the last argument when appending
> the next expression; the list is therefore in reverse order at the end of
> parsing, so it is reversed once to make it right. This way the complexity
> is kept at O(n).
>
> If this is too much I can switch to O(n**2) by appending each expression
> at the end of the list.

(this one has been mentioned by Alvaro offlist) Using a pointer to the tail of the list would make the code simple, and save a couple of lines. Another thing that could be considered is to move list.c and pg_list.h into src/common and reuse that. There are other frontend utilities that emulate the same kind of facilities; have a look for example at the copycats in pg_dump and pg_basebackup.
--
Michael

### Re: [HACKERS] extend pgbench expressions with functions

Hello Michaël,

> +/* the argument list has been built in reverse order, it is fixed here */
> +expr->u.function.args = reverse_elist(args);
>
> Hm. I may be missing something, but why is that necessary? This is
> basically doing a double reversal to put all the arguments in the correct
> order when parsing the function arguments.

This is because the expression list is parsed left to right and the list is built as a stack, to avoid looking for the last argument when appending the next expression. The list is therefore in reverse order at the end of parsing, so it is reversed once to make it right. This way the complexity is kept at O(n).

If this is too much I can switch to O(n**2) by appending each expression at the end of the list.

--
Fabien.

### Re: [HACKERS] extend pgbench expressions with functions

On Tue, Feb 9, 2016 at 5:06 AM, Alvaro Herrera wrote:
> Fabien COELHO wrote:
>>
>> Hello,
>>
>>> v23 attached, which does not change the message but does the other fixes.
>>>
>>> This doesn't apply anymore
>>
>> Indeed, but the latest version was really v25.
>>
>>> -- please rebase and submit to the next CF.
>>
>> I already provided it as v25 on Feb 1st.
>>
>>> I closed it here as returned with feedback.
>>
>> I turned it to "moved to next CF" as the patch is already on the queue.
>
> Ah, I didn't see v25 anywhere. What you did should be fine, thanks.

I just had another look at this patch.

+ parameter% of the time.
Nitpick: double space here.

+ switch (func) { [...]
+ }
+ default:
+ return false;
}
In evalFunc(), the default case in the switch for the operator functions should never be reached. Adding, for example, Assert(0) is something to consider.

PGBT_NONE and PGBENCH_NONE are used nowhere. Why not remove them, or replace the code paths where they would be used with assertions, to trigger errors for future developments?

Other than that the patch looks in pretty good shape to me.
--
Michael

### Re: [HACKERS] extend pgbench expressions with functions

Hello Michaël,

> + parameter% of the time.
> Nitpick: double space here.

Ok.

> + default:
> + return false;
> }
> In evalFunc(), the default case in the switch for the operator functions
> should never be reached. Adding, for example, Assert(0) is something to
> consider.

Ok for an Assert plus a comment.

> PGBT_NONE and PGBENCH_NONE are used nowhere. Why not remove them

Ok.

v26 attached implements these changes.

--
Fabien.

[Attachment: a diff to doc/src/sgml/ref/pgbench.sgml (v26; garbled in the archive) updating the \set description for double constants, function calls and the functions table, and replacing the detailed \setrandom uniform/gaussian/exponential documentation — including the PHI-based gaussian formula, its Box-Muller performance note with minimum parameter 2.0, and the exponential formula — with equivalences such as "\setrandom n 1 10 gaussian 2.0 is equivalent to \set n random_gaussian(1, 10, 2.0)".]

### Re: [HACKERS] extend pgbench expressions with functions

Hello,

>> v23 attached, which does not change the message but does the other fixes.
>
> This doesn't apply anymore

Indeed, but the latest version was really v25.

> -- please rebase and submit to the next CF.

I already provided it as v25 on Feb 1st.

> I closed it here as returned with feedback.

I turned it to "moved to next CF" as the patch is already on the queue.

-- Fabien.

### Re: [HACKERS] extend pgbench expressions with functions

Fabien COELHO wrote:
> Hello Michaël,
>
> v23 attached, which does not change the message but does the other fixes.

This doesn't apply anymore -- please rebase and submit to the next CF. I closed it here as returned with feedback. Thanks!

-- Álvaro Herrera    http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

### Re: [HACKERS] extend pgbench expressions with functions

Fabien COELHO wrote:
> Hello,
>
>>> v23 attached, which does not change the message but does the other fixes.
>>
>> This doesn't apply anymore
>
> Indeed, but the latest version was really v25.
>
>> -- please rebase and submit to the next CF.
>
> I already provided it as v25 on Feb 1st.
>
>> I closed it here as returned with feedback.
>
> I turned it to "moved to next CF" as the patch is already on the queue.

Ah, I didn't see v25 anywhere. What you did should be fine, thanks.

-- Álvaro Herrera    http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

### Re: [HACKERS] extend pgbench expressions with functions

On Mon, Feb 1, 2016 at 9:46 PM, Michael Paquier wrote:
> OK, here are patches for 9.1~9.4. The main differences are that in 9.3/9.4 int64 is used for the division operations, and in 9.2/9.1 that's int32. In the latter case pgbench blows up the same way with that:
> \set i -2147483648
> \set i :i / -1
> select :i;
> In those patches INT32_MIN/INT64_MIN need to be explicitly set as well at the top of pgbench.c. I thing that's fine.

I thing so too. Committed.

-- Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

### Re: [HACKERS] extend pgbench expressions with functions

On Wed, Feb 3, 2016 at 11:28 PM, Robert Haas wrote:
> On Mon, Feb 1, 2016 at 9:46 PM, Michael Paquier wrote:
>> OK, here are patches for 9.1~9.4. The main differences are that in 9.3/9.4 int64 is used for the division operations, and in 9.2/9.1 that's int32. In the latter case pgbench blows up the same way with that:
>> \set i -2147483648
>> \set i :i / -1
>> select :i;
>> In those patches INT32_MIN/INT64_MIN need to be explicitly set as well at the top of pgbench.c. I thing that's fine.
>
> I thing so too. Committed.

Thanks for thinging so.

-- Michael

### Re: [HACKERS] extend pgbench expressions with functions

Yet another rebase, so as to propagate the same special case checks int for / and %. -- Fabien.diff --git a/doc/src/sgml/ref/pgbench.sgml b/doc/src/sgml/ref/pgbench.sgml index ade1b53..ebaf4e5 100644 --- a/doc/src/sgml/ref/pgbench.sgml +++ b/doc/src/sgml/ref/pgbench.sgml @@ -796,17 +796,21 @@ pgbench options dbname Sets variable varname to an integer value calculated from expression. The expression may contain integer constants such as 5432, - references to variables :variablename, + double constants such as 3.14159, + references to integer variables :variablename, and expressions composed of unary (-) or binary operators - (+, -, *, /, %) - with their usual associativity, and parentheses. + (+, -, *, /, + %) with their usual associativity, function calls and + parentheses. + shows the available + functions. Examples: \set ntellers 10 * :scale -\set aid (1021 * :aid) % (10 * :scale) + 1 +\set aid (1021 * random(1, 10 * :scale)) % (10 * :scale) + 1 @@ -826,66 +830,35 @@ pgbench options dbname - By default, or when uniform is specified, all values in the - range are drawn with equal probability. Specifying gaussian - or exponential options modifies this behavior; each - requires a mandatory parameter which determines the precise shape of the - distribution. - + + + + \setrandom n 1 10 or \setrandom n 1 10 uniform + is equivalent to \set n random(1, 10) and uses a uniform + distribution. + + - - For a Gaussian distribution, the interval is mapped onto a standard - normal distribution (the classical bell-shaped Gaussian curve) truncated - at -parameter on the left and +parameter - on the right. - Values in the middle of the interval are more likely to be drawn. 
- To be precise, if PHI(x) is the cumulative distribution - function of the standard normal distribution, with mean mu - defined as (max + min) / 2.0, with - - f(x) = PHI(2.0 * parameter * (x - mu) / (max - min + 1)) / -(2.0 * PHI(parameter) - 1.0) - - then value i between min and - max inclusive is drawn with probability: - f(i + 0.5) - f(i - 0.5). - Intuitively, the larger parameter, the more - frequently values close to the middle of the interval are drawn, and the - less frequently values close to the min and - max bounds. About 67% of values are drawn from the - middle 1.0 / parameter, that is a relative - 0.5 / parameter around the mean, and 95% in the middle - 2.0 / parameter, that is a relative - 1.0 / parameter around the mean; for instance, if - parameter is 4.0, 67% of values are drawn from the - middle quarter (1.0 / 4.0) of the interval (i.e. from - 3.0 / 8.0 to 5.0 / 8.0) and 95% from - the middle half (2.0 / 4.0) of the interval (second and - third quartiles). The minimum parameter is 2.0 for - performance of the Box-Muller transform. - + + +\setrandom n 1 10 exponential 3.0 is equivalent to +\set n random_exponential(1, 10, 3.0) and uses an +exponential distribution. + + - - For an exponential distribution, parameter - controls the distribution by truncating a quickly-decreasing - exponential distribution at parameter, and then - projecting onto integers between the bounds. - To be precise, with - -f(x) = exp(-parameter * (x - min) / (max - min + 1)) / (1.0 - exp(-parameter)) - - Then value i between min and - max inclusive is drawn with probability: - f(x) - f(x + 1). - Intuitively, the larger parameter, the more - frequently values close to min are accessed, and the - less frequently values close to max are accessed. - The closer to 0 parameter, the flatter (more uniform) - the access distribution. - A crude approximation of the distribution is that the most frequent 1% - values in the range, close to min, are drawn - parameter% of the time. 
- parameter value must be strictly positive. + + +\setrandom n 1 10 gaussian 2.0 is equivalent to +\set n random_gaussian(1, 10, 2.0), and uses a gaussian +distribution. + + + + + See the documentation of these functions below for further information + about the precise shape of these distributions, depending on the value + of the parameter. @@ -965,18 +938,184 @@ f(x) = exp(-parameter * (x - min) / (max - min + 1)) / (1.0 - exp(-parameter)) + + +PgBench Functions + + + + Function + Return Type + Description + Example + Result + + + + + abs(a) + same as a + integer or double

### Re: [HACKERS] extend pgbench expressions with functions

On Mon, Feb 1, 2016 at 10:34 PM, Robert Haaswrote: > On Sat, Jan 30, 2016 at 7:36 AM, Michael Paquier > wrote: >> On Fri, Jan 29, 2016 at 11:21 PM, Fabien COELHO wrote: >>> +/* overflow check (needed for INT64_MIN) */ >>> +if (lval != 0 && (*retval < 0 == lval < 0)) >>> >>> Why not use "if (lval == INT64_MIN)" instead of this complicated condition? >>> If it is really needed for some reason, I think that a comment could help. >> >> Checking for PG_INT64_MIN only would be fine as well, so let's do so. >> I thought honestly that we had better check if the result and the left >> argument are not of the same sign, but well. > > Committed and back-patched to 9.5. Doesn't apply further back. OK, here are patches for 9.1~9.4. The main differences are that in 9.3/9.4 int64 is used for the division operations, and in 9.2/9.1 that's int32. In the latter case pgbench blows up the same way with that: \set i -2147483648 \set i :i / -1 select :i; In those patches INT32_MIN/INT64_MIN need to be explicitly set as well at the top of pgbench.c. I thing that's fine. -- Michael diff --git a/contrib/pgbench/pgbench.c b/contrib/pgbench/pgbench.c index 2111f16..84b6303 100644 --- a/contrib/pgbench/pgbench.c +++ b/contrib/pgbench/pgbench.c @@ -56,6 +56,10 @@ #ifndef INT64_MAX #define INT64_MAX INT64CONST(0x7FFF) #endif +#ifndef INT32_MIN +#define INT32_MIN (-0x7FFF-1) +#endif + /* * Multi-platform pthread implementations @@ -1152,13 +1156,37 @@ top: snprintf(res, sizeof(res), "%d", ope1 * ope2); else if (strcmp(argv[3], "/") == 0) { + int operes; + if (ope2 == 0) { fprintf(stderr, "%s: division by zero\n", argv[0]); st->ecnt++; return true; } - snprintf(res, sizeof(res), "%d", ope1 / ope2); + /* + * INT32_MIN / -1 is problematic, since the result can't + * be represented on a two's-complement machine. Some + * machines produce INT32_MIN, some produce zero, some + * throw an exception. We can dodge the problem by + * recognizing that division by -1 is the same as + * negation. 
+ */ + if (ope2 == -1) + { + operes = -ope1; + + /* overflow check (needed for INT32_MIN) */ + if (ope1 == INT32_MIN) + { + fprintf(stderr, "integer out of range\n"); + st->ecnt++; + return true; + } + } + else + operes = ope1 / ope2; + snprintf(res, sizeof(res), "%d", operes); } else { diff --git a/contrib/pgbench/pgbench.c b/contrib/pgbench/pgbench.c index 4e22695..062a32f 100644 --- a/contrib/pgbench/pgbench.c +++ b/contrib/pgbench/pgbench.c @@ -57,6 +57,10 @@ #ifndef INT64_MAX #define INT64_MAX INT64CONST(0x7FFF) #endif +#ifndef INT64_MIN +#define INT64_MIN (-INT64CONST(0x7FFF) - 1) +#endif + /* * Multi-platform pthread implementations @@ -1331,13 +1335,37 @@ top: snprintf(res, sizeof(res), INT64_FORMAT, ope1 * ope2); else if (strcmp(argv[3], "/") == 0) { + int64 operes; + if (ope2 == 0) { fprintf(stderr, "%s: division by zero\n", argv[0]); st->ecnt++; return true; } - snprintf(res, sizeof(res), INT64_FORMAT, ope1 / ope2); + /* + * INT64_MIN / -1 is problematic, since the result can't + * be represented on a two's-complement machine. Some + * machines produce INT64_MIN, some produce zero, some + * throw an exception. We can dodge the problem by + * recognizing that division by -1 is the same as + * negation. + */ + if (ope2 == -1) + { + operes = -ope1; + + /* overflow check (needed for INT64_MIN) */ + if (ope1 == INT64_MIN) + { + fprintf(stderr, "bigint out of range\n"); + st->ecnt++; + return true; + } + } + else + operes = ope1 / ope2; + snprintf(res, sizeof(res), INT64_FORMAT, operes); } else { -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
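The guards in the back-branch patches above can be exercised in isolation. On two's-complement hardware, INT64_MIN / -1 (and INT32_MIN / -1 on the older branches) cannot be represented, and on x86 the divide instruction traps with SIGFPE rather than wrapping. A minimal self-contained sketch of the same approach — the helper name `safe_div64` is illustrative, not from the patch:

```c
#include <assert.h>
#include <stdint.h>

/*
 * Division guard sketched after the patch above: INT64_MIN / -1 cannot be
 * represented in int64_t (the true quotient is INT64_MAX + 1), and many
 * machines trap instead of wrapping.  Division by -1 is therefore rewritten
 * as negation plus an explicit overflow check.
 * Returns 1 on success, 0 on division by zero or overflow.
 */
static int
safe_div64(int64_t lval, int64_t rval, int64_t *result)
{
	if (rval == 0)
		return 0;				/* division by zero */
	if (rval == -1)
	{
		if (lval == INT64_MIN)
			return 0;			/* "bigint out of range" */
		*result = -lval;		/* division by -1 is negation */
		return 1;
	}
	*result = lval / rval;
	return 1;
}
```

The same shape works for the 9.1/9.2 int32 case with INT32_MIN substituted for INT64_MIN.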

### Re: [HACKERS] extend pgbench expressions with functions

Hello Michaël,

>> - remove the short macros (although IMO it is a code degradation)
>
> FWIW, I like this suggestion from Robert.

I'm not especially fond of macros; my main reservation is that, because of the length of the function names, this necessarily creates lines longer than 80 columns or awkward and repeated newlines within expressions, which are both ugly.

> +make_op(const char *operator, PgBenchExpr *lexpr, PgBenchExpr *rexpr)
> +{
> + return make_func(find_func(operator),
> +/* beware that the list is reversed in make_func */
> +make_elist(rexpr, make_elist(lexpr, NULL)));
> +}
>
> I think that this should use as argument the function ID, aka PGBENCH_ADD or similar instead of find_func() with an operator character. This saves a couple of instructions.

Not that simple: the number is the index in the array of functions, not the enum value; they are currently off by one, and there may be the same function with different names for some reason some day, so I do not think it would be a good idea to enforce the order so that they are identical. Also, this minor search is only executed when parsing the script.

> + * - nargs: number of arguments (-1 is a special value for min & max)
>
> My fault perhaps, it may be better to mention here that -1 means that min and max need at least one argument, and that the number of arguments is not fixed.

Ok for a better comment.

> For mod() there is no need to have an error, returning 0 is fine. You can actually do it unconditionally when rval == -1.

Ok.

> + setDoubleValue(retval, d < 0.0? -d: d);
>
> Nitpick: lack of spaces between the question mark.

Ok.

> NONE is used nowhere, but I think that you could use it for an assertion check here: [...] Just replace the "none" message by Assert(type != PGBT_NONE) for example.

I've added a use of the macro.

> Another remaining item: should support for \setrandom be dropped? As the patch is presented this remains intact.
As you know my opinion is "yes", but I have not receive a clear go about that from a committer and I'm not motivated by removing and then re adding code to the patch. If nothing clear is said, I'll do a patch just for that one functions are added and submit it to the next CF. -- Fabien.diff --git a/doc/src/sgml/ref/pgbench.sgml b/doc/src/sgml/ref/pgbench.sgml index 42d0667..d42208a 100644 --- a/doc/src/sgml/ref/pgbench.sgml +++ b/doc/src/sgml/ref/pgbench.sgml @@ -796,17 +796,21 @@ pgbench options dbname Sets variable varname to an integer value calculated from expression. The expression may contain integer constants such as 5432, - references to variables :variablename, + double constants such as 3.14159, + references to integer variables :variablename, and expressions composed of unary (-) or binary operators - (+, -, *, /, %) - with their usual associativity, and parentheses. + (+, -, *, /, + %) with their usual associativity, function calls and + parentheses. + shows the available + functions. Examples: \set ntellers 10 * :scale -\set aid (1021 * :aid) % (10 * :scale) + 1 +\set aid (1021 * random(1, 10 * :scale)) % (10 * :scale) + 1 @@ -826,66 +830,35 @@ pgbench options dbname - By default, or when uniform is specified, all values in the - range are drawn with equal probability. Specifying gaussian - or exponential options modifies this behavior; each - requires a mandatory parameter which determines the precise shape of the - distribution. - + + + + \setrandom n 1 10 or \setrandom n 1 10 uniform + is equivalent to \set n random(1, 10) and uses a uniform + distribution. + + - - For a Gaussian distribution, the interval is mapped onto a standard - normal distribution (the classical bell-shaped Gaussian curve) truncated - at -parameter on the left and +parameter - on the right. - Values in the middle of the interval are more likely to be drawn. 
- To be precise, if PHI(x) is the cumulative distribution - function of the standard normal distribution, with mean mu - defined as (max + min) / 2.0, with - - f(x) = PHI(2.0 * parameter * (x - mu) / (max - min + 1)) / -(2.0 * PHI(parameter) - 1.0) - - then value i between min and - max inclusive is drawn with probability: - f(i + 0.5) - f(i - 0.5). - Intuitively, the larger parameter, the more - frequently values close to the middle of the interval are drawn, and the - less frequently values close to the min and - max bounds. About 67% of values are drawn from the - middle 1.0 / parameter, that is a relative - 0.5 / parameter around the mean, and 95% in the middle - 2.0 / parameter, that is a relative - 1.0 / parameter

### Re: [HACKERS] extend pgbench expressions with functions

On Sat, Jan 30, 2016 at 7:36 AM, Michael Paquier wrote:
> On Fri, Jan 29, 2016 at 11:21 PM, Fabien COELHO wrote:
>> +/* overflow check (needed for INT64_MIN) */
>> +if (lval != 0 && (*retval < 0 == lval < 0))
>>
>> Why not use "if (lval == INT64_MIN)" instead of this complicated condition? If it is really needed for some reason, I think that a comment could help.
>
> Checking for PG_INT64_MIN only would be fine as well, so let's do so. I thought honestly that we had better check if the result and the left argument are not of the same sign, but well.

Committed and back-patched to 9.5. Doesn't apply further back.

-- Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

### Re: [HACKERS] extend pgbench expressions with functions

On Tue, Feb 2, 2016 at 1:35 PM, Michael Paquier wrote:
> And now there are patches. Well, nobody has complained about that until now except me... So we could live without patching back-branches, but it don't think it hurts much to fix those holes.

Meh, s/it don't/I don't/

-- Michael

### Re: [HACKERS] extend pgbench expressions with functions

On Mon, Feb 1, 2016 at 9:46 PM, Michael Paquier wrote:
> On Mon, Feb 1, 2016 at 10:34 PM, Robert Haas wrote:
>> On Sat, Jan 30, 2016 at 7:36 AM, Michael Paquier wrote:
>>> On Fri, Jan 29, 2016 at 11:21 PM, Fabien COELHO wrote:
>>>> +/* overflow check (needed for INT64_MIN) */
>>>> +if (lval != 0 && (*retval < 0 == lval < 0))
>>>>
>>>> Why not use "if (lval == INT64_MIN)" instead of this complicated condition? If it is really needed for some reason, I think that a comment could help.
>>>
>>> Checking for PG_INT64_MIN only would be fine as well, so let's do so. I thought honestly that we had better check if the result and the left argument are not of the same sign, but well.
>>
>> Committed and back-patched to 9.5. Doesn't apply further back.
>
> OK, here are patches for 9.1~9.4. The main differences are that in 9.3/9.4 int64 is used for the division operations, and in 9.2/9.1 that's int32. In the latter case pgbench blows up the same way with that:
> \set i -2147483648
> \set i :i / -1
> select :i;
> In those patches INT32_MIN/INT64_MIN need to be explicitly set as well at the top of pgbench.c. I thing that's fine.

Oh, gosh, I should have said more clearly that I didn't really see a need to fix this all the way back. But I guess we could.

-- Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

### Re: [HACKERS] extend pgbench expressions with functions

On Tue, Feb 2, 2016 at 1:24 PM, Robert Haas wrote:
> On Mon, Feb 1, 2016 at 9:46 PM, Michael Paquier wrote:
>> On Mon, Feb 1, 2016 at 10:34 PM, Robert Haas wrote:
>>> On Sat, Jan 30, 2016 at 7:36 AM, Michael Paquier wrote:
>>>> On Fri, Jan 29, 2016 at 11:21 PM, Fabien COELHO wrote:
>>>>> +/* overflow check (needed for INT64_MIN) */
>>>>> +if (lval != 0 && (*retval < 0 == lval < 0))
>>>>>
>>>>> Why not use "if (lval == INT64_MIN)" instead of this complicated condition? If it is really needed for some reason, I think that a comment could help.
>>>>
>>>> Checking for PG_INT64_MIN only would be fine as well, so let's do so. I thought honestly that we had better check if the result and the left argument are not of the same sign, but well.
>>>
>>> Committed and back-patched to 9.5. Doesn't apply further back.
>>
>> OK, here are patches for 9.1~9.4. The main differences are that in 9.3/9.4 int64 is used for the division operations, and in 9.2/9.1 that's int32. In the latter case pgbench blows up the same way with that:
>> \set i -2147483648
>> \set i :i / -1
>> select :i;
>> In those patches INT32_MIN/INT64_MIN need to be explicitly set as well at the top of pgbench.c. I thing that's fine.
>
> Oh, gosh, I should have said more clearly that I didn't really see a need to fix this all the way back. But I guess we could.

And now there are patches. Well, nobody has complained about that until now except me... So we could live without patching back-branches, but it don't think it hurts much to fix those holes.

-- Michael

### Re: [HACKERS] extend pgbench expressions with functions

v22 compared to previous: - remove the short macros (although IMO it is a code degradation) - try not to remove/add blanks lines - let some assert "as is" - still exit on float to int overflow, see arguments in other mails - check for INT64_MIN / -1 (although I think it is useless) -- Fabien.diff --git a/doc/src/sgml/ref/pgbench.sgml b/doc/src/sgml/ref/pgbench.sgml index 42d0667..d42208a 100644 --- a/doc/src/sgml/ref/pgbench.sgml +++ b/doc/src/sgml/ref/pgbench.sgml @@ -796,17 +796,21 @@ pgbench options dbname Sets variable varname to an integer value calculated from expression. The expression may contain integer constants such as 5432, - references to variables :variablename, + double constants such as 3.14159, + references to integer variables :variablename, and expressions composed of unary (-) or binary operators - (+, -, *, /, %) - with their usual associativity, and parentheses. + (+, -, *, /, + %) with their usual associativity, function calls and + parentheses. + shows the available + functions. Examples: \set ntellers 10 * :scale -\set aid (1021 * :aid) % (10 * :scale) + 1 +\set aid (1021 * random(1, 10 * :scale)) % (10 * :scale) + 1 @@ -826,66 +830,35 @@ pgbench options dbname - By default, or when uniform is specified, all values in the - range are drawn with equal probability. Specifying gaussian - or exponential options modifies this behavior; each - requires a mandatory parameter which determines the precise shape of the - distribution. - + + + + \setrandom n 1 10 or \setrandom n 1 10 uniform + is equivalent to \set n random(1, 10) and uses a uniform + distribution. + + - - For a Gaussian distribution, the interval is mapped onto a standard - normal distribution (the classical bell-shaped Gaussian curve) truncated - at -parameter on the left and +parameter - on the right. - Values in the middle of the interval are more likely to be drawn. 
- To be precise, if PHI(x) is the cumulative distribution - function of the standard normal distribution, with mean mu - defined as (max + min) / 2.0, with - - f(x) = PHI(2.0 * parameter * (x - mu) / (max - min + 1)) / -(2.0 * PHI(parameter) - 1.0) - - then value i between min and - max inclusive is drawn with probability: - f(i + 0.5) - f(i - 0.5). - Intuitively, the larger parameter, the more - frequently values close to the middle of the interval are drawn, and the - less frequently values close to the min and - max bounds. About 67% of values are drawn from the - middle 1.0 / parameter, that is a relative - 0.5 / parameter around the mean, and 95% in the middle - 2.0 / parameter, that is a relative - 1.0 / parameter around the mean; for instance, if - parameter is 4.0, 67% of values are drawn from the - middle quarter (1.0 / 4.0) of the interval (i.e. from - 3.0 / 8.0 to 5.0 / 8.0) and 95% from - the middle half (2.0 / 4.0) of the interval (second and - third quartiles). The minimum parameter is 2.0 for - performance of the Box-Muller transform. - + + +\setrandom n 1 10 exponential 3.0 is equivalent to +\set n random_exponential(1, 10, 3.0) and uses an +exponential distribution. + + - - For an exponential distribution, parameter - controls the distribution by truncating a quickly-decreasing - exponential distribution at parameter, and then - projecting onto integers between the bounds. - To be precise, with - -f(x) = exp(-parameter * (x - min) / (max - min + 1)) / (1.0 - exp(-parameter)) - - Then value i between min and - max inclusive is drawn with probability: - f(x) - f(x + 1). - Intuitively, the larger parameter, the more - frequently values close to min are accessed, and the - less frequently values close to max are accessed. - The closer to 0 parameter, the flatter (more uniform) - the access distribution. - A crude approximation of the distribution is that the most frequent 1% - values in the range, close to min, are drawn - parameter% of the time. 
- parameter value must be strictly positive. + + +\setrandom n 1 10 gaussian 2.0 is equivalent to +\set n random_gaussian(1, 10, 2.0), and uses a gaussian +distribution. + + + + + See the documentation of these functions below for further information + about the precise shape of these distributions, depending on the value + of the parameter. @@ -965,18 +938,184 @@ f(x) = exp(-parameter * (x - min) / (max - min + 1)) / (1.0 - exp(-parameter)) + + +PgBench Functions +

### Re: [HACKERS] extend pgbench expressions with functions

On Fri, Jan 29, 2016 at 5:16 AM, Fabien COELHO < fabien.coe...@mines-paristech.fr> wrote: > v22 compared to previous: Thanks for the new patch! > - remove the short macros (although IMO it is a code degradation) FWIW, I like this suggestion from Robert. > - check for INT64_MIN / -1 (although I think it is useless) > - try not to remove/add blanks lines > - let some assert "as is" > - still exit on float to int overflow, see arguments in other mails +make_op(const char *operator, PgBenchExpr *lexpr, PgBenchExpr *rexpr) +{ + return make_func(find_func(operator), +/* beware that the list is reversed in make_func */ +make_elist(rexpr, make_elist(lexpr, NULL))); +} I think that this should use as argument the function ID, aka PGBENCH_ADD or similar instead of find_func() with an operator character. This saves a couple of instructions. + * - nargs: number of arguments (-1 is a special value for min & max) My fault perhaps, it may be better to mention here that -1 means that min and max need at least one argument, and that the number of arguments is not fixed. + case PGBENCH_MOD: + if (coerceToInt() == 0) { fprintf(stderr, "division by zero\n"); return false; } - *retval = lval % rval; + if (coerceToInt() == INT64_MIN && coerceToInt() == -1) + { + fprintf(stderr, "cannot divide INT64_MIN by -1\n"); + return false; + } For mod() there is no need to have an error, returning 0 is fine. You can actually do it unconditionally when rval == -1. + setDoubleValue(retval, d < 0.0? -d: d); Nitpick: lack of spaces between the question mark. 
+typedef enum +{ + PGBT_NONE, + PGBT_INT, + PGBT_DOUBLE +} PgBenchValueType; NONE is used nowhere, but I think that you could use it for an assertion check here: + if (retval->type == PGBT_INT) + fprintf(stderr, "int " INT64_FORMAT "\n", retval->u.ival); + else if (retval->type == PGBT_DOUBLE) + fprintf(stderr, "double %f\n", retval->u.dval); + else + fprintf(stderr, "none\n"); Just replace the "none" message by Assert(type != PGBT_NONE) for example. Another remaining item: should support for \setrandom be dropped? As the patch is presented this remains intact. -- Michael
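The PGBT_NONE discussion can be made concrete with a small sketch of the tagged-union value type, following the names quoted above. This is a simplified stand-in, not the patch's actual code: the real coerceToInt has a different signature, and setIntValue/setDoubleValue are assumed helpers here.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Tagged union for expression values, after the enum quoted above.
 * PGBT_NONE marks an uninitialized value and is only ever checked in
 * assertions; evaluation never produces it.
 */
typedef enum
{
	PGBT_NONE,
	PGBT_INT,
	PGBT_DOUBLE
} PgBenchValueType;

typedef struct
{
	PgBenchValueType type;
	union
	{
		int64_t		ival;
		double		dval;
	} u;
} PgBenchValue;

static void
setIntValue(PgBenchValue *v, int64_t i)
{
	v->type = PGBT_INT;
	v->u.ival = i;
}

static void
setDoubleValue(PgBenchValue *v, double d)
{
	v->type = PGBT_DOUBLE;
	v->u.dval = d;
}

/* simplified stand-in: coerce a value to int64, truncating doubles */
static int64_t
coerceToInt(const PgBenchValue *v)
{
	assert(v->type != PGBT_NONE);	/* the suggested assertion check */
	return (v->type == PGBT_INT) ? v->u.ival : (int64_t) v->u.dval;
}
```

With this shape, the "none" branch of the debug printout becomes unreachable by construction, which is exactly what replacing the message with an assertion expresses.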

### Re: [HACKERS] extend pgbench expressions with functions

On Fri, Jan 29, 2016 at 11:21 PM, Fabien COELHOwrote: > +/* overflow check (needed for INT64_MIN) */ > +if (lval != 0 && (*retval < 0 == lval < 0)) > > Why not use "if (lval == INT64_MIN)" instead of this complicated condition? > If it is really needed for some reason, I think that a comment could help. Checking for PG_INT64_MIN only would be fine as well, so let's do so. I thought honestly that we had better check if the result and the left argument are not of the same sign, but well. -- Michael diff --git a/src/bin/pgbench/pgbench.c b/src/bin/pgbench/pgbench.c index d5f242c..6a17990 100644 --- a/src/bin/pgbench/pgbench.c +++ b/src/bin/pgbench/pgbench.c @@ -967,7 +967,28 @@ evaluateExpr(CState *st, PgBenchExpr *expr, int64 *retval) fprintf(stderr, "division by zero\n"); return false; } - *retval = lval / rval; + /* + * INT64_MIN / -1 is problematic, since the result + * can't be represented on a two's-complement + * machine. Some machines produce INT64_MIN, some + * produce zero, some throw an exception. We can + * dodge the problem by recognizing that division + * by -1 is the same as negation. + */ + if (rval == -1) + { + *retval = -lval; + + /* overflow check (needed for INT64_MIN) */ + if (lval == PG_INT64_MIN) + { +fprintf(stderr, "bigint out of range\n"); +return false; + } + } + else + *retval = lval / rval; + return true; case '%': @@ -976,7 +997,15 @@ evaluateExpr(CState *st, PgBenchExpr *expr, int64 *retval) fprintf(stderr, "division by zero\n"); return false; } - *retval = lval % rval; + /* + * Some machines throw a floating-point exception + * for INT64_MIN % -1, the correct answer being + * zero in any case. + */ + if (rval == -1) + *retval = 0; + else + *retval = lval % rval; return true; } -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
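The modulo branch of the committed patch above dodges the analogous trap: many machines raise a floating-point exception for INT64_MIN % -1 even though the mathematically correct remainder is zero, so no error is needed — only a special case. A self-contained sketch (the helper name `safe_mod64` is illustrative, not from the patch):

```c
#include <assert.h>
#include <stdint.h>

/*
 * Modulo guard sketched after the patch above.  Unlike division, a -1
 * divisor needs no error path: the remainder of any division by -1 is 0,
 * so -1 is simply special-cased to avoid the hardware trap on
 * INT64_MIN % -1.  Returns 1 on success, 0 on modulo by zero.
 */
static int
safe_mod64(int64_t lval, int64_t rval, int64_t *result)
{
	if (rval == 0)
		return 0;				/* modulo by zero */
	if (rval == -1)
		*result = 0;			/* correct answer in every case */
	else
		*result = lval % rval;	/* C truncates toward zero */
	return 1;
}
```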

### Re: [HACKERS] extend pgbench expressions with functions

>> Also, a comment is needed to explain why such a bizarre condition is used/needed for just the INT64_MIN case.
>
> The last patch I sent has this bit:
>
> + /*
> +  * Some machines throw a floating-point exception
> +  * for INT64_MIN % -1, the correct answer being
> +  * zero in any case.
> +  */
>
> How would you reformulate that à-la-Fabien?

This one about modulo is fine. I was referring to this other one, in the division case:

+/* overflow check (needed for INT64_MIN) */
+if (lval != 0 && (*retval < 0 == lval < 0))

Why not use "if (lval == INT64_MIN)" instead of this complicated condition? If it is really needed for some reason, I think that a comment could help.

-- Fabien.

### Re: [HACKERS] extend pgbench expressions with functions

On Fri, Jan 29, 2016 at 6:05 PM, Fabien COELHO wrote:
> (instead of < in my previous suggestion, if some processors return 0 on -INT64_MIN). Also, a comment is needed to explain why such a bizarre condition is used/needed for just the INT64_MIN case.

The last patch I sent has this bit:

+ /*
+  * Some machines throw a floating-point exception
+  * for INT64_MIN % -1, the correct answer being
+  * zero in any case.
+  */

How would you reformulate that à-la-Fabien?

-- Michael

### Re: [HACKERS] extend pgbench expressions with functions

Hi Michaël,

>> I think it is overkill, but do as you feel.
>
> Perhaps we could have Robert decide on this one first? That's a bug after all that had better be backpatched.

Fine with me.

>> [modulo...]
>
> Right, forgot this one, we just need to check if rval is -1 here, and return 0 as result. I am updating the fix as attached.

This looks to me like it works. I still feel that the condition should be simplified, probably with:

if (lval < 0 && *resval <= 0) ...

(instead of < in my previous suggestion, if some processors return 0 on -INT64_MIN). Also, a comment is needed to explain why such a bizarre condition is used/needed for just the INT64_MIN case.

-- Fabien.

### Re: [HACKERS] extend pgbench expressions with functions

So I'm arguing that exiting, with an error message, is better than handling user errors.

I'm not objecting to exiting with an error message, but I think letting ourselves be killed by a signal is no good.

Ok, I understand this point for this purpose.

-- Fabien.

### Re: [HACKERS] extend pgbench expressions with functions

Hello Michaël, v23 attached, which does not change the message but does the other fixes. +if (coerceToInt() == INT64_MIN && coerceToInt() == -1) +{ + fprintf(stderr, "cannot divide INT64_MIN by -1\n"); + return false; +} Bike-shedding: "bigint out of range" to match what is done in the backend? ISTM that it is clearer for the user to say that the issue is with the division? Otherwise the message does not help much. Well, not that it would be printed often... +/* + * Recursive evaluation of an expression in a pgbench script + * using the current state of variables. + * Returns whether the evaluation was ok, + * the value itself is returned through the retval pointer. + */ Could you reformat this comment? I can. fprintf(stderr, -"exponential parameter must be greater than zero (not \"%s\")\n", -argv[5]); + "exponential parameter must be greater than zero (not \"%s\")\n", + argv[5]); This is some unnecessary diff noise. Indeed. +setIntValue(retval, getGaussianRand(thread, arg1, arg2, dparam)); Some portions of the code are using tabs instead of spaces between function arguments. Indeed. I would as well suggest fixing first the (INT64_MAX / -1) crash on HEAD and back-branches with something like the patch attached, inspired from int8.c. I think it is overkill, but do as you feel. Note that it must also handle modulo, but the code you suggest cannot be used for that. #include int main(int argc, char* argv[]) { int64_t l = INT64_MIN; int64_t r = -1; int64_t d = l % r; return 0; } // => Floating point exception (core dumped) ISTM that the second condition can be simplified, as there is no possible issue if lval is positive? if (lval < 0 && *resval < 0) { ... } -- Fabien.diff --git a/doc/src/sgml/ref/pgbench.sgml b/doc/src/sgml/ref/pgbench.sgml index 42d0667..d42208a 100644 --- a/doc/src/sgml/ref/pgbench.sgml +++ b/doc/src/sgml/ref/pgbench.sgml @@ -796,17 +796,21 @@ pgbench options dbname Sets variable varname to an integer value calculated from expression. 
The expression may contain integer constants such as 5432, - references to variables :variablename, + double constants such as 3.14159, + references to integer variables :variablename, and expressions composed of unary (-) or binary operators - (+, -, *, /, %) - with their usual associativity, and parentheses. + (+, -, *, /, + %) with their usual associativity, function calls and + parentheses. + shows the available + functions. Examples: \set ntellers 10 * :scale -\set aid (1021 * :aid) % (10 * :scale) + 1 +\set aid (1021 * random(1, 10 * :scale)) % (10 * :scale) + 1 @@ -826,66 +830,35 @@ pgbench options dbname - By default, or when uniform is specified, all values in the - range are drawn with equal probability. Specifying gaussian - or exponential options modifies this behavior; each - requires a mandatory parameter which determines the precise shape of the - distribution. - + + + + \setrandom n 1 10 or \setrandom n 1 10 uniform + is equivalent to \set n random(1, 10) and uses a uniform + distribution. + + - - For a Gaussian distribution, the interval is mapped onto a standard - normal distribution (the classical bell-shaped Gaussian curve) truncated - at -parameter on the left and +parameter - on the right. - Values in the middle of the interval are more likely to be drawn. - To be precise, if PHI(x) is the cumulative distribution - function of the standard normal distribution, with mean mu - defined as (max + min) / 2.0, with - - f(x) = PHI(2.0 * parameter * (x - mu) / (max - min + 1)) / -(2.0 * PHI(parameter) - 1.0) - - then value i between min and - max inclusive is drawn with probability: - f(i + 0.5) - f(i - 0.5). - Intuitively, the larger parameter, the more - frequently values close to the middle of the interval are drawn, and the - less frequently values close to the min and - max bounds. 
About 67% of values are drawn from the - middle 1.0 / parameter, that is a relative - 0.5 / parameter around the mean, and 95% in the middle - 2.0 / parameter, that is a relative - 1.0 / parameter around the mean; for instance, if - parameter is 4.0, 67% of values are drawn from the - middle quarter (1.0 / 4.0) of the interval (i.e. from - 3.0 / 8.0 to 5.0 / 8.0) and 95% from - the middle half (2.0 / 4.0) of the interval (second and - third quartiles). The minimum parameter is 2.0 for - performance of the Box-Muller transform. -

### Re: [HACKERS] extend pgbench expressions with functions

On Fri, Jan 29, 2016 at 3:40 PM, Fabien COELHO wrote:
>> I would as well suggest fixing first the (INT64_MIN / -1) crash on HEAD
>> and back-branches with something like the patch attached, inspired from
>> int8.c.
>
> I think it is overkill, but do as you feel.

Perhaps we could have Robert decide on this one first? That's a bug after all that had better be backpatched.

> Note that it must also handle modulo, but the code you suggest cannot be
> used for that.
>
> #include <stdint.h>
> int main(int argc, char* argv[])
> {
>     int64_t l = INT64_MIN;
>     int64_t r = -1;
>     int64_t d = l % r;
>     return 0;
> }
> // => Floating point exception (core dumped)

Right, forgot this one, we just need to check if rval is -1 here, and return 0 as result. I am updating the fix as attached.

-- Michael

```diff
diff --git a/src/bin/pgbench/pgbench.c b/src/bin/pgbench/pgbench.c
index d5f242c..25b349d 100644
--- a/src/bin/pgbench/pgbench.c
+++ b/src/bin/pgbench/pgbench.c
@@ -967,7 +967,28 @@ evaluateExpr(CState *st, PgBenchExpr *expr, int64 *retval)
                     fprintf(stderr, "division by zero\n");
                     return false;
                 }
-                *retval = lval / rval;
+
+                /*
+                 * INT64_MIN / -1 is problematic, since the result
+                 * can't be represented on a two's-complement
+                 * machine.  Some machines produce INT64_MIN, some
+                 * produce zero, some throw an exception.  We can
+                 * dodge the problem by recognizing that division
+                 * by -1 is the same as negation.
+                 */
+                if (rval == -1)
+                {
+                    *retval = -lval;
+
+                    /* overflow check (needed for INT64_MIN) */
+                    if (lval != 0 && (*retval < 0 == lval < 0))
+                    {
+                        fprintf(stderr, "bigint out of range\n");
+                        return false;
+                    }
+                }
+                else
+                    *retval = lval / rval;
+
                 return true;
             case '%':
@@ -976,7 +997,15 @@ evaluateExpr(CState *st, PgBenchExpr *expr, int64 *retval)
                     fprintf(stderr, "division by zero\n");
                     return false;
                 }
-                *retval = lval % rval;
+
+                /*
+                 * Some machines throw a floating-point exception
+                 * for INT64_MIN % -1, the correct answer being
+                 * zero in any case.
+                 */
+                if (rval == -1)
+                    *retval = 0;
+                else
+                    *retval = lval % rval;
                 return true;
             }
```

### Re: [HACKERS] extend pgbench expressions with functions

v22 compared to previous: - remove the short macros (although IMO it is a code degradation) - try not to remove/add blanks lines - let some assert "as is" - still exit on float to int overflow, see arguments in other mails - check for INT64_MIN / -1 (although I think it is useless) -- Fabien.diff --git a/doc/src/sgml/ref/pgbench.sgml b/doc/src/sgml/ref/pgbench.sgml index 42d0667..d42208a 100644 --- a/doc/src/sgml/ref/pgbench.sgml +++ b/doc/src/sgml/ref/pgbench.sgml @@ -796,17 +796,21 @@ pgbench options dbname Sets variable varname to an integer value calculated from expression. The expression may contain integer constants such as 5432, - references to variables :variablename, + double constants such as 3.14159, + references to integer variables :variablename, and expressions composed of unary (-) or binary operators - (+, -, *, /, %) - with their usual associativity, and parentheses. + (+, -, *, /, + %) with their usual associativity, function calls and + parentheses. + shows the available + functions. Examples: \set ntellers 10 * :scale -\set aid (1021 * :aid) % (10 * :scale) + 1 +\set aid (1021 * random(1, 10 * :scale)) % (10 * :scale) + 1 @@ -826,66 +830,35 @@ pgbench options dbname - By default, or when uniform is specified, all values in the - range are drawn with equal probability. Specifying gaussian - or exponential options modifies this behavior; each - requires a mandatory parameter which determines the precise shape of the - distribution. - + + + + \setrandom n 1 10 or \setrandom n 1 10 uniform + is equivalent to \set n random(1, 10) and uses a uniform + distribution. + + - - For a Gaussian distribution, the interval is mapped onto a standard - normal distribution (the classical bell-shaped Gaussian curve) truncated - at -parameter on the left and +parameter - on the right. - Values in the middle of the interval are more likely to be drawn. 
- To be precise, if PHI(x) is the cumulative distribution - function of the standard normal distribution, with mean mu - defined as (max + min) / 2.0, with - - f(x) = PHI(2.0 * parameter * (x - mu) / (max - min + 1)) / -(2.0 * PHI(parameter) - 1.0) - - then value i between min and - max inclusive is drawn with probability: - f(i + 0.5) - f(i - 0.5). - Intuitively, the larger parameter, the more - frequently values close to the middle of the interval are drawn, and the - less frequently values close to the min and - max bounds. About 67% of values are drawn from the - middle 1.0 / parameter, that is a relative - 0.5 / parameter around the mean, and 95% in the middle - 2.0 / parameter, that is a relative - 1.0 / parameter around the mean; for instance, if - parameter is 4.0, 67% of values are drawn from the - middle quarter (1.0 / 4.0) of the interval (i.e. from - 3.0 / 8.0 to 5.0 / 8.0) and 95% from - the middle half (2.0 / 4.0) of the interval (second and - third quartiles). The minimum parameter is 2.0 for - performance of the Box-Muller transform. - + + +\setrandom n 1 10 exponential 3.0 is equivalent to +\set n random_exponential(1, 10, 3.0) and uses an +exponential distribution. + + - - For an exponential distribution, parameter - controls the distribution by truncating a quickly-decreasing - exponential distribution at parameter, and then - projecting onto integers between the bounds. - To be precise, with - -f(x) = exp(-parameter * (x - min) / (max - min + 1)) / (1.0 - exp(-parameter)) - - Then value i between min and - max inclusive is drawn with probability: - f(x) - f(x + 1). - Intuitively, the larger parameter, the more - frequently values close to min are accessed, and the - less frequently values close to max are accessed. - The closer to 0 parameter, the flatter (more uniform) - the access distribution. - A crude approximation of the distribution is that the most frequent 1% - values in the range, close to min, are drawn - parameter% of the time. 
- parameter value must be strictly positive. + + +\setrandom n 1 10 gaussian 2.0 is equivalent to +\set n random_gaussian(1, 10, 2.0), and uses a gaussian +distribution. + + + + + See the documentation of these functions below for further information + about the precise shape of these distributions, depending on the value + of the parameter. @@ -965,18 +938,184 @@ f(x) = exp(-parameter * (x - min) / (max - min + 1)) / (1.0 -

### Re: [HACKERS] extend pgbench expressions with functions

I do not think that it is really worth fixing, but I will not prevent anyone from fixing it.

I still think it does. Well, if there is consensus to address this one and optionally the other integer overflows even on back branches, I'll write a patch and let's call that a deal. This is not a problem from my side.

My point is just about the cost-benefit of fixing a low-probability issue that you can only encounter if you are looking for it, and not with any reasonable bench script. Now adding a test somewhere might just help close the subject and do more useful things, so that would also be a win.

```
/* these would raise an arithmetic error */
if (lval == INT64_MIN && rval == -1)
{
    fprintf(stderr, "cannot divide or modulo INT64_MIN by -1\n");
    return false;
}
```

This may be backpatched to old supported versions.

-- Fabien.

### Re: [HACKERS] extend pgbench expressions with functions

On Thu, Jan 28, 2016 at 1:36 AM, Fabien COELHO wrote:
> So I'm arguing that exiting, with an error message, is better than handling
> user errors.

I'm not objecting to exiting with an error message, but I think letting ourselves be killed by a signal is no good.

-- Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

### Re: [HACKERS] extend pgbench expressions with functions

Hello Robert,

> And this one generates a core dump:
> \set cid debug(-9223372036854775808 / -1)
> Floating point exception: 8 (core dumped)
>
> That does not seem acceptable to me. I don't want pgbench to die a horrible
> death if a floating point exception occurs any more than I want the server
> to do the same. I want the error to be properly caught and reported.

Please note that this issue/bug/feature/whatever already exists just for these two values (1/2**128 probability), it is not related to this patch about functions.

```
sh> pgbench --version
pgbench (PostgreSQL) 9.5.0

sh> cat div.sql
\set i -9223372036854775807
\set i :i - 1
\set i :i / -1

sh> pgbench -f div.sql
starting vacuum...end.
Floating point exception (core dumped)
```

I do not think that it is really worth fixing, but I will not prevent anyone from fixing it.

-- Fabien.

### Re: [HACKERS] extend pgbench expressions with functions

On Thu, Jan 28, 2016 at 7:07 AM, Fabien COELHO wrote:
> I do not think that it is really worth fixing, but I will not prevent
> anyone from fixing it.

I still think it does. Well, if there is consensus to address this one and optionally the other integer overflows even on back branches, I'll write a patch and let's call that a deal. This is not a problem from my side.

-- Michael

### Re: [HACKERS] extend pgbench expressions with functions

Hello Robert, Attached is a rebase after recent changes in pgbench code & doc. +/* use short names in the evaluator */ +#define INT(v) coerceToInt() +#define DOUBLE(v) coerceToDouble() +#define SET_INT(pv, ival) setIntValue(pv, ival) +#define SET_DOUBLE(pv, dval) setDoubleValue(pv, dval) I don't like this at all. It seems to me that this really obscures the code. The few extra characters are a small price to pay for not having to go look up the macro definition to understand what the code is doing. Hmmm. Postgres indentation rules for "switch" are peculiar to say the least and make it hard to write code that stay under 80 columns. The coerceToInt function name looks pretty long (I would rather have toInt/toDbl/setInt/setDbl) but I was "told" to use that, so I'm trying to find a tradeoff with a macro. Obviously I can substitude and have rather long lines that I personally find much uglier. The third hunk in pgbench.c unnecessary deletes a blank line. Yep, that is possible. /* * inner expresion in (cut, 1] (if parameter > 0), rand in [0, 1) +* Assert((1.0 - cut) != 0.0); */ - Assert((1.0 - cut) != 0.0); rand = -log(cut + (1.0 - cut) * uniform) / parameter; + Moving the Assert() into the comment seems like a bad plan. If the Assert is true, it shouldn't be commented out. If it's not, it shouldn't be there at all. I put this assertion when I initially wrote this code, but I think that it is proven so I moved it in comment just as a reminder for someone who might touch anything that this must hold. Commit e41beea0ddb74ef975f08b917a354ec33cb60830, which you wrote, went to some trouble to display good context for error messages. What you have here seems like a huge step backwards: + fprintf(stderr, "double to int overflow for %f\n", dval); + exit(1); So, we're just going to give up on all of that error context reporting that we added back then? That would be sad. Well, I'm a lazy programmer, so I'm trying to measure the benefit. 
IMO there is no benefit in managing this case better, especially as the various solutions I thought of were either ugly and/or had a significant impact on the code. Note that in the best case the error would be detected and reported and the client stopped, while the other clients go on... But then, if you started a bench and some clients die while running, your results are probably meaningless, so my opinion is that you are better off with an exit than with some message that you may miss and performance results computed with fewer clients than you asked for. Pgbench is a bench tool, not a production tool.

-- Fabien.

### Re: [HACKERS] extend pgbench expressions with functions

Hello Robert,

Pgbench is a bench tool, not a production tool.

I don't really see how that's relevant.

My point is that I do not see any added value for pgbench to keep on executing a performance bench if some clients die due to script errors: it is more useful to stop the whole bench and report the issue, so the user can fix their script, than to keep going with the remaining clients, from a benchmarking perspective. So I'm arguing that exiting, with an error message, is better than handling user errors.

When I run a program and it dies after causing the operating system to send it a fatal signal, I say to myself "self, that program has a bug". Other people may have different reactions, but that's mine.

I was talking about an exit call generated on a float-to-int conversion error when there is an error in the user script. The bug is in the user script and this is clearly reported by pgbench. However, your argument may be relevant for avoiding a fatal signal such as the one generated by INT64_MIN / -1, because on that one the error message is terse, so how to fix the issue is not clear to the user.

-- Fabien.

### Re: [HACKERS] extend pgbench expressions with functions

On Wed, Jan 27, 2016 at 5:43 PM, Fabien COELHO wrote:
> Pgbench is a bench tool, not a production tool.

I don't really see how that's relevant. When I run a program and it dies after causing the operating system to send it a fatal signal, I say to myself "self, that program has a bug". Other people may have different reactions, but that's mine.

-- Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

### Re: [HACKERS] extend pgbench expressions with functions

On Sat, Jan 16, 2016 at 1:10 PM, Fabien COELHO wrote:
> All these results are fine from my point of view.
>
>> And this one generates a core dump:
>> \set cid debug(-9223372036854775808 / -1)
>> Floating point exception: 8 (core dumped)

That does not seem acceptable to me. I don't want pgbench to die a horrible death if a floating point exception occurs any more than I want the server to do the same. I want the error to be properly caught and reported.

-- Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

### Re: [HACKERS] extend pgbench expressions with functions

On Wed, Jan 27, 2016 at 3:39 AM, Fabien COELHOwrote: > Attached is a rebase after recent changes in pgbench code & doc. +/* use short names in the evaluator */ +#define INT(v) coerceToInt() +#define DOUBLE(v) coerceToDouble() +#define SET_INT(pv, ival) setIntValue(pv, ival) +#define SET_DOUBLE(pv, dval) setDoubleValue(pv, dval) I don't like this at all. It seems to me that this really obscures the code. The few extra characters are a small price to pay for not having to go look up the macro definition to understand what the code is doing. The third hunk in pgbench.c unnecessary deletes a blank line. /* * inner expresion in (cut, 1] (if parameter > 0), rand in [0, 1) +* Assert((1.0 - cut) != 0.0); */ - Assert((1.0 - cut) != 0.0); rand = -log(cut + (1.0 - cut) * uniform) / parameter; + Moving the Assert() into the comment seems like a bad plan. If the Assert is true, it shouldn't be commented out. If it's not, it shouldn't be there at all. Commit e41beea0ddb74ef975f08b917a354ec33cb60830, which you wrote, went to some trouble to display good context for error messages. What you have here seems like a huge step backwards: + fprintf(stderr, "double to int overflow for %f\n", dval); + exit(1); So, we're just going to give up on all of that error context reporting that we added back then? That would be sad. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers

### Re: [HACKERS] extend pgbench expressions with functions

OK, so I had an extra look at this patch and I am marking it as ready for committer. Ok. Attached is a rebase after recent changes in pgbench code & doc. -- Fabien.diff --git a/doc/src/sgml/ref/pgbench.sgml b/doc/src/sgml/ref/pgbench.sgml index 42d0667..d42208a 100644 --- a/doc/src/sgml/ref/pgbench.sgml +++ b/doc/src/sgml/ref/pgbench.sgml @@ -796,17 +796,21 @@ pgbench options dbname Sets variable varname to an integer value calculated from expression. The expression may contain integer constants such as 5432, - references to variables :variablename, + double constants such as 3.14159, + references to integer variables :variablename, and expressions composed of unary (-) or binary operators - (+, -, *, /, %) - with their usual associativity, and parentheses. + (+, -, *, /, + %) with their usual associativity, function calls and + parentheses. + shows the available + functions. Examples: \set ntellers 10 * :scale -\set aid (1021 * :aid) % (10 * :scale) + 1 +\set aid (1021 * random(1, 10 * :scale)) % (10 * :scale) + 1 @@ -826,66 +830,35 @@ pgbench options dbname - By default, or when uniform is specified, all values in the - range are drawn with equal probability. Specifying gaussian - or exponential options modifies this behavior; each - requires a mandatory parameter which determines the precise shape of the - distribution. - + + + + \setrandom n 1 10 or \setrandom n 1 10 uniform + is equivalent to \set n random(1, 10) and uses a uniform + distribution. + + - - For a Gaussian distribution, the interval is mapped onto a standard - normal distribution (the classical bell-shaped Gaussian curve) truncated - at -parameter on the left and +parameter - on the right. - Values in the middle of the interval are more likely to be drawn. 
- To be precise, if PHI(x) is the cumulative distribution - function of the standard normal distribution, with mean mu - defined as (max + min) / 2.0, with - - f(x) = PHI(2.0 * parameter * (x - mu) / (max - min + 1)) / -(2.0 * PHI(parameter) - 1.0) - - then value i between min and - max inclusive is drawn with probability: - f(i + 0.5) - f(i - 0.5). - Intuitively, the larger parameter, the more - frequently values close to the middle of the interval are drawn, and the - less frequently values close to the min and - max bounds. About 67% of values are drawn from the - middle 1.0 / parameter, that is a relative - 0.5 / parameter around the mean, and 95% in the middle - 2.0 / parameter, that is a relative - 1.0 / parameter around the mean; for instance, if - parameter is 4.0, 67% of values are drawn from the - middle quarter (1.0 / 4.0) of the interval (i.e. from - 3.0 / 8.0 to 5.0 / 8.0) and 95% from - the middle half (2.0 / 4.0) of the interval (second and - third quartiles). The minimum parameter is 2.0 for - performance of the Box-Muller transform. - + + +\setrandom n 1 10 exponential 3.0 is equivalent to +\set n random_exponential(1, 10, 3.0) and uses an +exponential distribution. + + - - For an exponential distribution, parameter - controls the distribution by truncating a quickly-decreasing - exponential distribution at parameter, and then - projecting onto integers between the bounds. - To be precise, with - -f(x) = exp(-parameter * (x - min) / (max - min + 1)) / (1.0 - exp(-parameter)) - - Then value i between min and - max inclusive is drawn with probability: - f(x) - f(x + 1). - Intuitively, the larger parameter, the more - frequently values close to min are accessed, and the - less frequently values close to max are accessed. - The closer to 0 parameter, the flatter (more uniform) - the access distribution. - A crude approximation of the distribution is that the most frequent 1% - values in the range, close to min, are drawn - parameter% of the time. 
- parameter value must be strictly positive. + + +\setrandom n 1 10 gaussian 2.0 is equivalent to +\set n random_gaussian(1, 10, 2.0), and uses a gaussian +distribution. + + + + + See the documentation of these functions below for further information + about the precise shape of these distributions, depending on the value + of the parameter. @@ -965,18 +938,184 @@ f(x) = exp(-parameter * (x - min) / (max - min + 1)) / (1.0 - exp(-parameter)) + + +PgBench Functions + + + + Function + Return Type + Description + Example + Result + + + +

### Re: [HACKERS] extend pgbench expressions with functions

The basic operator functions also do not check for integer overflows. This is a feature. I think that they should not check for overflow, as in C, this is just int64_t arithmetic "as is". (int64_t is not an available type on Windows btw.) Possibly. I really meant "64 bits signed integers", whatever its name. "int64_t" is the standard C99 name. Finally I can think of good reason to use overflows deliberately, so I think it would argue against such a change. Could you show up an example? I am curious about that. The one I can think of is the use of "SUM" to aggregate hashes for computing a hash on a table. If SUM would overflow and stop this would break the usage. Now there could be a case for having an overflow detection on SUM, but that would be another SUM, preferably with a distinct name. \set cid debug(9223372036854775807 * 9223372036854775807) debug(script=0,command=3): int 1 All these results are fine from my point of view. On HEAD we are getting similar strange results, Yep, this is not new. so I am fine to withdraw but that's still very strange to me. Arithmetic operator modulo are pretty strange, I can agree with that:-) The first case is generating -9223372036854775808, the second one compiles 9223372036854775807 and the third one generates 1. This should be the "real" result modulo 2**64, if I'm not mistaken. Or we make the error checks even more consistent in back-branches, perhaps that 's indeed not worth it for a client though. Yep, that would be another patch. And this one generates a core dump: \set cid debug(-9223372036854775808 / -1) Floating point exception: 8 (core dumped) This one is funny, but it is a fact of int64_t life: you cannot divide INT64_MIN by -1 because the result cannot be represented as an int64_t. This is propably hardcoded in the processor. I do not think it is worth doing anything about it for pgbench. 
This one, on the contrary, is generating an error on HEAD, and your patch is causing a crash: value "9223372036854775808" is out of range for type bigint That's a regression, no? Hmmm, yes, somehow, but just for this one value, if you try the next: pgbench 9.4.5: value "-9223372036854775809" is out of range for type bigint I guess that the implementation before 9.5 converted "-9223372036854775808" as an int, which is INT64_MIN, so it was fine. Now it is parsed as "operator uminus" applied to "9223372036854775808", which is not fine because this would be INT64_MAX+1, which is not possible. I would prefer just to neglect that as a very small (1/2**64) feature change rather than a meaningful regression, especially as the coding effort to fix this is significant and the value of handling it differently is nought. I am uncomfortable with the fact of letting such holes in the code, even if that's a client application. This is a 2**128 probability case which stops pgbench. It is obviously possible to add a check to catch it, and then generate an error message, but I would rather just ignore it and let pgbench stop on that. -- Fabien. -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers

### Re: [HACKERS] extend pgbench expressions with functions

> OK, so I had an extra look at this patch and I am marking it as ready for
> committer.

Ok.

> - INT64_MIN / -1 throws a core dump, and errors on HEAD. I think this
>   should be fixed, Fabien does not.

Yep. Another point about this one is that it is not related to this patch about functions.

-- Fabien.

### Re: [HACKERS] extend pgbench expressions with functions

On Mon, Jan 18, 2016 at 10:07 AM, Michael Paquier wrote:
> On Sun, Jan 17, 2016 at 3:10 AM, Fabien COELHO wrote:
>> I put the function evaluation in a function in the attached version.
>
> Thanks, this makes the code a bit clearer.

OK, so I had an extra look at this patch and I am marking it as ready for committer. A couple of things to be aware of, and the result of this thread with the patch in its current state:

- The patch is keeping \setrandom. Fabien and I agree to remove it outright, though it is kept in the patch because it is easier to remove existing code than to add it again, per Fabien's concerns.
- INT64_MIN / -1 throws a core dump, and errors on HEAD. I think this should be fixed, Fabien does not.
- There are not many overflow checks for the existing int64 operators and functions. HEAD doesn't do much, and this patch makes the situation a bit worse, even if there are a couple of checks for int() for example. We do not do any checks for sqrt(-N) (N > 0) for example.
- It may be more interesting to have the function execution code in a separate file for clarity. Not mandatory though.

Except those comments, all the other issues have been addressed. I think this is a great patch, and it greatly improves the extensibility of pgbench.

-- Michael

### Re: [HACKERS] extend pgbench expressions with functions

On Sun, Jan 17, 2016 at 3:10 AM, Fabien COELHOwrote: >> With this example: >> \set cid debug(sqrt(-1)) >> I get that: >> debug(script=0,command=1): double nan >> An error would be more logical, no? > > > If "sqrt(-1)" as a double is Nan for the computer, I'm fine with that. It > makes the code simpler to just let the math library do its stuff and not > bother. Hm. OK.. >> The basic operator functions also do not check for integer overflows. > > > This is a feature. I think that they should not check for overflow, as in C, > this is just int64_t arithmetic "as is". (int64_t is not an available type on Windows btw.) > Moreover, it would be a new feature to add such a check if desirable, so it > would belong to another patch, it is not related to adding functions. > The addition already overflows in the current code. > > Finally I can think of good reason to use overflows deliberately, so I think > it would argue against such a change. Could you show up an example? I am curious about that. >> Those three ones are just overflowing: >> \set cid debug(9223372036854775807 + 1) >> \set cid debug(-9223372036854775808 - 1) >> \set cid debug(9223372036854775807 * 9223372036854775807) >> debug(script=0,command=1): int -9223372036854775807 >> debug(script=0,command=2): int 9223372036854775807 >> debug(script=0,command=3): int 1 > All these results are fine from my point of view. On HEAD we are getting similar strange results, so I am fine to withdraw but that's still very strange to me. The first case is generating -9223372036854775808, the second one compiles 9223372036854775807 and the third one generates 1. Or we make the error checks even more consistent in back-branches, perhaps that 's indeed not worth it for a client though. 
>> And this one generates a core dump: >> \set cid debug(-9223372036854775808 / -1) >> Floating point exception: 8 (core dumped) > > This one is funny, but it is a fact of int64_t life: you cannot divide > INT64_MIN by -1 because the result cannot be represented as an int64_t. > This is probably hardcoded in the processor. I do not think it is worth > doing anything about it for pgbench. This one, on the contrary, generates an error on HEAD, and your patch causes a crash: value "9223372036854775808" is out of range for type bigint That's a regression, no? I am uncomfortable with leaving such holes in the code, even if that's a client application. >> A more general comment: what about splitting all the execution >> functions into a separate file exprexec.c? evaluateExpr (renamed as >> execExpr) is the root function, but then we have a set of static >> sub-functions for each node, like execExprFunc, execExprVar, >> execExprConst, etc? > > I do not see a strong case for renaming. The function part could be split > because of the indentation, though. The split makes sense to me given that we have function parsing, execution and the main code paths in different files. That's not far from the backend, which does a similar split actually. >> This way we would save a bit of tab-indentation, this patch making the new >> code lines larger than 80 characters because of all the switch/case >> stuff that gets more complicated. > > I agree that the code is pretty ugly, but this is partly due to postgres > indentation rules for switch which are *NOT* reasonable, IMO. > > I put the function evaluation in a function in the attached version. Thanks, this makes the code a bit clearer. -- Michael
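For readers tracing the crash above: INT64_MIN / -1 is the one int64 division whose quotient (2^63) is not representable, and on x86 the idiv instruction raises SIGFPE for it, hence the "Floating point exception" core dump. The guard amounts to two pre-checks before dividing; a minimal C sketch (the helper name is illustrative, not pgbench's actual code):

```c
#include <stdint.h>
#include <stdbool.h>

/* Illustrative guard for int64 division: refuse division by zero and
 * the single operand pair (INT64_MIN, -1) whose result overflows.
 * Returns false instead of letting the hardware trap. */
static bool
checked_div_int64(int64_t lhs, int64_t rhs, int64_t *result)
{
    if (rhs == 0)
        return false;               /* division by zero */
    if (lhs == INT64_MIN && rhs == -1)
        return false;               /* -INT64_MIN is not an int64 */
    *result = lhs / rhs;
    return true;
}
```

With such a helper, the expression evaluator can report an out-of-range error (as the server-side bigint division does) rather than crash the client.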

### Re: [HACKERS] extend pgbench expressions with functions

On Fri, Jan 15, 2016 at 11:53 PM, Fabien COELHO wrote: > Here is a v19 : > - avoid noisy changes > - abort on double->int overflow > - implement operators as functions > > There is still \setrandom, that I can remove easily with a green light. (I am not sure why *$%"# gmail broke the thread in my last email) Thanks for the new patch and for replacing the operator stuff with functions. + uniformly-distributed random integer in [lb,ub] Nitpick: when defining an interval like that, you may want to add a space after the comma. For example seg.sgml does that. It would be good to be consistent even here. And actually you wrote [ub, lb] in two places, this should have been reversed. + /* beware that the list is reverse in make_func */ s/reverse/reversed/? } + #ifdef DEBUG Some noise. With this example: \set cid debug(sqrt(-1)) I get that: debug(script=0,command=1): double nan An error would be more logical, no? You want to emulate with complex numbers instead? The basic operator functions also do not check for integer overflows. Those three ones are just overflowing: \set cid debug(9223372036854775807 + 1) \set cid debug(-9223372036854775808 - 1) \set cid debug(9223372036854775807 * 9223372036854775807) debug(script=0,command=1): int -9223372036854775807 debug(script=0,command=2): int 9223372036854775807 debug(script=0,command=3): int 1 And this one generates a core dump: \set cid debug(-9223372036854775808 / -1) Floating point exception: 8 (core dumped) A more general comment: what about splitting all the execution functions into a separate file exprexec.c? evaluateExpr (renamed as execExpr) is the root function, but then we have a set of static sub-functions for each node, like execExprFunc, execExprVar, execExprConst, etc? This way we would save a bit of tab-indentation, this patch making the new code lines larger than 80 characters because of all the switch/case stuff that gets more complicated.
-- Michael
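The three overflowing \set expressions in Michael's review wrap silently because C's int64 arithmetic is unchecked (and signed overflow is formally undefined behavior, so it cannot even be detected after the fact). The kind of pre-checks being debated for the patch can be sketched as follows; the helper names are illustrative, not code from the patch:

```c
#include <stdint.h>
#include <stdbool.h>

/* Illustrative overflow-checked int64 addition: test the bounds
 * before adding, since detecting signed overflow afterwards is UB. */
static bool
checked_add_int64(int64_t a, int64_t b, int64_t *result)
{
    if ((b > 0 && a > INT64_MAX - b) ||
        (b < 0 && a < INT64_MIN - b))
        return false;           /* e.g. 9223372036854775807 + 1 */
    *result = a + b;
    return true;
}

/* Same idea for multiplication; the case analysis avoids dividing
 * by zero and handles INT64_MIN correctly. */
static bool
checked_mul_int64(int64_t a, int64_t b, int64_t *result)
{
    if (a > 0)
    {
        if (b > 0 ? a > INT64_MAX / b : b < INT64_MIN / a)
            return false;
    }
    else if (a < 0)
    {
        if (b > 0 ? a < INT64_MIN / b : b < INT64_MAX / a)
            return false;
    }
    *result = a * b;
    return true;
}
```

Subtraction follows the same pattern as addition with the sign tests flipped; modern compilers also offer builtins such as GCC's __builtin_add_overflow that do this in one call.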

### Re: [HACKERS] extend pgbench expressions with functions

Hello Michaël, + uniformly-distributed random integer in [lb,ub] Nitpick: when defining an interval like that, you may want to add a space after the comma. Why not. + /* beware that the list is reverse in make_func */ s/reverse/reversed/? Indeed. + #ifdef DEBUG Some noise. Ok. With this example: \set cid debug(sqrt(-1)) I get that: debug(script=0,command=1): double nan An error would be more logical, no? If "sqrt(-1)" as a double is NaN for the computer, I'm fine with that. It makes the code simpler to just let the math library do its stuff and not bother. You want to emulate with complex numbers instead? Nope. The basic operator functions also do not check for integer overflows. This is a feature. I think that they should not check for overflow, as in C, this is just int64_t arithmetic "as is". Moreover, it would be a new feature to add such a check if desirable, so it would belong to another patch, it is not related to adding functions. The addition already overflows in the current code. Finally I can think of good reasons to use overflows deliberately, so I think it would argue against such a change. Those three ones are just overflowing: \set cid debug(9223372036854775807 + 1) \set cid debug(-9223372036854775808 - 1) \set cid debug(9223372036854775807 * 9223372036854775807) debug(script=0,command=1): int -9223372036854775807 debug(script=0,command=2): int 9223372036854775807 debug(script=0,command=3): int 1 All these results are fine from my point of view. And this one generates a core dump: \set cid debug(-9223372036854775808 / -1) Floating point exception: 8 (core dumped) This one is funny, but it is a fact of int64_t life: you cannot divide INT64_MIN by -1 because the result cannot be represented as an int64_t. This is probably hardcoded in the processor. I do not think it is worth doing anything about it for pgbench. A more general comment: what about splitting all the execution functions into a separate file exprexec.c?
evaluateExpr (renamed as execExpr) is the root function, but then we have a set of static sub-functions for each node, like execExprFunc, execExprVar, execExprConst, etc? I do not see a strong case for renaming. The function part could be split because of the indentation, though. This way we would save a bit of tab-indentation, this patch making the new code lines larger than 80 characters because of all the switch/case stuff that gets more complicated. I agree that the code is pretty ugly, but this is partly due to postgres indentation rules for switch which are *NOT* reasonable, IMO. I put the function evaluation in a function in the attached version. -- Fabien.
diff --git a/doc/src/sgml/ref/pgbench.sgml b/doc/src/sgml/ref/pgbench.sgml index 541d17b..0767b46 100644 --- a/doc/src/sgml/ref/pgbench.sgml +++ b/doc/src/sgml/ref/pgbench.sgml @@ -771,17 +771,21 @@ pgbench options dbname Sets variable varname to an integer value calculated from expression. The expression may contain integer constants such as 5432, - references to variables :variablename, + double constants such as 3.14159, + references to integer variables :variablename, and expressions composed of unary (-) or binary operators - (+, -, *, /, %) - with their usual associativity, and parentheses. + (+, -, *, /, + %) with their usual associativity, function calls and + parentheses. + shows the available + functions. Examples: \set ntellers 10 * :scale -\set aid (1021 * :aid) % (10 * :scale) + 1 +\set aid (1021 * random(1, 10 * :scale)) % (10 * :scale) + 1 @@ -801,66 +805,35 @@ pgbench options dbname - By default, or when uniform is specified, all values in the - range are drawn with equal probability. Specifying gaussian - or exponential options modifies this behavior; each - requires a mandatory parameter which determines the precise shape of the - distribution. - + + + + \setrandom n 1 10 or \setrandom n 1 10 uniform + is equivalent to \set n random(1, 10) and uses a uniform + distribution.
+ + - - For a Gaussian distribution, the interval is mapped onto a standard - normal distribution (the classical bell-shaped Gaussian curve) truncated - at -parameter on the left and +parameter - on the right. - Values in the middle of the interval are more likely to be drawn. - To be precise, if PHI(x) is the cumulative distribution - function of the standard normal distribution, with mean mu - defined as (max + min) / 2.0, with - - f(x) = PHI(2.0 * parameter * (x - mu) / (max - min + 1)) / -(2.0 * PHI(parameter) - 1.0) - - then value i between min and - max inclusive is drawn with probability: - f(i + 0.5) - f(i - 0.5). - Intuitively,

### Re: [HACKERS] extend pgbench expressions with functions

Hello Michaël, Here is a v19 : - avoid noisy changes - abort on double->int overflow - implement operators as functions There is still \setrandom, that I can remove easily with a green light. -- Fabien.diff --git a/doc/src/sgml/ref/pgbench.sgml b/doc/src/sgml/ref/pgbench.sgml index 541d17b..a8bfc79 100644 --- a/doc/src/sgml/ref/pgbench.sgml +++ b/doc/src/sgml/ref/pgbench.sgml @@ -771,17 +771,21 @@ pgbench options dbname Sets variable varname to an integer value calculated from expression. The expression may contain integer constants such as 5432, - references to variables :variablename, + double constants such as 3.14159, + references to integer variables :variablename, and expressions composed of unary (-) or binary operators - (+, -, *, /, %) - with their usual associativity, and parentheses. + (+, -, *, /, + %) with their usual associativity, function calls and + parentheses. + shows the available + functions. Examples: \set ntellers 10 * :scale -\set aid (1021 * :aid) % (10 * :scale) + 1 +\set aid (1021 * random(1, 10 * :scale)) % (10 * :scale) + 1 @@ -801,66 +805,35 @@ pgbench options dbname - By default, or when uniform is specified, all values in the - range are drawn with equal probability. Specifying gaussian - or exponential options modifies this behavior; each - requires a mandatory parameter which determines the precise shape of the - distribution. - + + + + \setrandom n 1 10 or \setrandom n 1 10 uniform + is equivalent to \set n random(1, 10) and uses a uniform + distribution. + + - - For a Gaussian distribution, the interval is mapped onto a standard - normal distribution (the classical bell-shaped Gaussian curve) truncated - at -parameter on the left and +parameter - on the right. - Values in the middle of the interval are more likely to be drawn. 
- To be precise, if PHI(x) is the cumulative distribution - function of the standard normal distribution, with mean mu - defined as (max + min) / 2.0, with - - f(x) = PHI(2.0 * parameter * (x - mu) / (max - min + 1)) / -(2.0 * PHI(parameter) - 1.0) - - then value i between min and - max inclusive is drawn with probability: - f(i + 0.5) - f(i - 0.5). - Intuitively, the larger parameter, the more - frequently values close to the middle of the interval are drawn, and the - less frequently values close to the min and - max bounds. About 67% of values are drawn from the - middle 1.0 / parameter, that is a relative - 0.5 / parameter around the mean, and 95% in the middle - 2.0 / parameter, that is a relative - 1.0 / parameter around the mean; for instance, if - parameter is 4.0, 67% of values are drawn from the - middle quarter (1.0 / 4.0) of the interval (i.e. from - 3.0 / 8.0 to 5.0 / 8.0) and 95% from - the middle half (2.0 / 4.0) of the interval (second and - third quartiles). The minimum parameter is 2.0 for - performance of the Box-Muller transform. - + + +\setrandom n 1 10 exponential 3.0 is equivalent to +\set n random_exponential(1, 10, 3.0) and uses an +exponential distribution. + + - - For an exponential distribution, parameter - controls the distribution by truncating a quickly-decreasing - exponential distribution at parameter, and then - projecting onto integers between the bounds. - To be precise, with - -f(x) = exp(-parameter * (x - min) / (max - min + 1)) / (1.0 - exp(-parameter)) - - Then value i between min and - max inclusive is drawn with probability: - f(x) - f(x + 1). - Intuitively, the larger parameter, the more - frequently values close to min are accessed, and the - less frequently values close to max are accessed. - The closer to 0 parameter, the flatter (more uniform) - the access distribution. - A crude approximation of the distribution is that the most frequent 1% - values in the range, close to min, are drawn - parameter% of the time. 
- parameter value must be strictly positive. + + +\setrandom n 1 10 gaussian 2.0 is equivalent to +\set n random_gaussian(1, 10, 2.0), and uses a gaussian +distribution. + + + + + See the documentation of these functions below for further information + about the precise shape of these distributions, depending on the value + of the parameter. @@ -940,18 +913,184 @@ f(x) = exp(-parameter * (x - min) / (max - min + 1)) / (1.0 - exp(-parameter)) + + +PgBench Functions + + + + Function + Return Type + Description + Example +

### Re: [HACKERS] extend pgbench expressions with functions

Hello Michaël, Hmmm, this is subjective:-) And dependent on personal opinions and views. I've decided to stay with the current behavior (\setrandom), that is to abort the current transaction on errors but not to abort pgbench itself. The check is done before calling the functions, and I left an assert in the functions just to be sure. It is going to loop on errors, but this is what it already does anyway. OK, yes I see now I missed that during my last review. This has the disadvantage of doubling the amount of code dedicated to parameter checks though :( Yep, I noticed that obviously, but I envision that the pretty inelegant "\setrandom" code could go away soon, so this is really just temporary. But based on the feedback perhaps other folks would feel that it would be actually worth simply dropping the existing \setrandom command. Yep, exactly my thoughts. I did not do it because there are two ways: actually remove it and be done, or build an equivalent \set at parse time, so that would just remove the execution part, but keep some ugly stuff in parsing. I would be in favor of just dropping it. I won't object to that personally, such pgbench features are something for hackers and devs mainly, so I guess that we could survive without a deprecation period here. Yep. I can remove it, but I would like a clear go/vote before doing that. I can understand that, things like that happen all the time here and that's not a straight-forward patch that we have here. I am sure that additional opinions here would be good to have before taking one decision or another. With the current status quo, let's just do what you think is best. I left the operators alone and just added function management next to it. I'll merge operators as functions only if it is a blocker. I think that's a blocker, but I am not the only one here and not a committer. Ok, I can remove that easily anyway.
- fprintf(stderr, "gaussian parameter must be at least %f (not \"%s\")\n", MIN_GAUSSIAN_PARAM, argv[5]); +fprintf(stderr, + "random gaussian parameter must be greater than %f " +"(got %f)\n", MIN_GAUSSIAN_PARAM, parameter); This looks like useless noise to me. Why change those messages? Because the message was not saying that it was about random, and I think that it is important. - if (parameter <= 0.0) +if (parameter < 0.0) { This bit is false, and triggers an assertion failure when the exponential parameter is 0. Oops:-( Sorry. fprintf(stderr, - "exponential parameter must be greater than zero (not \"%s\")\n", -argv[5]); + "random exponential parameter must be greater than 0.0 " + "(got %f)\n", parameter); st->ecnt++; -return true; + return false; This diff is noise as well, and should be removed. Ok, I can put "zero" and "not" back, but same remark as above, why not say that it is about random? This information is missing. + /* +* Note: this section could be removed, as the same functionnality +* is available through \set xxx random_gaussian(...) +*/ I think that you are right to do that. That's no fun to break existing scripts, even if people doing that with pgbench are surely experienced hackers. Ok, but I would like a clear go or vote before doing this. - case ENODE_VARIABLE: Such diffs are noise as well. Yep. int() should be strengthened regarding bounds. For example: \set cid debug(int(int(9223372036854775807) + double(9223372036854775807))) debug(script=0,command=1): int -9223372036854775808 Hmmm. You mean just to check the double -> int conversion for overflow, as in: SELECT (9223372036854775807::INT8 + 9223372036854775807::DOUBLE PRECISION)::INT8; Ok. -- Fabien.

### Re: [HACKERS] extend pgbench expressions with functions

On Thu, Jan 14, 2016 at 5:54 PM, Fabien COELHO wrote: > I wrote: >> - fprintf(stderr, "gaussian parameter must be at least >> %f (not \"%s\")\n", MIN_GAUSSIAN_PARAM, argv[5]); >> +fprintf(stderr, >> + "random gaussian parameter must be greater >> than %f " >> +"(got %f)\n", MIN_GAUSSIAN_PARAM, parameter); >> This looks like useless noise to me. Why change those messages? > > > Because the message was not saying that it was about random, and I think > that it is important. >> fprintf(stderr, >> - "exponential parameter must be greater than >> zero (not \"%s\")\n", >> -argv[5]); >> + "random exponential parameter must be greater than >> 0.0 " >> + "(got %f)\n", parameter); >> st->ecnt++; >> -return true; >> + return false; >> This diff is noise as well, and should be removed. > > Ok, I can put "zero" and "not" back, but same remark as above, why not say > that it is about random? This information is missing. Those things should be a separate patch then, committed separately as they provide more verbose messages. >> + /* >> +* Note: this section could be removed, as the same >> functionnality >> +* is available through \set xxx random_gaussian(...) >> +*/ >> I think that you are right to do that. That's no fun to break existing >> scripts, even if people doing that with pgbench are surely experienced >> hackers. > > Ok, but I would like a clear go or vote before doing this. For now, I am seeing opinions on those matters from nobody other than you and me, and we lean in the same direction. If you think that there is a possibility that somebody has a different opinion on those matters, and it is likely so, let's keep the patch as-is and wait for more input: it is easier to remove code than add it back. I am not sure what a committer would say, and it surely would be a waste of time to just move back and forth for everybody. >> int() should be strengthened regarding bounds.
For example: >> \set cid debug(int(int(9223372036854775807) + >> double(9223372036854775807))) >> debug(script=0,command=1): int -9223372036854775808 > > > Hmmm. You mean just to check the double -> int conversion for overflow, > as in: > > SELECT (9223372036854775807::INT8 + > 9223372036854775807::DOUBLE PRECISION)::INT8; > > Ok. Yes, that's what I mean. The job running into that should definitely fail with a proper out-of-bound error message. -- Michael
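The fix agreed on here, rejecting a double -> int conversion when the value is out of bounds, comes down to a range test before the cast, since the cast itself is undefined for NaN or out-of-range doubles. A hedged C sketch (illustrative helper, not the committed pgbench code):

```c
#include <stdint.h>
#include <stdbool.h>
#include <math.h>

/* Illustrative bound check before casting a double to int64.
 * 9223372036854775808.0 is 2^63, exactly representable as a double;
 * any double >= 2^63 exceeds INT64_MAX, and -2^63 itself (INT64_MIN)
 * is the smallest admissible value. */
static bool
checked_double_to_int64(double v, int64_t *result)
{
    if (isnan(v) ||
        v >= 9223372036854775808.0 ||   /* above INT64_MAX */
        v < -9223372036854775808.0)     /* below INT64_MIN */
        return false;
    *result = (int64_t) v;              /* truncates toward zero */
    return true;
}
```

With this guard, the example above would report an out-of-bound error instead of silently producing -9223372036854775808.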