subject:"Re\: \[HACKERS\] extend pgbench expressions with functions"

Re: [HACKERS] extend pgbench expressions with functions

2016-03-29 Thread Michael Paquier

On Wed, Mar 30, 2016 at 1:23 AM, Fabien COELHO  wrote:
>
> Hello Robert,
>
>>> If we don't nuke it, it'll never die.
>>
>>
>> Hearing no objections, BOOM.
>
>
> FIZZ! :-)
>
> Thanks for the commits, and apology for the portability bugs.

Thanks for the additions, Fabien. This has been a long trip.
-- 
Michael


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] extend pgbench expressions with functions

2016-03-29 Thread Fabien COELHO



Hello Robert,


If we don't nuke it, it'll never die.


Hearing no objections, BOOM.


FIZZ! :-)

Thanks for the commits, and apology for the portability bugs.

--
Fabien.


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] extend pgbench expressions with functions

2016-03-29 Thread Robert Haas

On Mon, Mar 28, 2016 at 9:36 PM, Stephen Frost  wrote:
> * Michael Paquier (michael.paqu...@gmail.com) wrote:
>> On Tue, Mar 29, 2016 at 9:54 AM, Robert Haas  wrote:
>> > On Sun, Mar 27, 2016 at 5:01 AM, Fabien COELHO  wrote:
>> >> v40 is yet another rebase.
>> >
>> > Thanks.  Committed after removing an unnecessary parameter from the
>> > coerceToXXX() functions.
>> >
>> > I guess the question here is whether we want the part-c patch, which
>> > removes the historical \setrandom syntax.  That's technically no
>> > longer needed since we now can use random() as a function directly
>> > inside \set commands, but we might want it anyway for backward
>> > compatibility.
>> >
>> > Anybody have an opinion on that?
>>
>> +1 for nuking it. That's not worth the trouble maintaining it.
>
> If we don't nuke it, it'll never die.
>
> See also: pg_shadow

Hearing no objections, BOOM.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] extend pgbench expressions with functions

2016-03-28 Thread Fabien COELHO




I guess the question here is whether we want the part-c patch, which
removes the historical \setrandom syntax.  That's technically no
longer needed since we now can use random() as a function directly
inside \set commands, but we might want it anyway for backward
compatibility.


This patch is indeed a proposal.


Anybody have an opinion on that?


+1 for nuking it. That's not worth the trouble maintaining it.


I share Michaël opinion.

Some arguments for removing it:

 - \setrandom is syntactically inhomogeneous in the overall syntax,
   and is now redundant

 - switching to the \set syntax is pretty easy, see attached script

 - custom scripts are short, they are used by few but
   advance users, for which updating would not be an issue

 - the parsing & execution codes are lengthy, repetitive...

--
Fabien.#! /usr/bin/perl -wp
s/^\\setrandom\s+(\S+)\s+(\S+)\s+(\S+)\s+(exponential|gaussian)\s+(\S+)/\\set $1 random_$4($2, $3, $5)/;
s/^\\setrandom\s+(\S+)\s+(\S+)\s+(\S+)(\s+uniform)?/\\set $1 random($2, $3)/;

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] extend pgbench expressions with functions

2016-03-28 Thread Stephen Frost

* Michael Paquier (michael.paqu...@gmail.com) wrote:
> On Tue, Mar 29, 2016 at 9:54 AM, Robert Haas  wrote:
> > On Sun, Mar 27, 2016 at 5:01 AM, Fabien COELHO  wrote:
> >> v40 is yet another rebase.
> >
> > Thanks.  Committed after removing an unnecessary parameter from the
> > coerceToXXX() functions.
> >
> > I guess the question here is whether we want the part-c patch, which
> > removes the historical \setrandom syntax.  That's technically no
> > longer needed since we now can use random() as a function directly
> > inside \set commands, but we might want it anyway for backward
> > compatibility.
> >
> > Anybody have an opinion on that?
> 
> +1 for nuking it. That's not worth the trouble maintaining it.

If we don't nuke it, it'll never die.

See also: pg_shadow

Thanks!

Stephen


signature.asc
Description: Digital signature

Re: [HACKERS] extend pgbench expressions with functions

2016-03-28 Thread Michael Paquier

On Tue, Mar 29, 2016 at 9:54 AM, Robert Haas  wrote:
> On Sun, Mar 27, 2016 at 5:01 AM, Fabien COELHO  wrote:
>> v40 is yet another rebase.
>
> Thanks.  Committed after removing an unnecessary parameter from the
> coerceToXXX() functions.
>
> I guess the question here is whether we want the part-c patch, which
> removes the historical \setrandom syntax.  That's technically no
> longer needed since we now can use random() as a function directly
> inside \set commands, but we might want it anyway for backward
> compatibility.
>
> Anybody have an opinion on that?

+1 for nuking it. That's not worth the trouble maintaining it.
-- 
Michael


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] extend pgbench expressions with functions

2016-03-28 Thread Robert Haas

On Sun, Mar 27, 2016 at 5:01 AM, Fabien COELHO  wrote:
> v40 is yet another rebase.

Thanks.  Committed after removing an unnecessary parameter from the
coerceToXXX() functions.

I guess the question here is whether we want the part-c patch, which
removes the historical \setrandom syntax.  That's technically no
longer needed since we now can use random() as a function directly
inside \set commands, but we might want it anyway for backward
compatibility.

Anybody have an opinion on that?

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] extend pgbench expressions with functions

2016-03-27 Thread Fabien COELHO



v40 is yet another rebase.

--
Fabien.diff --git a/doc/src/sgml/ref/pgbench.sgml b/doc/src/sgml/ref/pgbench.sgml
index c6d1454..4ceddae 100644
--- a/doc/src/sgml/ref/pgbench.sgml
+++ b/doc/src/sgml/ref/pgbench.sgml
@@ -815,9 +815,10 @@ pgbench  options  dbname
 
 
  
-  Sets variable varname to an integer value calculated
+  Sets variable varname to a value calculated
   from expression.
   The expression may contain integer constants such as 5432,
+  double constants such as 3.14159,
   references to variables :variablename,
   unary operators (+, -) and binary operators
   (+, -, *, /,
@@ -830,7 +831,7 @@ pgbench  options  dbname
   Examples:
 
 \set ntellers 10 * :scale
-\set aid (1021 * :aid) % (10 * :scale) + 1
+\set aid (1021 * random(1, 10 * :scale)) % (10 * :scale) + 1
 
 

@@ -850,66 +851,35 @@ pgbench  options  dbname
  
 
  
-  By default, or when uniform is specified, all values in the
-  range are drawn with equal probability.  Specifying gaussian
-  or  exponential options modifies this behavior; each
-  requires a mandatory parameter which determines the precise shape of the
-  distribution.
- 
+  
+   
+
+ \setrandom n 1 10 or \setrandom n 1 10 uniform
+ is equivalent to \set n random(1, 10) and uses a uniform
+ distribution.
+
+   
 
- 
-  For a Gaussian distribution, the interval is mapped onto a standard
-  normal distribution (the classical bell-shaped Gaussian curve) truncated
-  at -parameter on the left and +parameter
-  on the right.
-  Values in the middle of the interval are more likely to be drawn.
-  To be precise, if PHI(x) is the cumulative distribution
-  function of the standard normal distribution, with mean mu
-  defined as (max + min) / 2.0, with
-
- f(x) = PHI(2.0 * parameter * (x - mu) / (max - min + 1)) /
-(2.0 * PHI(parameter) - 1.0)
-
-  then value i between min and
-  max inclusive is drawn with probability:
-  f(i + 0.5) - f(i - 0.5).
-  Intuitively, the larger parameter, the more
-  frequently values close to the middle of the interval are drawn, and the
-  less frequently values close to the min and
-  max bounds. About 67% of values are drawn from the
-  middle 1.0 / parameter, that is a relative
-  0.5 / parameter around the mean, and 95% in the middle
-  2.0 / parameter, that is a relative
-  1.0 / parameter around the mean; for instance, if
-  parameter is 4.0, 67% of values are drawn from the
-  middle quarter (1.0 / 4.0) of the interval (i.e. from
-  3.0 / 8.0 to 5.0 / 8.0) and 95% from
-  the middle half (2.0 / 4.0) of the interval (second and
-  third quartiles). The minimum parameter is 2.0 for
-  performance of the Box-Muller transform.
- 
+  
+   
+\setrandom n 1 10 exponential 3.0 is equivalent to
+\set n random_exponential(1, 10, 3.0) and uses an
+exponential distribution.
+   
+  
 
- 
-  For an exponential distribution, parameter
-  controls the distribution by truncating a quickly-decreasing
-  exponential distribution at parameter, and then
-  projecting onto integers between the bounds.
-  To be precise, with
-
-f(x) = exp(-parameter * (x - min) / (max - min + 1)) / (1.0 - exp(-parameter))
-
-  Then value i between min and
-  max inclusive is drawn with probability:
-  f(x) - f(x + 1).
-  Intuitively, the larger parameter, the more
-  frequently values close to min are accessed, and the
-  less frequently values close to max are accessed.
-  The closer to 0 parameter, the flatter (more uniform)
-  the access distribution.
-  A crude approximation of the distribution is that the most frequent 1%
-  values in the range, close to min, are drawn
-  parameter% of the time.
-  parameter value must be strictly positive.
+  
+   
+\setrandom n 1 10 gaussian 2.0 is equivalent to
+\set n random_gaussian(1, 10, 2.0), and uses a gaussian
+distribution.
+   
+  
+ 
+
+   See the documentation of these functions below for further information
+   about the precise shape of these distributions, depending on the value
+   of the parameter.
  
 
  
@@ -990,34 +960,6 @@ f(x) = exp(-parameter * (x - min) / (max - min + 1)) / (1.0 - exp(-parameter))
 

   
-
-  
-   As an example, the full definition of the built-in TPC-B-like
-   transaction is:
-
-
-\set nbranches :scale
-\set ntellers 10 * :scale
-\set naccounts 10 * :scale
-\setrandom aid 1 :naccounts
-\setrandom bid 1 :nbranches
-\setrandom tid 1 :ntellers
-\setrandom delta -5000 5000
-BEGIN;
-UPDATE pgbench_accounts SET abalance = abalance + :delta WHERE aid = :aid;
-SELECT abalance FROM pgbench_accounts WHERE aid = :aid;
-UPDATE pgbench_tellers SET tbalance = tbal

Re: [HACKERS] extend pgbench expressions with functions

2016-03-21 Thread Fabien COELHO



Bonjour Michaël,


v39 is yet another rebase: 42 is in sight!


What's up with v42? Is that your personal record?


It is just a (geek) joke, see:

https://en.wikipedia.org/wiki/42_%28number%29#Hitchhiker.27s_Guide_to_the_Galaxy

It is the answer to the "Ultimate Question of Life, The Universe and 
Everything", computed by the supercomputer "Deep thought" in 7.5 million 
years.


--
Fabien.
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] extend pgbench expressions with functions

2016-03-20 Thread Michael Paquier

On Mon, Mar 21, 2016 at 6:56 AM, Fabien COELHO  wrote:
>
>> v38 is a simple rebase, trying to keep up-to-date with Tom's work.
>
>
> v39 is yet another rebase: 42 is in sight!

What's up with v42? Is that your personal record?
-- 
Michael


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] extend pgbench expressions with functions

2016-03-20 Thread Fabien COELHO




v38 is a simple rebase, trying to keep up-to-date with Tom's work.


v39 is yet another rebase: 42 is in sight!

--
Fabien.diff --git a/doc/src/sgml/ref/pgbench.sgml b/doc/src/sgml/ref/pgbench.sgml
index c6d1454..4ceddae 100644
--- a/doc/src/sgml/ref/pgbench.sgml
+++ b/doc/src/sgml/ref/pgbench.sgml
@@ -815,9 +815,10 @@ pgbench  options  dbname
 
 
  
-  Sets variable varname to an integer value calculated
+  Sets variable varname to a value calculated
   from expression.
   The expression may contain integer constants such as 5432,
+  double constants such as 3.14159,
   references to variables :variablename,
   unary operators (+, -) and binary operators
   (+, -, *, /,
@@ -830,7 +831,7 @@ pgbench  options  dbname
   Examples:
 
 \set ntellers 10 * :scale
-\set aid (1021 * :aid) % (10 * :scale) + 1
+\set aid (1021 * random(1, 10 * :scale)) % (10 * :scale) + 1
 
 

@@ -850,66 +851,35 @@ pgbench  options  dbname
  
 
  
-  By default, or when uniform is specified, all values in the
-  range are drawn with equal probability.  Specifying gaussian
-  or  exponential options modifies this behavior; each
-  requires a mandatory parameter which determines the precise shape of the
-  distribution.
- 
+  
+   
+
+ \setrandom n 1 10 or \setrandom n 1 10 uniform
+ is equivalent to \set n random(1, 10) and uses a uniform
+ distribution.
+
+   
 
- 
-  For a Gaussian distribution, the interval is mapped onto a standard
-  normal distribution (the classical bell-shaped Gaussian curve) truncated
-  at -parameter on the left and +parameter
-  on the right.
-  Values in the middle of the interval are more likely to be drawn.
-  To be precise, if PHI(x) is the cumulative distribution
-  function of the standard normal distribution, with mean mu
-  defined as (max + min) / 2.0, with
-
- f(x) = PHI(2.0 * parameter * (x - mu) / (max - min + 1)) /
-(2.0 * PHI(parameter) - 1.0)
-
-  then value i between min and
-  max inclusive is drawn with probability:
-  f(i + 0.5) - f(i - 0.5).
-  Intuitively, the larger parameter, the more
-  frequently values close to the middle of the interval are drawn, and the
-  less frequently values close to the min and
-  max bounds. About 67% of values are drawn from the
-  middle 1.0 / parameter, that is a relative
-  0.5 / parameter around the mean, and 95% in the middle
-  2.0 / parameter, that is a relative
-  1.0 / parameter around the mean; for instance, if
-  parameter is 4.0, 67% of values are drawn from the
-  middle quarter (1.0 / 4.0) of the interval (i.e. from
-  3.0 / 8.0 to 5.0 / 8.0) and 95% from
-  the middle half (2.0 / 4.0) of the interval (second and
-  third quartiles). The minimum parameter is 2.0 for
-  performance of the Box-Muller transform.
- 
+  
+   
+\setrandom n 1 10 exponential 3.0 is equivalent to
+\set n random_exponential(1, 10, 3.0) and uses an
+exponential distribution.
+   
+  
 
- 
-  For an exponential distribution, parameter
-  controls the distribution by truncating a quickly-decreasing
-  exponential distribution at parameter, and then
-  projecting onto integers between the bounds.
-  To be precise, with
-
-f(x) = exp(-parameter * (x - min) / (max - min + 1)) / (1.0 - exp(-parameter))
-
-  Then value i between min and
-  max inclusive is drawn with probability:
-  f(x) - f(x + 1).
-  Intuitively, the larger parameter, the more
-  frequently values close to min are accessed, and the
-  less frequently values close to max are accessed.
-  The closer to 0 parameter, the flatter (more uniform)
-  the access distribution.
-  A crude approximation of the distribution is that the most frequent 1%
-  values in the range, close to min, are drawn
-  parameter% of the time.
-  parameter value must be strictly positive.
+  
+   
+\setrandom n 1 10 gaussian 2.0 is equivalent to
+\set n random_gaussian(1, 10, 2.0), and uses a gaussian
+distribution.
+   
+  
+ 
+
+   See the documentation of these functions below for further information
+   about the precise shape of these distributions, depending on the value
+   of the parameter.
  
 
  
@@ -990,34 +960,6 @@ f(x) = exp(-parameter * (x - min) / (max - min + 1)) / (1.0 - exp(-parameter))
 

   
-
-  
-   As an example, the full definition of the built-in TPC-B-like
-   transaction is:
-
-
-\set nbranches :scale
-\set ntellers 10 * :scale
-\set naccounts 10 * :scale
-\setrandom aid 1 :naccounts
-\setrandom bid 1 :nbranches
-\setrandom tid 1 :ntellers
-\setrandom delta -5000 5000
-BEGIN;
-UPDATE pgbench_accounts SET abalance = abalance + :delta WHERE aid = :aid;
-SELECT abalan

Re: [HACKERS] extend pgbench expressions with functions

2016-03-19 Thread Fabien COELHO



v38 is a simple rebase, trying to keep up-to-date with Tom's work.

--
Fabien.diff --git a/doc/src/sgml/ref/pgbench.sgml b/doc/src/sgml/ref/pgbench.sgml
index dd3fb1d..cf9c1cd 100644
--- a/doc/src/sgml/ref/pgbench.sgml
+++ b/doc/src/sgml/ref/pgbench.sgml
@@ -802,9 +802,10 @@ pgbench  options  dbname
 
 
  
-  Sets variable varname to an integer value calculated
+  Sets variable varname to a value calculated
   from expression.
   The expression may contain integer constants such as 5432,
+  double constants such as 3.14159,
   references to variables :variablename,
   and expressions composed of unary (-) or binary operators
   (+, -, *, /,
@@ -817,7 +818,7 @@ pgbench  options  dbname
   Examples:
 
 \set ntellers 10 * :scale
-\set aid (1021 * :aid) % (10 * :scale) + 1
+\set aid (1021 * random(1, 10 * :scale)) % (10 * :scale) + 1
 
 

@@ -837,66 +838,35 @@ pgbench  options  dbname
  
 
  
-  By default, or when uniform is specified, all values in the
-  range are drawn with equal probability.  Specifying gaussian
-  or  exponential options modifies this behavior; each
-  requires a mandatory parameter which determines the precise shape of the
-  distribution.
- 
+  
+   
+
+ \setrandom n 1 10 or \setrandom n 1 10 uniform
+ is equivalent to \set n random(1, 10) and uses a uniform
+ distribution.
+
+   
 
- 
-  For a Gaussian distribution, the interval is mapped onto a standard
-  normal distribution (the classical bell-shaped Gaussian curve) truncated
-  at -parameter on the left and +parameter
-  on the right.
-  Values in the middle of the interval are more likely to be drawn.
-  To be precise, if PHI(x) is the cumulative distribution
-  function of the standard normal distribution, with mean mu
-  defined as (max + min) / 2.0, with
-
- f(x) = PHI(2.0 * parameter * (x - mu) / (max - min + 1)) /
-(2.0 * PHI(parameter) - 1.0)
-
-  then value i between min and
-  max inclusive is drawn with probability:
-  f(i + 0.5) - f(i - 0.5).
-  Intuitively, the larger parameter, the more
-  frequently values close to the middle of the interval are drawn, and the
-  less frequently values close to the min and
-  max bounds. About 67% of values are drawn from the
-  middle 1.0 / parameter, that is a relative
-  0.5 / parameter around the mean, and 95% in the middle
-  2.0 / parameter, that is a relative
-  1.0 / parameter around the mean; for instance, if
-  parameter is 4.0, 67% of values are drawn from the
-  middle quarter (1.0 / 4.0) of the interval (i.e. from
-  3.0 / 8.0 to 5.0 / 8.0) and 95% from
-  the middle half (2.0 / 4.0) of the interval (second and
-  third quartiles). The minimum parameter is 2.0 for
-  performance of the Box-Muller transform.
- 
+  
+   
+\setrandom n 1 10 exponential 3.0 is equivalent to
+\set n random_exponential(1, 10, 3.0) and uses an
+exponential distribution.
+   
+  
 
- 
-  For an exponential distribution, parameter
-  controls the distribution by truncating a quickly-decreasing
-  exponential distribution at parameter, and then
-  projecting onto integers between the bounds.
-  To be precise, with
-
-f(x) = exp(-parameter * (x - min) / (max - min + 1)) / (1.0 - exp(-parameter))
-
-  Then value i between min and
-  max inclusive is drawn with probability:
-  f(x) - f(x + 1).
-  Intuitively, the larger parameter, the more
-  frequently values close to min are accessed, and the
-  less frequently values close to max are accessed.
-  The closer to 0 parameter, the flatter (more uniform)
-  the access distribution.
-  A crude approximation of the distribution is that the most frequent 1%
-  values in the range, close to min, are drawn
-  parameter% of the time.
-  parameter value must be strictly positive.
+  
+   
+\setrandom n 1 10 gaussian 2.0 is equivalent to
+\set n random_gaussian(1, 10, 2.0), and uses a gaussian
+distribution.
+   
+  
+ 
+
+   See the documentation of these functions below for further information
+   about the precise shape of these distributions, depending on the value
+   of the parameter.
  
 
  
@@ -975,34 +945,6 @@ f(x) = exp(-parameter * (x - min) / (max - min + 1)) / (1.0 - exp(-parameter))
 

   
-
-  
-   As an example, the full definition of the built-in TPC-B-like
-   transaction is:
-
-
-\set nbranches :scale
-\set ntellers 10 * :scale
-\set naccounts 10 * :scale
-\setrandom aid 1 :naccounts
-\setrandom bid 1 :nbranches
-\setrandom tid 1 :ntellers
-\setrandom delta -5000 5000
-BEGIN;
-UPDATE pgbench_accounts SET abalance = abalance + :delta WHERE aid = :aid;
-SELECT abalance FROM pgbench_accounts WHERE a

Re: [HACKERS] extend pgbench expressions with functions

2016-03-19 Thread Fabien COELHO



Here is a v36 which inspect very carefully the string to decide whether it is 
an int or a double. You may, or may not, find it to your taste, I can't say.


Here is a v37 which is mostly a rebase after recent changes. Also I 
noticed that I was double counting errors in the previous version, so I 
fixed it.


--
Fabien.diff --git a/doc/src/sgml/ref/pgbench.sgml b/doc/src/sgml/ref/pgbench.sgml
index dd3fb1d..cf9c1cd 100644
--- a/doc/src/sgml/ref/pgbench.sgml
+++ b/doc/src/sgml/ref/pgbench.sgml
@@ -802,9 +802,10 @@ pgbench  options  dbname
 
 
  
-  Sets variable varname to an integer value calculated
+  Sets variable varname to a value calculated
   from expression.
   The expression may contain integer constants such as 5432,
+  double constants such as 3.14159,
   references to variables :variablename,
   and expressions composed of unary (-) or binary operators
   (+, -, *, /,
@@ -817,7 +818,7 @@ pgbench  options  dbname
   Examples:
 
 \set ntellers 10 * :scale
-\set aid (1021 * :aid) % (10 * :scale) + 1
+\set aid (1021 * random(1, 10 * :scale)) % (10 * :scale) + 1
 
 

@@ -837,66 +838,35 @@ pgbench  options  dbname
  
 
  
-  By default, or when uniform is specified, all values in the
-  range are drawn with equal probability.  Specifying gaussian
-  or  exponential options modifies this behavior; each
-  requires a mandatory parameter which determines the precise shape of the
-  distribution.
- 
+  
+   
+
+ \setrandom n 1 10 or \setrandom n 1 10 uniform
+ is equivalent to \set n random(1, 10) and uses a uniform
+ distribution.
+
+   
 
- 
-  For a Gaussian distribution, the interval is mapped onto a standard
-  normal distribution (the classical bell-shaped Gaussian curve) truncated
-  at -parameter on the left and +parameter
-  on the right.
-  Values in the middle of the interval are more likely to be drawn.
-  To be precise, if PHI(x) is the cumulative distribution
-  function of the standard normal distribution, with mean mu
-  defined as (max + min) / 2.0, with
-
- f(x) = PHI(2.0 * parameter * (x - mu) / (max - min + 1)) /
-(2.0 * PHI(parameter) - 1.0)
-
-  then value i between min and
-  max inclusive is drawn with probability:
-  f(i + 0.5) - f(i - 0.5).
-  Intuitively, the larger parameter, the more
-  frequently values close to the middle of the interval are drawn, and the
-  less frequently values close to the min and
-  max bounds. About 67% of values are drawn from the
-  middle 1.0 / parameter, that is a relative
-  0.5 / parameter around the mean, and 95% in the middle
-  2.0 / parameter, that is a relative
-  1.0 / parameter around the mean; for instance, if
-  parameter is 4.0, 67% of values are drawn from the
-  middle quarter (1.0 / 4.0) of the interval (i.e. from
-  3.0 / 8.0 to 5.0 / 8.0) and 95% from
-  the middle half (2.0 / 4.0) of the interval (second and
-  third quartiles). The minimum parameter is 2.0 for
-  performance of the Box-Muller transform.
- 
+  
+   
+\setrandom n 1 10 exponential 3.0 is equivalent to
+\set n random_exponential(1, 10, 3.0) and uses an
+exponential distribution.
+   
+  
 
- 
-  For an exponential distribution, parameter
-  controls the distribution by truncating a quickly-decreasing
-  exponential distribution at parameter, and then
-  projecting onto integers between the bounds.
-  To be precise, with
-
-f(x) = exp(-parameter * (x - min) / (max - min + 1)) / (1.0 - exp(-parameter))
-
-  Then value i between min and
-  max inclusive is drawn with probability:
-  f(x) - f(x + 1).
-  Intuitively, the larger parameter, the more
-  frequently values close to min are accessed, and the
-  less frequently values close to max are accessed.
-  The closer to 0 parameter, the flatter (more uniform)
-  the access distribution.
-  A crude approximation of the distribution is that the most frequent 1%
-  values in the range, close to min, are drawn
-  parameter% of the time.
-  parameter value must be strictly positive.
+  
+   
+\setrandom n 1 10 gaussian 2.0 is equivalent to
+\set n random_gaussian(1, 10, 2.0), and uses a gaussian
+distribution.
+   
+  
+ 
+
+   See the documentation of these functions below for further information
+   about the precise shape of these distributions, depending on the value
+   of the parameter.
  
 
  
@@ -975,34 +945,6 @@ f(x) = exp(-parameter * (x - min) / (max - min + 1)) / (1.0 - exp(-parameter))
 

   
-
-  
-   As an example, the full definition of the built-in TPC-B-like
-   transaction is:
-
-
-\set nbranches :scale
-\set ntellers 10 * :scale
-\set naccounts 10 * :scale
-

Re: [HACKERS] extend pgbench expressions with functions

2016-03-09 Thread Fabien COELHO



Hello Robert,

[...] With your patch, you get different behavior depending on exactly 
how the input is malformed.


I understand that you require only one possible error message on malformed 
input, instead of failing when converting to double if the input looked 
like a double (there was a '.' clue int it, but that is not proof), or 
something else if it was assumed to be an int.


So I'm going to assume that you do not like the type guessing.

I'm not going to commit it this way, and frankly, neither is anyone 
else.


Hmmm. Probably my unconcious self is trying to reach 42.

Here is a v36 which inspect very carefully the string to decide whether 
it is an int or a double. You may, or may not, find it to your taste, I 
can't say.


--
Fabien.diff --git a/doc/src/sgml/ref/pgbench.sgml b/doc/src/sgml/ref/pgbench.sgml
index cc80b3f..2133bf7 100644
--- a/doc/src/sgml/ref/pgbench.sgml
+++ b/doc/src/sgml/ref/pgbench.sgml
@@ -794,9 +794,10 @@ pgbench  options  dbname
 
 
  
-  Sets variable varname to an integer value calculated
+  Sets variable varname to a value calculated
   from expression.
   The expression may contain integer constants such as 5432,
+  double constants such as 3.14159,
   references to variables :variablename,
   and expressions composed of unary (-) or binary operators
   (+, -, *, /,
@@ -809,7 +810,7 @@ pgbench  options  dbname
   Examples:
 
 \set ntellers 10 * :scale
-\set aid (1021 * :aid) % (10 * :scale) + 1
+\set aid (1021 * random(1, 10 * :scale)) % (10 * :scale) + 1
 
 

@@ -829,66 +830,35 @@ pgbench  options  dbname
  
 
  
-  By default, or when uniform is specified, all values in the
-  range are drawn with equal probability.  Specifying gaussian
-  or  exponential options modifies this behavior; each
-  requires a mandatory parameter which determines the precise shape of the
-  distribution.
- 
+  
+   
+
+ \setrandom n 1 10 or \setrandom n 1 10 uniform
+ is equivalent to \set n random(1, 10) and uses a uniform
+ distribution.
+
+   
 
- 
-  For a Gaussian distribution, the interval is mapped onto a standard
-  normal distribution (the classical bell-shaped Gaussian curve) truncated
-  at -parameter on the left and +parameter
-  on the right.
-  Values in the middle of the interval are more likely to be drawn.
-  To be precise, if PHI(x) is the cumulative distribution
-  function of the standard normal distribution, with mean mu
-  defined as (max + min) / 2.0, with
-
- f(x) = PHI(2.0 * parameter * (x - mu) / (max - min + 1)) /
-(2.0 * PHI(parameter) - 1.0)
-
-  then value i between min and
-  max inclusive is drawn with probability:
-  f(i + 0.5) - f(i - 0.5).
-  Intuitively, the larger parameter, the more
-  frequently values close to the middle of the interval are drawn, and the
-  less frequently values close to the min and
-  max bounds. About 67% of values are drawn from the
-  middle 1.0 / parameter, that is a relative
-  0.5 / parameter around the mean, and 95% in the middle
-  2.0 / parameter, that is a relative
-  1.0 / parameter around the mean; for instance, if
-  parameter is 4.0, 67% of values are drawn from the
-  middle quarter (1.0 / 4.0) of the interval (i.e. from
-  3.0 / 8.0 to 5.0 / 8.0) and 95% from
-  the middle half (2.0 / 4.0) of the interval (second and
-  third quartiles). The minimum parameter is 2.0 for
-  performance of the Box-Muller transform.
- 
+  
+   
+\setrandom n 1 10 exponential 3.0 is equivalent to
+\set n random_exponential(1, 10, 3.0) and uses an
+exponential distribution.
+   
+  
 
- 
-  For an exponential distribution, parameter
-  controls the distribution by truncating a quickly-decreasing
-  exponential distribution at parameter, and then
-  projecting onto integers between the bounds.
-  To be precise, with
-
-f(x) = exp(-parameter * (x - min) / (max - min + 1)) / (1.0 - exp(-parameter))
-
-  Then value i between min and
-  max inclusive is drawn with probability:
-  f(x) - f(x + 1).
-  Intuitively, the larger parameter, the more
-  frequently values close to min are accessed, and the
-  less frequently values close to max are accessed.
-  The closer to 0 parameter, the flatter (more uniform)
-  the access distribution.
-  A crude approximation of the distribution is that the most frequent 1%
-  values in the range, close to min, are drawn
-  parameter% of the time.
-  parameter value must be strictly positive.
+  
+   
+\setrandom n 1 10 gaussian 2.0 is equivalent to
+\set n random_gaussian(1, 10, 2.0), and uses a gaussian
+distribution.
+   
+  
+ 
+
+   See the documentation of these functions below for fu

Re: [HACKERS] extend pgbench expressions with functions

2016-03-09 Thread Robert Haas

On Wed, Mar 9, 2016 at 3:09 AM, Fabien COELHO  wrote:
> I'm not sure what is "not acceptable" as it "totally breaks the error
> handling" in the above code.
>
> I assumed that you want to check that sscanf can read what sprintf generated
> when handling "\set". I'd guess that libc would be broken if it was the
> case. I've made pgbench check that it is not.

If the variable contains something that is not an integer, the
existing code will end up here:

fprintf(stderr, "invalid input syntax for integer: \"%s\"\n", str);

With your patch, you get different behavior depending on exactly how
the input is malformed.  I have a strong suspicion you're going to
tell me that this is another one of those cases where it's OK to make
the error handling worse than it is today, but I'm tired of arguing
that point.  I'm not going to commit it this way, and frankly, neither
is anyone else.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] extend pgbench expressions with functions

2016-03-09 Thread Fabien COELHO



Hello Robert,

Here is a v35 b & c.


This is not acceptable:

+   /* guess double type (n for "inf",
"-inf" and "nan") */
+   if (strchr(var, '.') != NULL ||
strchr(var, 'n') != NULL)
+   {
+   double dv;
+   sscanf(var, "%lf", &dv);
+   setDoubleValue(retval, dv);
+   }
+   else
+   setIntValue(retval, strtoint64(var));

That totally breaks the error handling which someone carefully established here.


I'm not sure what is "not acceptable" as it "totally breaks the error 
handling" in the above code.


I assumed that you want to check that sscanf can read what sprintf 
generated when handling "\set". I'd guess that libc would be broken if it 
was the case. I've made pgbench check that it is not.


If it was something else, you have to spell it out for me.


Extra space. [...] Another extra space.


Indeed.


-   int nargs = 0;
-   int64   iargs[MAX_FARGS];
-   PgBenchExprLink *l = args;
+   int nargs = 0;
+   PgBenchValuevargs[MAX_FARGS];
+   PgBenchExprLink*l = args;

Completely unnecessary reindentation of the first and third lines.


It just looked better to me.


+   setIntValue(retval, i < 0? -i: i);

Not project style.


Indeed.


Please fix the whitespace damage git diff --check complains about,


"git diff -check" does not seem to complain on the full v35-b.


and check for other places where you haven't followed project style.


I've found some more instances of "& ref".

--
Fabien.diff --git a/doc/src/sgml/ref/pgbench.sgml b/doc/src/sgml/ref/pgbench.sgml
index cc80b3f..2133bf7 100644
--- a/doc/src/sgml/ref/pgbench.sgml
+++ b/doc/src/sgml/ref/pgbench.sgml
@@ -794,9 +794,10 @@ pgbench  options  dbname
 
 
  
-  Sets variable varname to an integer value calculated
+  Sets variable varname to a value calculated
   from expression.
   The expression may contain integer constants such as 5432,
+  double constants such as 3.14159,
   references to variables :variablename,
   and expressions composed of unary (-) or binary operators
   (+, -, *, /,
@@ -809,7 +810,7 @@ pgbench  options  dbname
   Examples:
 
 \set ntellers 10 * :scale
-\set aid (1021 * :aid) % (10 * :scale) + 1
+\set aid (1021 * random(1, 10 * :scale)) % (10 * :scale) + 1
 
 

@@ -829,66 +830,35 @@ pgbench  options  dbname
  
 
  
-  By default, or when uniform is specified, all values in the
-  range are drawn with equal probability.  Specifying gaussian
-  or  exponential options modifies this behavior; each
-  requires a mandatory parameter which determines the precise shape of the
-  distribution.
- 
+  
+   
+
+ \setrandom n 1 10 or \setrandom n 1 10 uniform
+ is equivalent to \set n random(1, 10) and uses a uniform
+ distribution.
+
+   
 
- 
-  For a Gaussian distribution, the interval is mapped onto a standard
-  normal distribution (the classical bell-shaped Gaussian curve) truncated
-  at -parameter on the left and +parameter
-  on the right.
-  Values in the middle of the interval are more likely to be drawn.
-  To be precise, if PHI(x) is the cumulative distribution
-  function of the standard normal distribution, with mean mu
-  defined as (max + min) / 2.0, with
-
- f(x) = PHI(2.0 * parameter * (x - mu) / (max - min + 1)) /
-(2.0 * PHI(parameter) - 1.0)
-
-  then value i between min and
-  max inclusive is drawn with probability:
-  f(i + 0.5) - f(i - 0.5).
-  Intuitively, the larger parameter, the more
-  frequently values close to the middle of the interval are drawn, and the
-  less frequently values close to the min and
-  max bounds. About 67% of values are drawn from the
-  middle 1.0 / parameter, that is a relative
-  0.5 / parameter around the mean, and 95% in the middle
-  2.0 / parameter, that is a relative
-  1.0 / parameter around the mean; for instance, if
-  parameter is 4.0, 67% of values are drawn from the
-  middle quarter (1.0 / 4.0) of the interval (i.e. from
-  3.0 / 8.0 to 5.0 / 8.0) and 95% from
-  the middle half (2.0 / 4.0) of the interval (second and
-  third quartiles). The minimum parameter is 2.0 for
-  performance of the Box-Muller transform.
- 
+  
+   
+\setrandom n 1 10 exponential 3.0 is equivalent to
+\set n random_exponential(1, 10, 3.0) and uses an
+exponential distribution.
+   
+  
 
- 
-  For an exponential distribution, parameter
-  controls the distributi

Re: [HACKERS] extend pgbench expressions with functions

2016-03-08 Thread Robert Haas

On Tue, Mar 8, 2016 at 3:52 PM, Fabien COELHO  wrote:
> [ new patch and assorted color commentary ]

This is not acceptable:

+   /* guess double type (n for "inf",
"-inf" and "nan") */
+   if (strchr(var, '.') != NULL ||
strchr(var, 'n') != NULL)
+   {
+   double dv;
+   sscanf(var, "%lf", &dv);
+   setDoubleValue(retval, dv);
+   }
+   else
+   setIntValue(retval, strtoint64(var));

That totally breaks the error handling which someone carefully established here.

+   PgBenchValue *varg = & vargs[0];

Extra space.

+   if (!coerceToDouble(st, & vargs[0], &dval))
+   return false;

Another extra space.

-   int nargs = 0;
-   int64   iargs[MAX_FARGS];
-   PgBenchExprLink *l = args;
+   int nargs = 0;
+   PgBenchValuevargs[MAX_FARGS];
+   PgBenchExprLink*l = args;

Completely unnecessary reindentation of the first and third lines.

+   setIntValue(retval, i < 0? -i: i);

Not project style.

Please fix the whitespace damage git diff --check complains about, and
check for other places where you haven't followed project style.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] extend pgbench expressions with functions

2016-03-08 Thread Fabien COELHO



Hello Robert.

Here is a v34 b & c.


// comments are not allowed.  I'd just remove the two you have.


Back to the eighties!


It make no sense to exit(1) and then return 0, so don't do that.  I
might write this code as:
This would get rid of the internal-error case here altogether in favor
of testing it via an assertion.


I've put assertions instead of exit in some places.


I think that coerceToInt() should not exit(1) when an overflow occurs;


I think that it should, because the only sane option for the user is to 
fix the script and relaunch the bench: counting errors has no added value 
for the user.


The attached version does some error handling instead, too bad.


Now, if rval is out of range of an integer, that is going to overflow
while trying to see whether it should divide by zero.


I could not find a place where there where such potential issue. If the 
value is zero, it cannot overflow when cast to int. If it is not zero but 
it overflows, then it is an overflow, so it should overflow. Maybe I 
misunderstood your point.


--
Fabien.diff --git a/doc/src/sgml/ref/pgbench.sgml b/doc/src/sgml/ref/pgbench.sgml
index cc80b3f..2133bf7 100644
--- a/doc/src/sgml/ref/pgbench.sgml
+++ b/doc/src/sgml/ref/pgbench.sgml
@@ -794,9 +794,10 @@ pgbench  options  dbname
 
 
  
-  Sets variable varname to an integer value calculated
+  Sets variable varname to a value calculated
   from expression.
   The expression may contain integer constants such as 5432,
+  double constants such as 3.14159,
   references to variables :variablename,
   and expressions composed of unary (-) or binary operators
   (+, -, *, /,
@@ -809,7 +810,7 @@ pgbench  options  dbname
   Examples:
 
 \set ntellers 10 * :scale
-\set aid (1021 * :aid) % (10 * :scale) + 1
+\set aid (1021 * random(1, 10 * :scale)) % (10 * :scale) + 1
 
 

@@ -829,66 +830,35 @@ pgbench  options  dbname
  
 
  
-  By default, or when uniform is specified, all values in the
-  range are drawn with equal probability.  Specifying gaussian
-  or  exponential options modifies this behavior; each
-  requires a mandatory parameter which determines the precise shape of the
-  distribution.
- 
+  
+   
+
+ \setrandom n 1 10 or \setrandom n 1 10 uniform
+ is equivalent to \set n random(1, 10) and uses a uniform
+ distribution.
+
+   
 
- 
-  For a Gaussian distribution, the interval is mapped onto a standard
-  normal distribution (the classical bell-shaped Gaussian curve) truncated
-  at -parameter on the left and +parameter
-  on the right.
-  Values in the middle of the interval are more likely to be drawn.
-  To be precise, if PHI(x) is the cumulative distribution
-  function of the standard normal distribution, with mean mu
-  defined as (max + min) / 2.0, with
-
- f(x) = PHI(2.0 * parameter * (x - mu) / (max - min + 1)) /
-(2.0 * PHI(parameter) - 1.0)
-
-  then value i between min and
-  max inclusive is drawn with probability:
-  f(i + 0.5) - f(i - 0.5).
-  Intuitively, the larger parameter, the more
-  frequently values close to the middle of the interval are drawn, and the
-  less frequently values close to the min and
-  max bounds. About 67% of values are drawn from the
-  middle 1.0 / parameter, that is a relative
-  0.5 / parameter around the mean, and 95% in the middle
-  2.0 / parameter, that is a relative
-  1.0 / parameter around the mean; for instance, if
-  parameter is 4.0, 67% of values are drawn from the
-  middle quarter (1.0 / 4.0) of the interval (i.e. from
-  3.0 / 8.0 to 5.0 / 8.0) and 95% from
-  the middle half (2.0 / 4.0) of the interval (second and
-  third quartiles). The minimum parameter is 2.0 for
-  performance of the Box-Muller transform.
- 
+  
+   
+\setrandom n 1 10 exponential 3.0 is equivalent to
+\set n random_exponential(1, 10, 3.0) and uses an
+exponential distribution.
+   
+  
 
- 
-  For an exponential distribution, parameter
-  controls the distribution by truncating a quickly-decreasing
-  exponential distribution at parameter, and then
-  projecting onto integers between the bounds.
-  To be precise, with
-
-f(x) = exp(-parameter * (x - min) / (max - min + 1)) / (1.0 - exp(-parameter))
-
-  Then value i between min and
-  max inclusive is drawn with probability:
-  f(x) - f(x + 1).
-  Intuitively, the larger parameter, the more
-  frequently values close to min are accessed, and the
-  less frequently values close to max are accessed.
-  The closer to 0 parameter, the flatter (more uniform)
-  the access distribution.
-  A crude approximation of the distribution is that the most frequent 1%
-  values in the range, close to min, are drawn
-  parameter% o

Re: [HACKERS] extend pgbench expressions with functions

2016-03-08 Thread Robert Haas

On Tue, Mar 8, 2016 at 11:48 AM, Fabien COELHO  wrote:
> If this is a blocker I'll do them, but I'm convince it is not what should be
> done.

Well, I think it's a blocker.  Exiting within deeply-nested code
instead of propagating the error upward does not strike me as a good
practice.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] extend pgbench expressions with functions

2016-03-08 Thread Fabien COELHO



Hello Robert,


Having a look at 33-b, this looks pretty good now, but:

// comments are not allowed.  I'd just remove the two you have.


Oops, C89 did not make it everywhere yet!


It make no sense to exit(1) and then return 0, so don't do that. [...]
This would get rid of the internal-error case here altogether in favor
of testing it via an assertion.


Ok. Fine with me.


I think that coerceToInt() should not exit(1) when an overflow occurs;
instead, I think the function should be declared as bool
coerceToInt(PgBenchValue *pval, int64 *result), and the error return
should be propagated back to the caller, which can then return false
as we do for other error cases.


This is the pain (ugly repeated code) that I wish to avoid, because then 
we cannot write a simple addition for doing an addition, but have to do 
several ugly tests instead. Ok, elegance is probably not a sufficient 
argument against doing that.


Moreover, I do not see any condition in which doing what you suggest makes 
much sense from the user perspective: if there is an internal error in the 
bench code it seems much more logical to ask for the user for a sensible 
bench script, because I would not know how to interpret tps on a bench 
with internal failures in the client code anyway.


For me, exiting immediatly on internal script errors is the right action.

If this is a blocker I'll do them, but I'm convince it is not what should 
be done.



I think that some of the places you've used coerceToInt() are not
appropriate.  Like, for example:

+   if
(coerceToInt(rval) == 0)




+
fprintf(stderr, "division by zero\n");
+   return false;
+   }

Now, if rval is out of range of an integer, that is going to overflow
while trying to see whether it should divide by zero.  Please work a
little harder here and in similar cases.


Ok.


 Maybe add a helper function
checkIntegerEquality(PgBenchValue *, int64).


Maybe.

--
Fabien.


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] extend pgbench expressions with functions

2016-03-08 Thread Robert Haas

On Mon, Mar 7, 2016 at 4:58 AM, Fabien COELHO  wrote:
>> - 32-b: add double functions, including double variables
>> - 32-c: remove \setrandom support (advice to use \set + random instead)
>
> Here is a rebased version after Tom's updates, 33-b & 33-c. I also extended
> the floating point syntax to signed accept signed exponents, and changed the
> regexpr style to match Toms changes.

Having a look at 33-b, this looks pretty good now, but:

// comments are not allowed.  I'd just remove the two you have.

It make no sense to exit(1) and then return 0, so don't do that.  I
might write this code as:

if (pval->type == PGBT_INT)
   return pval->u.ival;

Assert(pval->type == PGBT_DOUBLE);
/* do double stuff */

This would get rid of the internal-error case here altogether in favor
of testing it via an assertion.

I think that coerceToInt() should not exit(1) when an overflow occurs;
instead, I think the function should be declared as bool
coerceToInt(PgBenchValue *pval, int64 *result), and the error return
should be propagated back to the caller, which can then return false
as we do for other error cases.

I think that some of the places you've used coerceToInt() are not
appropriate.  Like, for example:

+   if
(coerceToInt(rval) == 0)
+   {
+
fprintf(stderr, "division by zero\n");
+   return false;
+   }

Now, if rval is out of range of an integer, that is going to overflow
while trying to see whether it should divide by zero.  Please work a
little harder here and in similar cases.  Maybe add a helper function
checkIntegerEquality(PgBenchValue *, int64).

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] extend pgbench expressions with functions

2016-03-07 Thread Fabien COELHO




- 32-b: add double functions, including double variables
- 32-c: remove \setrandom support (advice to use \set + random instead)


Here is a rebased version after Tom's updates, 33-b & 33-c. I also 
extended the floating point syntax to signed accept signed exponents, and 
changed the regexpr style to match Toms changes.


--
Fabien.diff --git a/doc/src/sgml/ref/pgbench.sgml b/doc/src/sgml/ref/pgbench.sgml
index cc80b3f..2133bf7 100644
--- a/doc/src/sgml/ref/pgbench.sgml
+++ b/doc/src/sgml/ref/pgbench.sgml
@@ -794,9 +794,10 @@ pgbench  options  dbname
 
 
  
-  Sets variable varname to an integer value calculated
+  Sets variable varname to a value calculated
   from expression.
   The expression may contain integer constants such as 5432,
+  double constants such as 3.14159,
   references to variables :variablename,
   and expressions composed of unary (-) or binary operators
   (+, -, *, /,
@@ -809,7 +810,7 @@ pgbench  options  dbname
   Examples:
 
 \set ntellers 10 * :scale
-\set aid (1021 * :aid) % (10 * :scale) + 1
+\set aid (1021 * random(1, 10 * :scale)) % (10 * :scale) + 1
 
 

@@ -829,66 +830,35 @@ pgbench  options  dbname
  
 
  
-  By default, or when uniform is specified, all values in the
-  range are drawn with equal probability.  Specifying gaussian
-  or  exponential options modifies this behavior; each
-  requires a mandatory parameter which determines the precise shape of the
-  distribution.
- 
+  
+   
+
+ \setrandom n 1 10 or \setrandom n 1 10 uniform
+ is equivalent to \set n random(1, 10) and uses a uniform
+ distribution.
+
+   
 
- 
-  For a Gaussian distribution, the interval is mapped onto a standard
-  normal distribution (the classical bell-shaped Gaussian curve) truncated
-  at -parameter on the left and +parameter
-  on the right.
-  Values in the middle of the interval are more likely to be drawn.
-  To be precise, if PHI(x) is the cumulative distribution
-  function of the standard normal distribution, with mean mu
-  defined as (max + min) / 2.0, with
-
- f(x) = PHI(2.0 * parameter * (x - mu) / (max - min + 1)) /
-(2.0 * PHI(parameter) - 1.0)
-
-  then value i between min and
-  max inclusive is drawn with probability:
-  f(i + 0.5) - f(i - 0.5).
-  Intuitively, the larger parameter, the more
-  frequently values close to the middle of the interval are drawn, and the
-  less frequently values close to the min and
-  max bounds. About 67% of values are drawn from the
-  middle 1.0 / parameter, that is a relative
-  0.5 / parameter around the mean, and 95% in the middle
-  2.0 / parameter, that is a relative
-  1.0 / parameter around the mean; for instance, if
-  parameter is 4.0, 67% of values are drawn from the
-  middle quarter (1.0 / 4.0) of the interval (i.e. from
-  3.0 / 8.0 to 5.0 / 8.0) and 95% from
-  the middle half (2.0 / 4.0) of the interval (second and
-  third quartiles). The minimum parameter is 2.0 for
-  performance of the Box-Muller transform.
- 
+  
+   
+\setrandom n 1 10 exponential 3.0 is equivalent to
+\set n random_exponential(1, 10, 3.0) and uses an
+exponential distribution.
+   
+  
 
- 
-  For an exponential distribution, parameter
-  controls the distribution by truncating a quickly-decreasing
-  exponential distribution at parameter, and then
-  projecting onto integers between the bounds.
-  To be precise, with
-
-f(x) = exp(-parameter * (x - min) / (max - min + 1)) / (1.0 - exp(-parameter))
-
-  Then value i between min and
-  max inclusive is drawn with probability:
-  f(x) - f(x + 1).
-  Intuitively, the larger parameter, the more
-  frequently values close to min are accessed, and the
-  less frequently values close to max are accessed.
-  The closer to 0 parameter, the flatter (more uniform)
-  the access distribution.
-  A crude approximation of the distribution is that the most frequent 1%
-  values in the range, close to min, are drawn
-  parameter% of the time.
-  parameter value must be strictly positive.
+  
+   
+\setrandom n 1 10 gaussian 2.0 is equivalent to
+\set n random_gaussian(1, 10, 2.0), and uses a gaussian
+distribution.
+   
+  
+ 
+
+   See the documentation of these functions below for further information
+   about the precise shape of these distributions, depending on the value
+   of the parameter.
  
 
  
@@ -967,34 +937,6 @@ f(x) = exp(-parameter * (x - min) / (max - min + 1)) / (1.0 - exp(-parameter))
 

   
-
-  
-   As an example, the full definition of the built-in TPC-B-like
-   transaction is:
-
-
-\set nbranches :scale
-\set ntellers 10 * :scale
-\set naccounts 100

Re: [HACKERS] extend pgbench expressions with functions

2016-03-04 Thread Fabien COELHO




Attached is the fixed patch for the array method.


Committed with a few tweaks, including running pgindent over some of it.


Thanks. So the first set of functions is in, and the operators are
executed as functions as well. Fabien, are you planning to send
rebased versions of the rest? By that I mean the switch of the
existing subcommands into equivalent functions and the handling of
double values as parameters for those functions.


Here they are:
 - 32-b: add double functions, including double variables
 - 32-c: remove \setrandom support (advice to use \set + random instead)

--
Fabien.diff --git a/doc/src/sgml/ref/pgbench.sgml b/doc/src/sgml/ref/pgbench.sgml
index cc80b3f..2133bf7 100644
--- a/doc/src/sgml/ref/pgbench.sgml
+++ b/doc/src/sgml/ref/pgbench.sgml
@@ -794,9 +794,10 @@ pgbench  options  dbname
 
 
  
-  Sets variable varname to an integer value calculated
+  Sets variable varname to a value calculated
   from expression.
   The expression may contain integer constants such as 5432,
+  double constants such as 3.14159,
   references to variables :variablename,
   and expressions composed of unary (-) or binary operators
   (+, -, *, /,
@@ -809,7 +810,7 @@ pgbench  options  dbname
   Examples:
 
 \set ntellers 10 * :scale
-\set aid (1021 * :aid) % (10 * :scale) + 1
+\set aid (1021 * random(1, 10 * :scale)) % (10 * :scale) + 1
 
 

@@ -829,66 +830,35 @@ pgbench  options  dbname
  
 
  
-  By default, or when uniform is specified, all values in the
-  range are drawn with equal probability.  Specifying gaussian
-  or  exponential options modifies this behavior; each
-  requires a mandatory parameter which determines the precise shape of the
-  distribution.
- 
+  
+   
+
+ \setrandom n 1 10 or \setrandom n 1 10 uniform
+ is equivalent to \set n random(1, 10) and uses a uniform
+ distribution.
+
+   
 
- 
-  For a Gaussian distribution, the interval is mapped onto a standard
-  normal distribution (the classical bell-shaped Gaussian curve) truncated
-  at -parameter on the left and +parameter
-  on the right.
-  Values in the middle of the interval are more likely to be drawn.
-  To be precise, if PHI(x) is the cumulative distribution
-  function of the standard normal distribution, with mean mu
-  defined as (max + min) / 2.0, with
-
- f(x) = PHI(2.0 * parameter * (x - mu) / (max - min + 1)) /
-(2.0 * PHI(parameter) - 1.0)
-
-  then value i between min and
-  max inclusive is drawn with probability:
-  f(i + 0.5) - f(i - 0.5).
-  Intuitively, the larger parameter, the more
-  frequently values close to the middle of the interval are drawn, and the
-  less frequently values close to the min and
-  max bounds. About 67% of values are drawn from the
-  middle 1.0 / parameter, that is a relative
-  0.5 / parameter around the mean, and 95% in the middle
-  2.0 / parameter, that is a relative
-  1.0 / parameter around the mean; for instance, if
-  parameter is 4.0, 67% of values are drawn from the
-  middle quarter (1.0 / 4.0) of the interval (i.e. from
-  3.0 / 8.0 to 5.0 / 8.0) and 95% from
-  the middle half (2.0 / 4.0) of the interval (second and
-  third quartiles). The minimum parameter is 2.0 for
-  performance of the Box-Muller transform.
- 
+  
+   
+\setrandom n 1 10 exponential 3.0 is equivalent to
+\set n random_exponential(1, 10, 3.0) and uses an
+exponential distribution.
+   
+  
 
- 
-  For an exponential distribution, parameter
-  controls the distribution by truncating a quickly-decreasing
-  exponential distribution at parameter, and then
-  projecting onto integers between the bounds.
-  To be precise, with
-
-f(x) = exp(-parameter * (x - min) / (max - min + 1)) / (1.0 - exp(-parameter))
-
-  Then value i between min and
-  max inclusive is drawn with probability:
-  f(x) - f(x + 1).
-  Intuitively, the larger parameter, the more
-  frequently values close to min are accessed, and the
-  less frequently values close to max are accessed.
-  The closer to 0 parameter, the flatter (more uniform)
-  the access distribution.
-  A crude approximation of the distribution is that the most frequent 1%
-  values in the range, close to min, are drawn
-  parameter% of the time.
-  parameter value must be strictly positive.
+  
+   
+\setrandom n 1 10 gaussian 2.0 is equivalent to
+\set n random_gaussian(1, 10, 2.0), and uses a gaussian
+distribution.
+   
+  
+ 
+
+   See the documentation of these functions below for further information
+   about the precise shape of these distributions, depending on the value
+   of the parameter.
  
 
  
@@ -967,34 +937,6 @

Re: [HACKERS] extend pgbench expressions with functions

2016-03-04 Thread Fabien COELHO




Committed with a few tweaks, including running pgindent over some of it.


Thanks!

--
Fabien.


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] extend pgbench expressions with functions

2016-03-02 Thread Michael Paquier

On Wed, Mar 2, 2016 at 3:10 AM, Robert Haas  wrote:
> On Wed, Feb 17, 2016 at 3:22 AM, Fabien COELHO  wrote:
>> Indeed. My gcc 4.8.4 with --Wall does not show the warning, too bad.
>>
>> Attached is the fixed patch for the array method.
>
> Committed with a few tweaks, including running pgindent over some of it.

Thanks. So the first set of functions is in, and the operators are
executed as functions as well. Fabien, are you planning to send
rebased versions of the rest? By that I mean the switch of the
existing subcommands into equivalent functions and the handling of
double values as parameters for those functions.
-- 
Michael


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] extend pgbench expressions with functions

2016-03-01 Thread Robert Haas

On Wed, Feb 17, 2016 at 3:22 AM, Fabien COELHO  wrote:
> Indeed. My gcc 4.8.4 with --Wall does not show the warning, too bad.
>
> Attached is the fixed patch for the array method.

Committed with a few tweaks, including running pgindent over some of it.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] extend pgbench expressions with functions

2016-02-17 Thread Michael Paquier

On Wed, Feb 17, 2016 at 5:22 PM, Fabien COELHO  wrote:
>> \set aid 1 + 1
>> pgbench -f addition.sql -t 5000
>>
>> I have the following:
>> HEAD: 3.5~3.7M TPS
>> list method: 3.6~3.7M TPS
>> array method: 3.4~3.5M TPS
>> So all approaches have a comparable performance.
>
> Yep, the execution trace is pretty similar in all cases, maybe with a little
> more work for the array method, although I'm surprise that the difference is
> discernable.

Nah, that's mostly noise I think. My conclusion is that all approaches
are rather equivalent for the operators.
-- 
Michael


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] extend pgbench expressions with functions

2016-02-17 Thread Fabien COELHO



Hello Michaël,


\set aid 1 + 1
pgbench -f addition.sql -t 5000

I have the following:
HEAD: 3.5~3.7M TPS
list method: 3.6~3.7M TPS
array method: 3.4~3.5M TPS
So all approaches have a comparable performance.


Yep, the execution trace is pretty similar in all cases, maybe with a 
little more work for the array method, although I'm surprise that the 
difference is discernable.



Btw, patch 2 is returning a warning for me:
It is trying to compare a 32b integer with an int64 value, evalFunc
needed an int64.


Indeed. My gcc 4.8.4 with --Wall does not show the warning, too bad.

Attached is the fixed patch for the array method.

--
Fabiendiff --git a/doc/src/sgml/ref/pgbench.sgml b/doc/src/sgml/ref/pgbench.sgml
index ade1b53..f39f341 100644
--- a/doc/src/sgml/ref/pgbench.sgml
+++ b/doc/src/sgml/ref/pgbench.sgml
@@ -786,7 +786,7 @@ pgbench  options  dbname
   
 
   
-   
+   
 
  \set varname expression
 
@@ -798,8 +798,10 @@ pgbench  options  dbname
   The expression may contain integer constants such as 5432,
   references to variables :variablename,
   and expressions composed of unary (-) or binary operators
-  (+, -, *, /, %)
-  with their usual associativity, and parentheses.
+  (+, -, *, /,
+  %) with their usual associativity,
+  function calls, and
+  parentheses.
  
 
  
@@ -994,6 +996,62 @@ END;
 
  
 
+ 
+  Built-In Functions
+
+   
+ The following functions are built into pgbench and
+ may be used in conjunction with
+ \set.
+   
+
+   
+   
+pgbench Functions
+
+ 
+  
+   Function
+   Return Type
+   Description
+   Example
+   Result
+  
+ 
+ 
+  
+   abs(a)
+   same as a
+   integer value
+   abs(-17)
+   17
+  
+  
+   debug(a)
+   same as a 
+   print to stderr the given argument
+   debug(5432)
+   5432
+  
+  
+   max(i [, ... ] )
+   integer
+   maximum value
+   max(5, 4, 3, 2)
+   5
+  
+  
+   min(i [, ... ] )
+   integer
+   minimum value
+   min(5, 4, 3, 2)
+   2
+  
+ 
+ 
+   
+ 
+
  
   Per-Transaction Logging
 
diff --git a/src/bin/pgbench/exprparse.y b/src/bin/pgbench/exprparse.y
index 06ee04b..cac4d5e 100644
--- a/src/bin/pgbench/exprparse.y
+++ b/src/bin/pgbench/exprparse.y
@@ -16,10 +16,13 @@
 
 PgBenchExpr *expr_parse_result;
 
+static PgBenchExprList *make_elist(PgBenchExpr *exp, PgBenchExprList *list);
 static PgBenchExpr *make_integer_constant(int64 ival);
 static PgBenchExpr *make_variable(char *varname);
-static PgBenchExpr *make_op(char operator, PgBenchExpr *lexpr,
+static PgBenchExpr *make_op(const char *operator, PgBenchExpr *lexpr,
 		PgBenchExpr *rexpr);
+static int find_func(const char *fname);
+static PgBenchExpr *make_func(const int fnumber, PgBenchExprList *args);
 
 %}
 
@@ -31,13 +34,15 @@ static PgBenchExpr *make_op(char operator, PgBenchExpr *lexpr,
 	int64		ival;
 	char	   *str;
 	PgBenchExpr *expr;
+	PgBenchExprList *elist;
 }
 
+%type  elist
 %type  expr
-%type  INTEGER
-%type  VARIABLE
+%type  INTEGER function
+%type  VARIABLE FUNCTION
 
-%token INTEGER VARIABLE
+%token INTEGER VARIABLE FUNCTION
 %token CHAR_ERROR /* never used, will raise a syntax error */
 
 /* Precedence: lowest to highest */
@@ -49,16 +54,25 @@ static PgBenchExpr *make_op(char operator, PgBenchExpr *lexpr,
 
 result: expr{ expr_parse_result = $1; }
 
+elist:  	{ $$ = NULL; }
+	| expr 	{ $$ = make_elist($1, NULL); }
+	| elist ',' expr		{ $$ = make_elist($3, $1); }
+	;
+
 expr: '(' expr ')'			{ $$ = $2; }
 	| '+' expr %prec UMINUS	{ $$ = $2; }
-	| '-' expr %prec UMINUS	{ $$ = make_op('-', make_integer_constant(0), $2); }
-	| expr '+' expr			{ $$ = make_op('+', $1, $3); }
-	| expr '-' expr			{ $$ = make_op('-', $1, $3); }
-	| expr '*' expr			{ $$ = make_op('*', $1, $3); }
-	| expr '/' expr			{ $$ = make_op('/', $1, $3); }
-	| expr '%' expr			{ $$ = make_op('%', $1, $3); }
+	| '-' expr %prec UMINUS	{ $$ = make_op("-", make_integer_constant(0), $2); }
+	| expr '+' expr			{ $$ = make_op("+", $1, $3); }
+	| expr '-' expr			{ $$ = make_op("-", $1, $3); }
+	| expr '*' expr			{ $$ = make_op("*", $1, $3); }
+	| expr '/' expr			{ $$ = make_op("/", $1, $3); }
+	| expr '%' expr			{ $$ = make_op("%", $1, $3); }
 	| INTEGER{ $$ = make_integer_constant($1); }
 	| VARIABLE { $$ = make_variable($1); }
+	| function '(' elist ')'{ $$ = make_func($1, $3); }
+	;
+
+function: FUNCTION			{ $$ = find_func($1); pg_free($1); }
 	;
 
 %%
@@ -84,14 +98,131 @@ make_variable(char *varname)
 }
 
 static PgBenchExpr *
-make_op(char operator, PgBenchExpr *lexpr, PgBenchExpr *rexpr)
+make_op(const char *operator, PgBenchExpr *lexpr, PgBenchExpr *rexpr)
+{
+	return make_func(find_func(operator),
+	 make_elist(rexpr, make_elist(lexpr, NULL)));
+}
+
+/*
+ * List of available functions:
+ * - fname: function name
+ * - nargs: number of arguments
+ *

Re: [HACKERS] extend pgbench expressions with functions

2016-02-16 Thread Michael Paquier

On Tue, Feb 16, 2016 at 11:12 PM, Fabien COELHO  wrote:
>
> Hello Robert,
>
>>> However, for obvious reasons the committer opinion prevails:-)
>>
>>
>> You're welcome to solicit other opinions.  I'm not unwilling to give
>> way if there's a chorus of support for the way you've done it.
>
>
> Hmmm. I do not expect much chorus on such a minor subject:-)
>
>> But to me it sounds like you're saying that it doesn't really matter
>> whether the system is scalable and maintainable because we only have 5
>> functions right now, which is a design philosophy with which I just don't
>> agree.
>
>
> The design does not aim at scalability but at simplicity, otherwise I would
> have done things quite differently: with many functions the "switch" based
> selection does not scale anyway.
>
> Anyway, attached are two patches, one for each approach.
>
> The array (second patch) is not too bad if one agrees with a maximum number
> of arguments, and also as I did not change the list structure coming from
> the parser, so it does not need to manage the number of arguments in the
> function structure. The code for min/max is also simpler when dealing with
> an array instead of a linked list. I do not like much array references in
> the code, so I tried to avoid them by using lval/rval scalars for operator
> operands.
>
> Please choose the one you prefer, and I'll adapt the remaining stuff to your
> choice.

For those two patches and HEAD, it is possible to do a direct
performance comparison for simple operator functions, as those are
being switched to behave as functions. So doing the following for 50M
"transactions" on my laptop:
\set aid 1 + 1
pgbench -f addition.sql -t 5000

I have the following:
HEAD: 3.5~3.7M TPS
list method: 3.6~3.7M TPS
array method: 3.4~3.5M TPS
So all approaches have a comparable performance.

Btw, patch 2 is returning a warning for me:
pgbench.c:1060:16: warning: comparison of constant
-9223372036854775808 with expression of type 'int' is always false
[-Wtautological-constant-out-of-range-compare]
if (lval == PG_INT64_MIN)
It is trying to compare a 32b integer with an int64 value, evalFunc
needed an int64.
-- 
Michael


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] extend pgbench expressions with functions

2016-02-16 Thread Fabien COELHO



Hello Robert,


However, for obvious reasons the committer opinion prevails:-)


You're welcome to solicit other opinions.  I'm not unwilling to give
way if there's a chorus of support for the way you've done it.


Hmmm. I do not expect much chorus on such a minor subject:-)

But to me it sounds like you're saying that it doesn't really matter 
whether the system is scalable and maintainable because we only have 5 
functions right now, which is a design philosophy with which I just 
don't agree.


The design does not aim at scalability but at simplicity, otherwise I 
would have done things quite differently: with many functions the "switch" 
based selection does not scale anyway.


Anyway, attached are two patches, one for each approach.

The array (second patch) is not too bad if one agrees with a maximum 
number of arguments, and also as I did not change the list structure 
coming from the parser, so it does not need to manage the number of 
arguments in the function structure. The code for min/max is also simpler 
when dealing with an array instead of a linked list. I do not like much 
array references in the code, so I tried to avoid them by using lval/rval 
scalars for operator operands.


Please choose the one you prefer, and I'll adapt the remaining stuff to 
your choice.


--
Fabien.diff --git a/doc/src/sgml/ref/pgbench.sgml b/doc/src/sgml/ref/pgbench.sgml
index ade1b53..f39f341 100644
--- a/doc/src/sgml/ref/pgbench.sgml
+++ b/doc/src/sgml/ref/pgbench.sgml
@@ -786,7 +786,7 @@ pgbench  options  dbname
   
 
   
-   
+   
 
  \set varname expression
 
@@ -798,8 +798,10 @@ pgbench  options  dbname
   The expression may contain integer constants such as 5432,
   references to variables :variablename,
   and expressions composed of unary (-) or binary operators
-  (+, -, *, /, %)
-  with their usual associativity, and parentheses.
+  (+, -, *, /,
+  %) with their usual associativity,
+  function calls, and
+  parentheses.
  
 
  
@@ -994,6 +996,62 @@ END;
 
  
 
+ 
+  Built-In Functions
+
+   
+ The following functions are built into pgbench and
+ may be used in conjunction with
+ \set.
+   
+
+   
+   
+pgbench Functions
+
+ 
+  
+   Function
+   Return Type
+   Description
+   Example
+   Result
+  
+ 
+ 
+  
+   abs(a)
+   same as a
+   integer value
+   abs(-17)
+   17
+  
+  
+   debug(a)
+   same as a 
+   print to stderr the given argument
+   debug(5432)
+   5432
+  
+  
+   max(i [, ... ] )
+   integer
+   maximum value
+   max(5, 4, 3, 2)
+   5
+  
+  
+   min(i [, ... ] )
+   integer
+   minimum value
+   min(5, 4, 3, 2)
+   2
+  
+ 
+ 
+   
+ 
+
  
   Per-Transaction Logging
 
diff --git a/src/bin/pgbench/exprparse.y b/src/bin/pgbench/exprparse.y
index 06ee04b..cac4d5e 100644
--- a/src/bin/pgbench/exprparse.y
+++ b/src/bin/pgbench/exprparse.y
@@ -16,10 +16,13 @@
 
 PgBenchExpr *expr_parse_result;
 
+static PgBenchExprList *make_elist(PgBenchExpr *exp, PgBenchExprList *list);
 static PgBenchExpr *make_integer_constant(int64 ival);
 static PgBenchExpr *make_variable(char *varname);
-static PgBenchExpr *make_op(char operator, PgBenchExpr *lexpr,
+static PgBenchExpr *make_op(const char *operator, PgBenchExpr *lexpr,
 		PgBenchExpr *rexpr);
+static int find_func(const char *fname);
+static PgBenchExpr *make_func(const int fnumber, PgBenchExprList *args);
 
 %}
 
@@ -31,13 +34,15 @@ static PgBenchExpr *make_op(char operator, PgBenchExpr *lexpr,
 	int64		ival;
 	char	   *str;
 	PgBenchExpr *expr;
+	PgBenchExprList *elist;
 }
 
+%type  elist
 %type  expr
-%type  INTEGER
-%type  VARIABLE
+%type  INTEGER function
+%type  VARIABLE FUNCTION
 
-%token INTEGER VARIABLE
+%token INTEGER VARIABLE FUNCTION
 %token CHAR_ERROR /* never used, will raise a syntax error */
 
 /* Precedence: lowest to highest */
@@ -49,16 +54,25 @@ static PgBenchExpr *make_op(char operator, PgBenchExpr *lexpr,
 
 result: expr{ expr_parse_result = $1; }
 
+elist:  	{ $$ = NULL; }
+	| expr 	{ $$ = make_elist($1, NULL); }
+	| elist ',' expr		{ $$ = make_elist($3, $1); }
+	;
+
 expr: '(' expr ')'			{ $$ = $2; }
 	| '+' expr %prec UMINUS	{ $$ = $2; }
-	| '-' expr %prec UMINUS	{ $$ = make_op('-', make_integer_constant(0), $2); }
-	| expr '+' expr			{ $$ = make_op('+', $1, $3); }
-	| expr '-' expr			{ $$ = make_op('-', $1, $3); }
-	| expr '*' expr			{ $$ = make_op('*', $1, $3); }
-	| expr '/' expr			{ $$ = make_op('/', $1, $3); }
-	| expr '%' expr			{ $$ = make_op('%', $1, $3); }
+	| '-' expr %prec UMINUS	{ $$ = make_op("-", make_integer_constant(0), $2); }
+	| expr '+' expr			{ $$ = make_op("+", $1, $3); }
+	| expr '-' expr			{ $$ = make_op("-", $1, $3); }
+	| expr '*' expr			{ $$ = make_op("*", $1, $3); }
+	| expr '/' expr			{ $$ = make_op("/", $1, $3); }
+	| expr '%' expr			{ $$

Re: [HACKERS] extend pgbench expressions with functions

2016-02-16 Thread Robert Haas

On Tue, Feb 16, 2016 at 5:18 AM, Fabien COELHO  wrote:
>>> Good point. One simple idea here would be to use a custom pgbench
>>> script that has no SQL commands and just calculates the values of some
>>> parameters to measure the impact without depending on the backend,
>>> with a fixed number of transactions.
>>
>> Sure, we could do that.  But whether it materially changes pgbench -S
>> results, say, is a lot more important.
>
>
> Indeed. Several runs on my laptop:
>
>   ~ 40-54 tps with master using:
> \set naccounts 10 * :scale
> \setrandom aid 1 :naccounts
>
>   ~ 43-53 tps with full function patch using:
> \set naccounts 10 * :scale
> \setrandom aid 1 :naccounts
>
>   ~ 73-89 tps with full function patch using:
> \set aid random(1, 10 * :scale)
>
> The performance is pretty similar on the same script. The real pain is
> variable management, avoiding some is a win.

Wow, that's pretty nice.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] extend pgbench expressions with functions

2016-02-16 Thread Robert Haas

On Tue, Feb 16, 2016 at 3:53 AM, Fabien COELHO  wrote:
> My opinion is that the submitted version is simple and fine for the purpose,
> and the plan you suggest replaces 5*2 repetitions by a lot of code and
> complexity, which is not desirable and should be avoided.
>
> However, for obvious reasons the committer opinion prevails:-)

You're welcome to solicit other opinions.  I'm not unwilling to give
way if there's a chorus of support for the way you've done it.  But to
me it sounds like you're saying that it doesn't really matter whether
the system is scalable and maintainable because we only have 5
functions right now, which is a design philosophy with which I just
don't agree.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] extend pgbench expressions with functions

2016-02-16 Thread Fabien COELHO



Hello Robert,


Good point. One simple idea here would be to use a custom pgbench
script that has no SQL commands and just calculates the values of some
parameters to measure the impact without depending on the backend,
with a fixed number of transactions.


Sure, we could do that.  But whether it materially changes pgbench -S
results, say, is a lot more important.


Indeed. Several runs on my laptop:

  ~ 40-54 tps with master using:
\set naccounts 10 * :scale
\setrandom aid 1 :naccounts

  ~ 43-53 tps with full function patch using:
\set naccounts 10 * :scale
\setrandom aid 1 :naccounts

  ~ 73-89 tps with full function patch using:
\set aid random(1, 10 * :scale)

The performance is pretty similar on the same script. The real pain is 
variable management, avoiding some is a win.


However, as you suggest, the tps impact even with -M prepared -S is 
nought, because the internal scripting time in pgbench is much smaller 
than the time to do actual connecting and querying.


--
Fabien.


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] extend pgbench expressions with functions

2016-02-16 Thread Fabien COELHO



Hello Robert,

[...] But we can't have things that are logically part of patch #2 just 
tossed in with patch #1.


So you want integer functions without type in patch #1.


I was in the middle of ripping that out of the patch when I realized
that evalFunc() is pretty badly designed.


Probably, we are just at v30:-)

What you've done is made it the job of each particular function to 
evaluate its arguments.


Yep.

I did that for the multiple issues you raise below, and some you did not 
yet foresee: handling a variable number of arguments (0, 1, 2, 3, n), 
avoiding dynamic structures or arbitrary limitations, checking for a valid 
number of arguments, and also the fact that the evaluation call was not 
too horrible (2 lines per evaluation, factored out by groups of functions 
[operators, min/max, randoms, ...], it is not fully repeated).


There are 5 sub-expression evaluation in the function, totalling 10 LOCs.

TEN.


I don't think that's a good plan.


The one you suggest does not strike me as intrinsically better: it is a 
trade between some code ugliness (5 repeated evals = 10 lines, a little 
more with the double functions, probably 20 lines) to other uglinesses 
(number of args limit or dynamic allocation, array length to manage and 
test somehow, list to array conversion code, things that will mean far 
more than the few lines of repeated code under discussion).


So I think that it is just a choice between two plans, really, the better 
of which is debatable.



I experimented with trying to do this and ran into a problem:


Yep. There are other problems, all of which solvable obviously, but which 
means that a lot more that 10 lines will be added to avoid the 5*2 lines 
repetition.


My opinion is that the submitted version is simple and fine for the 
purpose, and the plan you suggest replaces 5*2 repetitions by a lot of 
code and complexity, which is not desirable and should be avoided.


However, for obvious reasons the committer opinion prevails:-)

After considering the various points I raised above, could you confirm 
that you do still require the implementation of this array approach before 
I spend time doing such a thing?



I also went over your documentation changes.


Thanks, this looks better.

--
Fabien.


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] extend pgbench expressions with functions

2016-02-16 Thread Robert Haas

On Tue, Feb 16, 2016 at 1:55 AM, Michael Paquier
 wrote:
> On Tue, Feb 16, 2016 at 9:18 AM, Robert Haas  wrote:
>> I experimented with trying to do this and ran into a problem: where
>> exactly would you store the evaluated arguments when you don't know
>> how many of them there will be?  And even if you did know how many of
>> them there will be, wouldn't that mean that evalFunc or evaluateExpr
>> would have to palloc a buffer of the correct size for each invocation?
>>  That's far more heavyweight than the current implementation, and
>> minimizing CPU usage inside pgbench is a concern.  It would be
>> interesting to do some pgbench runs with this patch, or the final
>> patch, and see what effect it has on the TPS numbers, if any, and I
>> think we should.  But the first concern is to minimize any negative
>> impact, so let's talk about how to do that.
>
> Good point. One simple idea here would be to use a custom pgbench
> script that has no SQL commands and just calculates the values of some
> parameters to measure the impact without depending on the backend,
> with a fixed number of transactions.

Sure, we could do that.  But whether it materially changes pgbench -S
results, say, is a lot more important.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] extend pgbench expressions with functions

2016-02-15 Thread Michael Paquier

On Tue, Feb 16, 2016 at 9:18 AM, Robert Haas  wrote:
> I experimented with trying to do this and ran into a problem: where
> exactly would you store the evaluated arguments when you don't know
> how many of them there will be?  And even if you did know how many of
> them there will be, wouldn't that mean that evalFunc or evaluateExpr
> would have to palloc a buffer of the correct size for each invocation?
>  That's far more heavyweight than the current implementation, and
> minimizing CPU usage inside pgbench is a concern.  It would be
> interesting to do some pgbench runs with this patch, or the final
> patch, and see what effect it has on the TPS numbers, if any, and I
> think we should.  But the first concern is to minimize any negative
> impact, so let's talk about how to do that.

Good point. One simple idea here would be to use a custom pgbench
script that has no SQL commands and just calculates the values of some
parameters to measure the impact without depending on the backend,
with a fixed number of transactions.
-- 
Michael


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] extend pgbench expressions with functions

2016-02-15 Thread Robert Haas

On Mon, Feb 15, 2016 at 4:58 AM, Fabien COELHO  wrote:
> Indeed!
>
>> Taking patch 1 as a completely independent thing, there is no need to
>> introduce PgBenchValueType yet. Similar remark for setIntValue and
>> coerceToInt. They are useful afterwards when introducing double types to be
>> able to handle double input parameters for the gaussian and other functions.
>
> Yes. This is exactly the pain I'm trying to avoid, creating a different
> implementation for the first patch, which is just overriden when the second
> part is applied...

Splitting a patch means breaking it into independently committable
sub-patches, not just throwing each line of the diff into a different
pile as best you can.  I'm with Michael: that part doesn't belong in
this patch.  If we want to have an infrastructure refactoring patch
that just replaces every relevant use of int64 with PgBenchValue, a
union supporting only integer values, then we can do that first and
have a later patch introduce double as a separate change.  But we
can't have things that are logically part of patch #2 just tossed in
with patch #1.

I was in the middle of ripping that out of the patch when I realized
that evalFunc() is pretty badly designed.  What you've done is made it
the job of each particular function to evaluate its arguments.  I
don't think that's a good plan.  I think that when we discover that
we've got a function, we should evaluate all of the arguments that
were passed to it using common code that is shared across all types of
functions and operators.  Then, the evaluated arguments should be
passed to the function-specific code, which can do as it likes with
them.  This way, you have less code that is specific to particular
operations and more common code, which is generally a good thing.
Every expression evaluation engine that I've ever heard of works this
way - see, for example, the PostgreSQL backend.

I experimented with trying to do this and ran into a problem: where
exactly would you store the evaluated arguments when you don't know
how many of them there will be?  And even if you did know how many of
them there will be, wouldn't that mean that evalFunc or evaluateExpr
would have to palloc a buffer of the correct size for each invocation?
 That's far more heavyweight than the current implementation, and
minimizing CPU usage inside pgbench is a concern.  It would be
interesting to do some pgbench runs with this patch, or the final
patch, and see what effect it has on the TPS numbers, if any, and I
think we should.  But the first concern is to minimize any negative
impact, so let's talk about how to do that.

I think we should limit the number of arguments to a function to, say,
16, so that an array of int64s or PgBenchValues long enough to hold an
entire argument list can be stack-allocated.  The backend's limit is
higher, but the only reason we need a value higher than 2 here is
because you've chosen to introduce variable-argument functions; but I
think 16 is enough for any practical purpose.  Then, I think we should
also change the parse tree representation so that transforms the
linked-list into an array stored within the PgBenchExpr, so that you
can access the expression for argument i as expr->u.arg[i].  Then we
can write this is a loop that evaluates each expression in an array of
expressions and stores the result in an array of values.  That seems
like it would be both cleaner and faster than what you've got here
right now.  It's also more similar to what you did with the function
name itself: the most trivial thing the parser could do is store a
pointer to the function or operator name, but that would be slow, so
instead it looks up the function and stores a constant.

I also went over your documentation changes.  I think you inserted the
new table smack dab in the middle of a section in a place that it
doesn't really belong.  The version attached makes it into its own
, puts it in a new section a bit further down so that it
doesn't break up the flow, and has a few other tweaks that I think are
improvements.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
diff --git a/doc/src/sgml/ref/pgbench.sgml b/doc/src/sgml/ref/pgbench.sgml
index ade1b53..f39f341 100644
--- a/doc/src/sgml/ref/pgbench.sgml
+++ b/doc/src/sgml/ref/pgbench.sgml
@@ -786,7 +786,7 @@ pgbench  options  dbname

-   
+   

  \set varname expression

@@ -798,8 +798,10 @@ pgbench  options  dbname
   The expression may contain integer constants such as 5432,
   references to variables :variablename,
   and expressions composed of unary (-) or binary operators
-  (+, -, *, /, %)
-  with their usual associativity, and parentheses.
+  (+, -, *, /,
+  %) with their usual associativity,
+  function calls, and
+  parentheses.

@@ -994,6 +996,62 @@ END;

+ 
+  Built-In Functions
+
+   
+ The following functions are built into pgbench and
+ may b

Re: [HACKERS] extend pgbench expressions with functions

2016-02-15 Thread Fabien COELHO



Hello Michaël,


+ * Recursive evaluation of int or double expressions
+ *
+ * Note that currently only integer variables are available, with values
+ * stored as text.
This comment is incorrect, we only care about integers in this patch.


Indeed!

Taking patch 1 as a completely independent thing, there is no need to 
introduce PgBenchValueType yet. Similar remark for setIntValue and 
coerceToInt. They are useful afterwards when introducing double types to 
be able to handle double input parameters for the gaussian and other 
functions.


Yes. This is exactly the pain I'm trying to avoid, creating a different 
implementation for the first patch, which is just overriden when the 
second part is applied...


So I'm trying to compromise, having a several part patch *but* having the 
infrastructure ready for the second patch which adds the double type.


Note that the first patch without the second is a loss of time for 
everyone, as the nearly only useful functions are the randoms, which 
require a double argument, so it does not make sense to apply the first 
one if the second one is not to be applied, I think.



[...]
(INT64_MIN / -1) should error.
(INT64_MIN % -1) should result in 0.
This is missing the division handling.


Oops, indeed I totally messed up when merging the handling of / and %:-(

I have found another issue in the (a) patch: the internal scripts were 
using the future random function which do not yet exist, as they are in 
patch (b).


Here is a three part v30, which still includes the infrastructure for 
future types in the a patch, see my argumentation above.


--
Fabien.diff --git a/doc/src/sgml/ref/pgbench.sgml b/doc/src/sgml/ref/pgbench.sgml
index ade1b53..9d5eb32 100644
--- a/doc/src/sgml/ref/pgbench.sgml
+++ b/doc/src/sgml/ref/pgbench.sgml
@@ -798,8 +798,11 @@ pgbench  options  dbname
   The expression may contain integer constants such as 5432,
   references to variables :variablename,
   and expressions composed of unary (-) or binary operators
-  (+, -, *, /, %)
-  with their usual associativity, and parentheses.
+  (+, -, *, /,
+  %) with their usual associativity, function calls and
+  parentheses.
+   shows the available
+  functions.
  
 
  
@@ -965,6 +968,52 @@ f(x) = exp(-parameter * (x - min) / (max - min + 1)) / (1.0 - exp(-parameter))

   
 
+   
+   
+PgBench Functions
+
+ 
+  
+   Function
+   Return Type
+   Description
+   Example
+   Result
+  
+ 
+ 
+  
+   abs(a)
+   same as a
+   integer value
+   abs(-17)
+   17
+  
+  
+   debug(a)
+   same asa 
+   print to stderr the given argument
+   debug(5432)
+   5432
+  
+  
+   max(i [, ... ] )
+   integer
+   maximum value
+   max(5, 4, 3, 2)
+   5
+  
+  
+   min(i [, ... ] )
+   integer
+   minimum value
+   min(5, 4, 3, 2)
+   2
+  
+ 
+ 
+   
+
   
As an example, the full definition of the built-in TPC-B-like
transaction is:
diff --git a/src/bin/pgbench/exprparse.y b/src/bin/pgbench/exprparse.y
index 06ee04b..93c6173 100644
--- a/src/bin/pgbench/exprparse.y
+++ b/src/bin/pgbench/exprparse.y
@@ -16,10 +16,13 @@
 
 PgBenchExpr *expr_parse_result;
 
+static PgBenchExprList *make_elist(PgBenchExpr *exp, PgBenchExprList *list);
 static PgBenchExpr *make_integer_constant(int64 ival);
 static PgBenchExpr *make_variable(char *varname);
-static PgBenchExpr *make_op(char operator, PgBenchExpr *lexpr,
+static PgBenchExpr *make_op(const char *operator, PgBenchExpr *lexpr,
 		PgBenchExpr *rexpr);
+static int find_func(const char *fname);
+static PgBenchExpr *make_func(const int fnumber, PgBenchExprList *args);
 
 %}
 
@@ -31,13 +34,15 @@ static PgBenchExpr *make_op(char operator, PgBenchExpr *lexpr,
 	int64		ival;
 	char	   *str;
 	PgBenchExpr *expr;
+	PgBenchExprList *elist;
 }
 
+%type  elist
 %type  expr
-%type  INTEGER
-%type  VARIABLE
+%type  INTEGER function
+%type  VARIABLE FUNCTION
 
-%token INTEGER VARIABLE
+%token INTEGER VARIABLE FUNCTION
 %token CHAR_ERROR /* never used, will raise a syntax error */
 
 /* Precedence: lowest to highest */
@@ -49,16 +54,25 @@ static PgBenchExpr *make_op(char operator, PgBenchExpr *lexpr,
 
 result: expr{ expr_parse_result = $1; }
 
+elist:  	{ $$ = NULL; }
+	| expr 	{ $$ = make_elist($1, NULL); }
+	| elist ',' expr		{ $$ = make_elist($3, $1); }
+	;
+
 expr: '(' expr ')'			{ $$ = $2; }
 	| '+' expr %prec UMINUS	{ $$ = $2; }
-	| '-' expr %prec UMINUS	{ $$ = make_op('-', make_integer_constant(0), $2); }
-	| expr '+' expr			{ $$ = make_op('+', $1, $3); }
-	| expr '-' expr			{ $$ = make_op('-', $1, $3); }
-	| expr '*' expr			{ $$ = make_op('*', $1, $3); }
-	| expr '/' expr			{ $$ = make_op('/', $1, $3); }
-	| expr '%' expr			{ $$ = make_op('%', $1, $3); }
+	| '-' expr %prec UMINUS	{ $$ = make_op("-", make_integer_constant(0), $2); }
+	| e

Re: [HACKERS] extend pgbench expressions with functions

2016-02-14 Thread Michael Paquier

On Mon, Feb 15, 2016 at 5:21 AM, Robert Haas  wrote:
> On Sun, Feb 14, 2016 at 11:28 AM, Fabien COELHO  wrote:
>> Here is a 3 part v29:
>>
>> a) add support for integer functions in pgbench, including turning
>>operators as functions, as well as some minimal infrastructure for
>>additional types (this allows to minimize the diffs with the next
>>patch which adds double).
>>
>> b) add support for doubles, including setting double variables.
>>Note that variable are not explicitely typed. Add some
>>double-related functions, most interesting of them for pgbench
>>are the randoms.
>>
>> c) remove \setrandom support (as thanks to part b \set x random(...) does
>>the same). Prints an error pointing to the replacement if \setrandom is
>>used in a pgbench script.
>
> Thanks, I'll review these as soon as I can get to it.

I got around to look at (a) in this set.

+   if ((PGBENCH_FUNCTIONS[fnumber].nargs >= 0 &&
+PGBENCH_FUNCTIONS[fnumber].nargs != elist_length(args)) ||
+   /* check at least one arg for min & max */
+   (PGBENCH_FUNCTIONS[fnumber].nargs == -1 &&
+elist_length(args) == 0))
+   expr_yyerror_more("unexpected number of arguments",
+ PGBENCH_FUNCTIONS[fnumber].fname);
We could split that into two parts, each one with a more precise error message:
- For functions that expect at least one argument: "at least one
argument was expected, none found".
- For functions that expect N arguments: "N arguments expected, but M found"

+   "\\set aid random(1, " CppAsString2(naccounts) " * :scale)\n"
"SELECT abalance FROM pgbench_accounts WHERE aid = :aid;\n"
}
 };

-
 /* Function prototypes */
Noise.

+ * Recursive evaluation of int or double expressions
+ *
+ * Note that currently only integer variables are available, with values
+ * stored as text.
This comment is incorrect, we only care about integers in this patch.

Taking patch 1 as a completely independent thing, there is no need to
introduce PgBenchValueType yet. Similar remark for setIntValue and
coerceToInt. They are useful afterwards when introducing double types
to be able to handle double input parameters for the gaussian and
other functions.

-   /*
-* INT64_MIN / -1 is problematic, since the result
-* can't be represented on a two's-complement machine.
-* Some machines produce INT64_MIN, some produce zero,
-* some throw an exception. We can dodge the problem
-* by recognizing that division by -1 is the same as
-* negation.
-*/
-   if (rval == -1)
+   if (coerceToInt(&rval) == -1)
{
-   *retval = -lval;
-
-   /* overflow check (needed for INT64_MIN) */
-   if (lval == PG_INT64_MIN)
-   {
-   fprintf(stderr, "bigint out of range\n");
-   return false;
-   }
+   setIntValue(retval, 0);
+   return true;
}
(INT64_MIN / -1) should error. (INT64_MIN % -1) should result in 0.
This is missing the division handling.
-- 
Michael


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] extend pgbench expressions with functions

2016-02-14 Thread Robert Haas

On Sun, Feb 14, 2016 at 11:28 AM, Fabien COELHO  wrote:
> Here is a 3 part v29:
>
> a) add support for integer functions in pgbench, including turning
>operators as functions, as well as some minimal infrastructure for
>additional types (this allows to minimize the diffs with the next
>patch which adds double).
>
> b) add support for doubles, including setting double variables.
>Note that variable are not explicitely typed. Add some
>double-related functions, most interesting of them for pgbench
>are the randoms.
>
> c) remove \setrandom support (as thanks to part b \set x random(...) does
>the same). Prints an error pointing to the replacement if \setrandom is
>used in a pgbench script.

Thanks, I'll review these as soon as I can get to it.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] extend pgbench expressions with functions

2016-02-14 Thread Fabien COELHO



Hello Michaël,


I'll be happy if you do the review of the resulting split.


OK, I am fine with this scenario as well. I have luckily done nothing yet.


Here is a 3 part v29:

a) add support for integer functions in pgbench, including turning
   operators as functions, as well as some minimal infrastructure for
   additional types (this allows to minimize the diffs with the next
   patch which adds double).

b) add support for doubles, including setting double variables.
   Note that variable are not explicitely typed. Add some
   double-related functions, most interesting of them for pgbench
   are the randoms.

c) remove \setrandom support (as thanks to part b \set x random(...) does
   the same). Prints an error pointing to the replacement if \setrandom is
   used in a pgbench script.

--
Fabien.diff --git a/doc/src/sgml/ref/pgbench.sgml b/doc/src/sgml/ref/pgbench.sgml
index ade1b53..9d5eb32 100644
--- a/doc/src/sgml/ref/pgbench.sgml
+++ b/doc/src/sgml/ref/pgbench.sgml
@@ -798,8 +798,11 @@ pgbench  options  dbname
   The expression may contain integer constants such as 5432,
   references to variables :variablename,
   and expressions composed of unary (-) or binary operators
-  (+, -, *, /, %)
-  with their usual associativity, and parentheses.
+  (+, -, *, /,
+  %) with their usual associativity, function calls and
+  parentheses.
+   shows the available
+  functions.
  
 
  
@@ -965,6 +968,52 @@ f(x) = exp(-parameter * (x - min) / (max - min + 1)) / (1.0 - exp(-parameter))

   
 
+   
+   
+PgBench Functions
+
+ 
+  
+   Function
+   Return Type
+   Description
+   Example
+   Result
+  
+ 
+ 
+  
+   abs(a)
+   same as a
+   integer value
+   abs(-17)
+   17
+  
+  
+   debug(a)
+   same asa 
+   print to stderr the given argument
+   debug(5432)
+   5432
+  
+  
+   max(i [, ... ] )
+   integer
+   maximum value
+   max(5, 4, 3, 2)
+   5
+  
+  
+   min(i [, ... ] )
+   integer
+   minimum value
+   min(5, 4, 3, 2)
+   2
+  
+ 
+ 
+   
+
   
As an example, the full definition of the built-in TPC-B-like
transaction is:
diff --git a/src/bin/pgbench/exprparse.y b/src/bin/pgbench/exprparse.y
index 06ee04b..93c6173 100644
--- a/src/bin/pgbench/exprparse.y
+++ b/src/bin/pgbench/exprparse.y
@@ -16,10 +16,13 @@
 
 PgBenchExpr *expr_parse_result;
 
+static PgBenchExprList *make_elist(PgBenchExpr *exp, PgBenchExprList *list);
 static PgBenchExpr *make_integer_constant(int64 ival);
 static PgBenchExpr *make_variable(char *varname);
-static PgBenchExpr *make_op(char operator, PgBenchExpr *lexpr,
+static PgBenchExpr *make_op(const char *operator, PgBenchExpr *lexpr,
 		PgBenchExpr *rexpr);
+static int find_func(const char *fname);
+static PgBenchExpr *make_func(const int fnumber, PgBenchExprList *args);
 
 %}
 
@@ -31,13 +34,15 @@ static PgBenchExpr *make_op(char operator, PgBenchExpr *lexpr,
 	int64		ival;
 	char	   *str;
 	PgBenchExpr *expr;
+	PgBenchExprList *elist;
 }
 
+%type  elist
 %type  expr
-%type  INTEGER
-%type  VARIABLE
+%type  INTEGER function
+%type  VARIABLE FUNCTION
 
-%token INTEGER VARIABLE
+%token INTEGER VARIABLE FUNCTION
 %token CHAR_ERROR /* never used, will raise a syntax error */
 
 /* Precedence: lowest to highest */
@@ -49,16 +54,25 @@ static PgBenchExpr *make_op(char operator, PgBenchExpr *lexpr,
 
 result: expr{ expr_parse_result = $1; }
 
+elist:  	{ $$ = NULL; }
+	| expr 	{ $$ = make_elist($1, NULL); }
+	| elist ',' expr		{ $$ = make_elist($3, $1); }
+	;
+
 expr: '(' expr ')'			{ $$ = $2; }
 	| '+' expr %prec UMINUS	{ $$ = $2; }
-	| '-' expr %prec UMINUS	{ $$ = make_op('-', make_integer_constant(0), $2); }
-	| expr '+' expr			{ $$ = make_op('+', $1, $3); }
-	| expr '-' expr			{ $$ = make_op('-', $1, $3); }
-	| expr '*' expr			{ $$ = make_op('*', $1, $3); }
-	| expr '/' expr			{ $$ = make_op('/', $1, $3); }
-	| expr '%' expr			{ $$ = make_op('%', $1, $3); }
+	| '-' expr %prec UMINUS	{ $$ = make_op("-", make_integer_constant(0), $2); }
+	| expr '+' expr			{ $$ = make_op("+", $1, $3); }
+	| expr '-' expr			{ $$ = make_op("-", $1, $3); }
+	| expr '*' expr			{ $$ = make_op("*", $1, $3); }
+	| expr '/' expr			{ $$ = make_op("/", $1, $3); }
+	| expr '%' expr			{ $$ = make_op("%", $1, $3); }
 	| INTEGER{ $$ = make_integer_constant($1); }
 	| VARIABLE { $$ = make_variable($1); }
+	| function '(' elist ')'{ $$ = make_func($1, $3); }
+	;
+
+function: FUNCTION			{ $$ = find_func($1); pg_free($1); }
 	;
 
 %%
@@ -68,8 +82,9 @@ make_integer_constant(int64 ival)
 {
 	PgBenchExpr *expr = pg_malloc(sizeof(PgBenchExpr));
 
-	expr->etype = ENODE_INTEGER_CONSTANT;
-	expr->u.integer_constant.ival = ival;
+	expr->etype = ENODE_CONSTANT;
+	expr->u.constant.type = PGBT_INT;
+	expr->u.constant.u.ival = ival;
 	return expr;
 }
 
@@ -84,14 +99,128

Re: [HACKERS] extend pgbench expressions with functions

2016-02-14 Thread Michael Paquier

On Sun, Feb 14, 2016 at 4:42 PM, Fabien COELHO  wrote:
>
>> So I would be fine to do a portion of the legwork and extract from this
>> patch something smaller that adds only functions as a first step, with the
>> minimum set of functions I mentioned upthread. Robert, Alvaro, Fabien, does
>> that sound fine to you?
>
>
> Thanks, but this is my c*, I have a few hours of travelling this evening,
> I'll do it then.
>
> I'll be happy if you do the review of the resulting split.

OK, I am fine with this scenario as well. I have luckily done nothing yet.
-- 
Michael


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] extend pgbench expressions with functions

2016-02-13 Thread Fabien COELHO



So I would be fine to do a portion of the legwork and extract from this 
patch something smaller that adds only functions as a first step, with 
the minimum set of functions I mentioned upthread. Robert, Alvaro, 
Fabien, does that sound fine to you?


Thanks, but this is my c*, I have a few hours of travelling this evening, 
I'll do it then.


I'll be happy if you do the review of the resulting split.

--
Fabien.


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] extend pgbench expressions with functions

2016-02-13 Thread Fabien COELHO



Hello Robert,


You know, you make comments like this pretty much every time anybody
suggests that you should change anything in any patch you write.


Well, not everytime...

I'm tired of doing and undoing things: I do a limited A because I'm 
cautious not to spend too much time that would go down the drain. Then I'm 
told "do A+B" to show that A is worthwhile. Then I'm told "you should not 
have done B with A, but submit A for the infrastructure and then B for the 
additional features", which was precisely my initial intent...


So this is really the going forward and backwards because of the process 
that makes me write these remarks.



[...] I don't like it when people call my ideas dumb.


Your idea is not dumb, I'm sorry if I have implied that, and I apologise. 
I like a clean history as much as everyone else.


Having to unmix features and documentations is still "dumb work", even if 
the purpose is not dumb, I stand by this word for this part.


--
Fabien.


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] extend pgbench expressions with functions

2016-02-13 Thread Michael Paquier

On Sun, Feb 14, 2016 at 10:05 AM, Robert Haas  wrote:
> On Sat, Feb 13, 2016 at 6:19 PM, Michael Paquier
>  wrote:
>> On Sun, Feb 14, 2016 at 12:37 AM, Fabien COELHO wrote:
>>> The two features are highly intermix, so it can only be dependent patches,
>>> first to add a function infrastructure and probably some support for doubles
>>> altough it would not be used, then to add doubles & their functions.
>>>
>>> A real pain is the documentation, because it means writing a documentation
>>> with only integer functions, then overwriting it with doubles. This is dumb
>>> work, really, for the sake of "a cleaner git history", the beauty of it no
>>> one will ever contemplate...
>>
>> FWIW, I care a lot about splitting as much as possible patches where
>> it is possible to have a clean history. So I would be fine to do a
>> portion of the legwork and extract from this patch something smaller
>> that adds only functions as a first step, with the minimum set of
>> functions I mentioned upthread. Robert, Alvaro, Fabien, does that
>> sound fine to you?
>
> I'd be delighted.  I would really like to get this feature in, but I'm
> not going to do it if it requires an unreasonable amount of work on my
> part - and what you propose would help a lot.

OK, I'll see about producing a patch then for this basic
infrastructure, with the rest built on top of it as a secondary patch.
-- 
Michael


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] extend pgbench expressions with functions

2016-02-13 Thread Robert Haas

On Sat, Feb 13, 2016 at 10:37 AM, Fabien COELHO  wrote:
> A real pain is the documentation, because it means writing a documentation
> with only integer functions, then overwriting it with doubles. This is dumb
> work, really, for the sake of "a cleaner git history", the beauty of it no
> one will ever contemplate...

You know, you make comments like this pretty much every time anybody
suggests that you should change anything in any patch you write.  It
doesn't matter whether the change is suggested by Heikki, or by
Michael, or by Andres, or by me.  You pretty much always come back and
say something that amounts to "changing the patch I already wrote is a
waste of time".  That gets a little disheartening after a while.  This
community's procedure is that patches have to be reviewed and reach
consensus in order to get committed, and in my opinion that is
generally not "dumb" but rather something that enhances the end
result.  I value your contributions to the community and I hope you
will continue making them, but I don't like it when people call my
ideas dumb.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] extend pgbench expressions with functions

2016-02-13 Thread Robert Haas

On Sat, Feb 13, 2016 at 6:19 PM, Michael Paquier
 wrote:
> On Sun, Feb 14, 2016 at 12:37 AM, Fabien COELHO wrote:
>> The two features are highly intermix, so it can only be dependent patches,
>> first to add a function infrastructure and probably some support for doubles
>> altough it would not be used, then to add doubles & their functions.
>>
>> A real pain is the documentation, because it means writing a documentation
>> with only integer functions, then overwriting it with doubles. This is dumb
>> work, really, for the sake of "a cleaner git history", the beauty of it no
>> one will ever contemplate...
>
> FWIW, I care a lot about splitting as much as possible patches where
> it is possible to have a clean history. So I would be fine to do a
> portion of the legwork and extract from this patch something smaller
> that adds only functions as a first step, with the minimum set of
> functions I mentioned upthread. Robert, Alvaro, Fabien, does that
> sound fine to you?

I'd be delighted.  I would really like to get this feature in, but I'm
not going to do it if it requires an unreasonable amount of work on my
part - and what you propose would help a lot.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] extend pgbench expressions with functions

2016-02-13 Thread Michael Paquier

On Sun, Feb 14, 2016 at 12:37 AM, Fabien COELHO wrote:
> The two features are highly intermix, so it can only be dependent patches,
> first to add a function infrastructure and probably some support for doubles
> altough it would not be used, then to add doubles & their functions.
>
> A real pain is the documentation, because it means writing a documentation
> with only integer functions, then overwriting it with doubles. This is dumb
> work, really, for the sake of "a cleaner git history", the beauty of it no
> one will ever contemplate...

FWIW, I care a lot about splitting as much as possible patches where
it is possible to have a clean history. So I would be fine to do a
portion of the legwork and extract from this patch something smaller
that adds only functions as a first step, with the minimum set of
functions I mentioned upthread. Robert, Alvaro, Fabien, does that
sound fine to you?
-- 
Michael


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] extend pgbench expressions with functions

2016-02-13 Thread Fabien COELHO




But variables can still only be integers.


Version 28 attached has double variables, although typing is based on 
guessing because it is just a string.


Any comment?

[...] two separate enhancements in here.  One of them is to support a 
new data type (doubles) and the other is to support functions.


The two features are highly intermix, so it can only be dependent patches, 
first to add a function infrastructure and probably some support for 
doubles altough it would not be used, then to add doubles & their 
functions.


A real pain is the documentation, because it means writing a documentation 
with only integer functions, then overwriting it with doubles. This is 
dumb work, really, for the sake of "a cleaner git history", the beauty of 
it no one will ever contemplate...


--
Fabien.diff --git a/doc/src/sgml/ref/pgbench.sgml b/doc/src/sgml/ref/pgbench.sgml
index ade1b53..5d38c6f 100644
--- a/doc/src/sgml/ref/pgbench.sgml
+++ b/doc/src/sgml/ref/pgbench.sgml
@@ -793,20 +793,24 @@ pgbench  options  dbname
 
 
  
-  Sets variable varname to an integer value calculated
+  Sets variable varname to a value calculated
   from expression.
   The expression may contain integer constants such as 5432,
+  double constants such as 3.14159,
   references to variables :variablename,
   and expressions composed of unary (-) or binary operators
-  (+, -, *, /, %)
-  with their usual associativity, and parentheses.
+  (+, -, *, /,
+  %) with their usual associativity, function calls and
+  parentheses.
+   shows the available
+  functions.
  
 
  
   Examples:
 
 \set ntellers 10 * :scale
-\set aid (1021 * :aid) % (10 * :scale) + 1
+\set aid (1021 * random(1, 10 * :scale)) % (10 * :scale) + 1
 
 

@@ -826,66 +830,35 @@ pgbench  options  dbname
  
 
  
-  By default, or when uniform is specified, all values in the
-  range are drawn with equal probability.  Specifying gaussian
-  or  exponential options modifies this behavior; each
-  requires a mandatory parameter which determines the precise shape of the
-  distribution.
- 
+  
+   
+
+ \setrandom n 1 10 or \setrandom n 1 10 uniform
+ is equivalent to \set n random(1, 10) and uses a uniform
+ distribution.
+
+   
 
- 
-  For a Gaussian distribution, the interval is mapped onto a standard
-  normal distribution (the classical bell-shaped Gaussian curve) truncated
-  at -parameter on the left and +parameter
-  on the right.
-  Values in the middle of the interval are more likely to be drawn.
-  To be precise, if PHI(x) is the cumulative distribution
-  function of the standard normal distribution, with mean mu
-  defined as (max + min) / 2.0, with
-
- f(x) = PHI(2.0 * parameter * (x - mu) / (max - min + 1)) /
-(2.0 * PHI(parameter) - 1.0)
-
-  then value i between min and
-  max inclusive is drawn with probability:
-  f(i + 0.5) - f(i - 0.5).
-  Intuitively, the larger parameter, the more
-  frequently values close to the middle of the interval are drawn, and the
-  less frequently values close to the min and
-  max bounds. About 67% of values are drawn from the
-  middle 1.0 / parameter, that is a relative
-  0.5 / parameter around the mean, and 95% in the middle
-  2.0 / parameter, that is a relative
-  1.0 / parameter around the mean; for instance, if
-  parameter is 4.0, 67% of values are drawn from the
-  middle quarter (1.0 / 4.0) of the interval (i.e. from
-  3.0 / 8.0 to 5.0 / 8.0) and 95% from
-  the middle half (2.0 / 4.0) of the interval (second and
-  third quartiles). The minimum parameter is 2.0 for
-  performance of the Box-Muller transform.
- 
+  
+   
+\setrandom n 1 10 exponential 3.0 is equivalent to
+\set n random_exponential(1, 10, 3.0) and uses an
+exponential distribution.
+   
+  
 
- 
-  For an exponential distribution, parameter
-  controls the distribution by truncating a quickly-decreasing
-  exponential distribution at parameter, and then
-  projecting onto integers between the bounds.
-  To be precise, with
-
-f(x) = exp(-parameter * (x - min) / (max - min + 1)) / (1.0 - exp(-parameter))
-
-  Then value i between min and
-  max inclusive is drawn with probability:
-  f(x) - f(x + 1).
-  Intuitively, the larger parameter, the more
-  frequently values close to min are accessed, and the
-  less frequently values close to max are accessed.
-  The closer to 0 parameter, the flatter (more uniform)
-  the access distribution.
-  A crude approximation of the distribution is that the most frequent 1%
-  values in the range, close to min, are drawn
-  parameter% of the time.
-  parameter value must be strictly positive.
+  
+   
+

Re: [HACKERS] extend pgbench expressions with functions

2016-02-13 Thread Fabien COELHO



Hello Michaël,


 - if the function & double stuff are separated ?
 - for the double part, if variables can be double ?


I just double-checked and could not see a clear use case mentioned in
this thread for double return types,


Alas there is one: non uniform random functions which use a double 
parameter.


Once random functions are there, the \setrandom horror code could be 
removed, which would be a real benefit, IMO:-)


So I see a good case to have some support for doubles.

so I would suggest focusing on the integer portion with min(), max(), 
abs(), debug() and the existing functions refactored. That's what your 
first versions did. If someone is wishing to implement double types, 
this someone could do it, the infrastructure that this patch puts in 
place has already proved that it can be easily extended.


Adding double is not too big a deal. I just stopped at variables because I 
could not see any realistic use for them. My idea was to postpone that 
till it is actually needed, "never" being the most probable course.


Now if this is a blocker for the committer, then I will probably make 
the effort whatever I think of the usefulness of the feature.


--
Fabien.
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] extend pgbench expressions with functions

2016-02-13 Thread Michael Paquier

On Sat, Feb 13, 2016 at 4:37 PM, Fabien COELHO  wrote:
>
>> For example, I just realized that this patch allows values to be
>> either a double or an integer and extends the operators to handle
>> double values.  But variables can still only be integers.
>
> Indeed.

That's exactly the first impression I got about this patch when sending [1].
[1]: 
http://www.postgresql.org/message-id/cab7npqrhvfvnrahamaroniedec90pf3-wn+u5ccb4red-ht...@mail.gmail.com
By focusing only on integers the patch would largely gain in
simplicity. Now being able to have double values is actually useful
for nested function calls, nothing else.

>> I don't think variables should be explicitly typed but they should be able
>> to store a value of any type that expression evaluation can generate.
>
> Doubles are not really needed that much, it is just to provide something to
> random_* functions parameter, otherwise it is useless as far as pgbench is
> really concerned.

+1.

>> Also, as I said back in November, there's really two completely
>> separate enhancements in here.  One of them is to support a new data
>> type (doubles) and the other is to support functions.
>
> Yep. The first part is precisely the patch I initially submitted 5 CF ago.
> Then I'm asked to put more things in it to show that it can indeed handle
> another type. Then I'm told "you should not have done that". What can I say?

I think that you did your job here by answering the comments of all
the reviewers that had a look at this patch. That's not an easy task.

>> [...] I find implementing operators as functions in disguise not to be one
>> of PostgreSQL's most awesome design decisions, and here we are copying that
>> into pgbench for, well, no benefit that I can see, really.
>
> Well, I did that initially, then I was asked to implements operators as
> functions. It probably saves some lines, so it is not too bad, even if the
> benefit is limited.

Put the blame on me for this one then. I suggested this idea because
Postgres is doing the same, and because it simplifies slightly the
union structure in charge of holding the parsed structures, making the
code a bit more readable IMO.

>> [...] If neither of you are willing to split this patch, I'm not willing
>> to commit it.
>
> If I'm reading you correctly, you would consider committing it:
>
>  - if the function & double stuff are separated ?
>  - for the double part, if variables can be double ?

I just double-checked and could not see a clear use case mentioned in
this thread for double return types, so I would suggest focusing on
the integer portion with min(), max(), abs(), debug() and the existing
functions refactored. That's what your first versions did. If someone
is wishing to implement double types, this someone could do it, the
infrastructure that this patch puts in place has already proved that
it can be easily extended.
-- 
Michael


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] extend pgbench expressions with functions

2016-02-12 Thread Fabien COELHO



Hello Robert,


For example, I just realized that this patch allows values to be
either a double or an integer and extends the operators to handle
double values.  But variables can still only be integers.


Indeed.

[...] at least flatten everything to string rather than integer so that 
you can store the value without loss of precision - just with loss of 
type-safety.  I think designing this in this way is quite short-sighted.


Note that I'm not responsible for this design, which is preexisting. 
Extending variables to be able to store doubles could also be done in 
another patch.


I don't think variables should be explicitly typed but they should be 
able to store a value of any type that expression evaluation can 
generate.


Doubles are not really needed that much, it is just to provide something 
to random_* functions parameter, otherwise it is useless as far as pgbench 
is really concerned.



Also, as I said back in November, there's really two completely
separate enhancements in here.  One of them is to support a new data
type (doubles) and the other is to support functions.


Yep. The first part is precisely the patch I initially submitted 5 CF ago.

Then I'm asked to put more things in it to show that it can indeed handle 
another type. Then I'm told "you should not have done that". What can I 
say?



Those should really be separate patches.


They could.

[...] I find implementing operators as functions in disguise not to be 
one of PostgreSQL's most awesome design decisions, and here we are 
copying that into pgbench for, well, no benefit that I can see, really.


Well, I did that initially, then I was asked to implements operators as 
functions. It probably saves some lines, so it is not too bad, even if the 
benefit is limited.



Maybe it's a good idea and maybe it's a bad idea, but how do we know?


This is just pgbench, a tool for testing performance by running dummy 
transactions, not a production thing, so I think that it really does not 
matter. There is no user visible changes wrt operators.


[...] If neither of you are willing to split this patch, I'm not willing 
to commit it.


If I'm reading you correctly, you would consider committing it:

 - if the function & double stuff are separated ?

 - for the double part, if variables can be double ?

--
Fabien.


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] extend pgbench expressions with functions

2016-02-12 Thread Robert Haas

On Fri, Feb 12, 2016 at 5:06 AM, Fabien COELHO  wrote:
> I think that this option is too much bother for the small internal purpose
> at hand.

Yeah, I'm really frustrated with this whole thread.  There's been a
huge amount of discussion on this patch, but I don't really feel like
it's converging towards something I could actually commit.

For example, I just realized that this patch allows values to be
either a double or an integer and extends the operators to handle
double values.  But variables can still only be integers.  Please name
any programming language that has such a restriction.  The only ones I
can think of are command shells, and those at least flatten everything
to string rather than integer so that you can store the value without
loss of precision - just with loss of type-safety.  I think designing
this in this way is quite short-sighted.  I don't think variables
should be explicitly typed but they should be able to store a value of
any type that expression evaluation can generate.

Also, as I said back in November, there's really two completely
separate enhancements in here.  One of them is to support a new data
type (doubles) and the other is to support functions.  Those should
really be separate patches.  If neither of you are willing to split
this patch, I'm not willing to commit it.  Going over half of this
patch at a time and getting it committed is going to be a lot of work
for me, but I'm willing to do it.  I'm not willing to go over the
whole thing at once - that's going to take more time than I have, and
produce what will in my opinion be an inferior commit history.  If
somebody else is willing to commit the whole thing as one patch, I'm
not going to object, but I won't do it myself.

I would not worry too much about the list thing at this point.  I'm
sure something simple is fine for that.  I actually think the patch is
now in decent shape as far as code-cleanliness is concerned, but I'm
not sure we've really looked hard enough at the design.  For example,
I find implementing operators as functions in disguise not to be one
of PostgreSQL's most awesome design decisions, and here we are copying
that into pgbench for, well, no benefit that I can see, really.  Maybe
it's a good idea and maybe it's a bad idea, but how do we know?  That
stuff deserves more discussion.  Code cleanup is good, so I do think
it's good that a lot of effort has been put in there, but I don't
think more code cleanup is what's going to get this patch over the
finish line.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] extend pgbench expressions with functions

2016-02-12 Thread Fabien COELHO



Hello Michaël,


Using a pointer to the tail of the list would make the code simple,
and save a couple of lines.


I did that, see v27 attached.

Note that it does not save any lines, because the reverse function is 
removed, but then you need another type to keep the head & tail, the link 
type is not enough, and then you have to manage that stuff in the code. 
Whether it is "simpler" is debatable. It probably costs more tests when 
executed.


However, it really saves having to answer the question "why is the list 
reversed?", which is definitely a win from my point of view:-)



Another thing that could be considered is also to move list.c [...]


I think that this option is too much bother for the small internal purpose 
at hand.


--
Fabien.diff --git a/doc/src/sgml/ref/pgbench.sgml b/doc/src/sgml/ref/pgbench.sgml
index ade1b53..eaa0889 100644
--- a/doc/src/sgml/ref/pgbench.sgml
+++ b/doc/src/sgml/ref/pgbench.sgml
@@ -796,17 +796,21 @@ pgbench  options  dbname
   Sets variable varname to an integer value calculated
   from expression.
   The expression may contain integer constants such as 5432,
-  references to variables :variablename,
+  double constants such as 3.14159,
+  references to integer variables :variablename,
   and expressions composed of unary (-) or binary operators
-  (+, -, *, /, %)
-  with their usual associativity, and parentheses.
+  (+, -, *, /,
+  %) with their usual associativity, function calls and
+  parentheses.
+   shows the available
+  functions.
  
 
  
   Examples:
 
 \set ntellers 10 * :scale
-\set aid (1021 * :aid) % (10 * :scale) + 1
+\set aid (1021 * random(1, 10 * :scale)) % (10 * :scale) + 1
 
 

@@ -826,66 +830,35 @@ pgbench  options  dbname
  
 
  
-  By default, or when uniform is specified, all values in the
-  range are drawn with equal probability.  Specifying gaussian
-  or  exponential options modifies this behavior; each
-  requires a mandatory parameter which determines the precise shape of the
-  distribution.
- 
+  
+   
+
+ \setrandom n 1 10 or \setrandom n 1 10 uniform
+ is equivalent to \set n random(1, 10) and uses a uniform
+ distribution.
+
+   
 
- 
-  For a Gaussian distribution, the interval is mapped onto a standard
-  normal distribution (the classical bell-shaped Gaussian curve) truncated
-  at -parameter on the left and +parameter
-  on the right.
-  Values in the middle of the interval are more likely to be drawn.
-  To be precise, if PHI(x) is the cumulative distribution
-  function of the standard normal distribution, with mean mu
-  defined as (max + min) / 2.0, with
-
- f(x) = PHI(2.0 * parameter * (x - mu) / (max - min + 1)) /
-(2.0 * PHI(parameter) - 1.0)
-
-  then value i between min and
-  max inclusive is drawn with probability:
-  f(i + 0.5) - f(i - 0.5).
-  Intuitively, the larger parameter, the more
-  frequently values close to the middle of the interval are drawn, and the
-  less frequently values close to the min and
-  max bounds. About 67% of values are drawn from the
-  middle 1.0 / parameter, that is a relative
-  0.5 / parameter around the mean, and 95% in the middle
-  2.0 / parameter, that is a relative
-  1.0 / parameter around the mean; for instance, if
-  parameter is 4.0, 67% of values are drawn from the
-  middle quarter (1.0 / 4.0) of the interval (i.e. from
-  3.0 / 8.0 to 5.0 / 8.0) and 95% from
-  the middle half (2.0 / 4.0) of the interval (second and
-  third quartiles). The minimum parameter is 2.0 for
-  performance of the Box-Muller transform.
- 
+  
+   
+\setrandom n 1 10 exponential 3.0 is equivalent to
+\set n random_exponential(1, 10, 3.0) and uses an
+exponential distribution.
+   
+  
 
- 
-  For an exponential distribution, parameter
-  controls the distribution by truncating a quickly-decreasing
-  exponential distribution at parameter, and then
-  projecting onto integers between the bounds.
-  To be precise, with
-
-f(x) = exp(-parameter * (x - min) / (max - min + 1)) / (1.0 - exp(-parameter))
-
-  Then value i between min and
-  max inclusive is drawn with probability:
-  f(x) - f(x + 1).
-  Intuitively, the larger parameter, the more
-  frequently values close to min are accessed, and the
-  less frequently values close to max are accessed.
-  The closer to 0 parameter, the flatter (more uniform)
-  the access distribution.
-  A crude approximation of the distribution is that the most frequent 1%
-  values in the range, close to min, are drawn
-  parameter% of the time.
-  parameter value must be strictly positive.
+  
+   
+\setrandom n 1 10 gaussian 2.0 is equivalent to
+

Re: [HACKERS] extend pgbench expressions with functions

2016-02-11 Thread Michael Paquier

On Fri, Feb 12, 2016 at 2:41 AM, Fabien COELHO  wrote:
>
> Hello Michaël,
>
>> +/* the argument list has been built in reverse order, it is fixed
>> here */
>> +expr->u.function.args = reverse_elist(args);
>> Hm. I may be missing something, but why is that necessary? This is
>> basically doing a double-reversion to put all the arguments in the
>> correct order when parsing the function arguments.
>
> This is because the expression list is parsed left to right and the list is
> built as a stack to avoid looking for the last argument to append the next
> expression, but then the list is in reverse order at the end of parsing, so
> it is reversed once to make it right. This way the complexity is kept as
> O(n).
>
> If this is too much I can switch to O(n**2) by appending each expression at
> the end of the list.

(this one has been mentioned by Alvaro offlist)
Using a pointer to the tail of the list would make the code simple,
and save a couple of lines.

Another thing that could be considered is also to move list.c and
pg_list.h into src/common and reuse that. There are other frontend
utilities that emulate the same kind of facilities, have a look for
example at the other copycats in pg_dump and pg_basebackup.
-- 
Michael


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] extend pgbench expressions with functions

2016-02-11 Thread Fabien COELHO



Hello Michaël,


+/* the argument list has been built in reverse order, it is fixed here */
+expr->u.function.args = reverse_elist(args);
Hm. I may be missing something, but why is that necessary? This is
basically doing a double-reversion to put all the arguments in the
correct order when parsing the function arguments.


This is because the expression list is parsed left to right and the list 
is built as a stack to avoid looking for the last argument to append the 
next expression, but then the list is in reverse order at the end of 
parsing, so it is reversed once to make it right. This way the complexity 
is kept as O(n).


If this is too much I can switch to O(n**2) by appending each expression 
at the end of the list.


--
Fabien.
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] extend pgbench expressions with functions

2016-02-11 Thread Michael Paquier

On Thu, Feb 11, 2016 at 12:37 AM, Fabien COELHO  wrote:
> v26 attached implements these changes.

+/* the argument list has been built in reverse order, it is fixed here */
+expr->u.function.args = reverse_elist(args);
Hm. I may be missing something, but why is that necessary? This is
basically doing a double-reversion to put all the arguments in the
correct order when parsing the function arguments.
-- 
Michael


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] extend pgbench expressions with functions

2016-02-10 Thread Fabien COELHO



Hello Michaël,


+  parameter%  of the time.
Nitpick: double space here.


Ok.


+   default:
+   return false;
   }
In evalFunc(), the default case in switch for the operator functions
should never be reached. Adding for example Assert(0) is something to
consider.


Ok for Assert + a comment.


PGBT_NONE and PGBENCH_NONE are used nowhere. Why not removing them


Ok.

v26 attached implements these changes.

--
Fabien.diff --git a/doc/src/sgml/ref/pgbench.sgml b/doc/src/sgml/ref/pgbench.sgml
index ade1b53..eaa0889 100644
--- a/doc/src/sgml/ref/pgbench.sgml
+++ b/doc/src/sgml/ref/pgbench.sgml
@@ -796,17 +796,21 @@ pgbench  options  dbname
   Sets variable varname to an integer value calculated
   from expression.
   The expression may contain integer constants such as 5432,
-  references to variables :variablename,
+  double constants such as 3.14159,
+  references to integer variables :variablename,
   and expressions composed of unary (-) or binary operators
-  (+, -, *, /, %)
-  with their usual associativity, and parentheses.
+  (+, -, *, /,
+  %) with their usual associativity, function calls and
+  parentheses.
+   shows the available
+  functions.
  
 
  
   Examples:
 
 \set ntellers 10 * :scale
-\set aid (1021 * :aid) % (10 * :scale) + 1
+\set aid (1021 * random(1, 10 * :scale)) % (10 * :scale) + 1
 
 

@@ -826,66 +830,35 @@ pgbench  options  dbname
  
 
  
-  By default, or when uniform is specified, all values in the
-  range are drawn with equal probability.  Specifying gaussian
-  or  exponential options modifies this behavior; each
-  requires a mandatory parameter which determines the precise shape of the
-  distribution.
- 
+  
+   
+
+ \setrandom n 1 10 or \setrandom n 1 10 uniform
+ is equivalent to \set n random(1, 10) and uses a uniform
+ distribution.
+
+   
 
- 
-  For a Gaussian distribution, the interval is mapped onto a standard
-  normal distribution (the classical bell-shaped Gaussian curve) truncated
-  at -parameter on the left and +parameter
-  on the right.
-  Values in the middle of the interval are more likely to be drawn.
-  To be precise, if PHI(x) is the cumulative distribution
-  function of the standard normal distribution, with mean mu
-  defined as (max + min) / 2.0, with
-
- f(x) = PHI(2.0 * parameter * (x - mu) / (max - min + 1)) /
-(2.0 * PHI(parameter) - 1.0)
-
-  then value i between min and
-  max inclusive is drawn with probability:
-  f(i + 0.5) - f(i - 0.5).
-  Intuitively, the larger parameter, the more
-  frequently values close to the middle of the interval are drawn, and the
-  less frequently values close to the min and
-  max bounds. About 67% of values are drawn from the
-  middle 1.0 / parameter, that is a relative
-  0.5 / parameter around the mean, and 95% in the middle
-  2.0 / parameter, that is a relative
-  1.0 / parameter around the mean; for instance, if
-  parameter is 4.0, 67% of values are drawn from the
-  middle quarter (1.0 / 4.0) of the interval (i.e. from
-  3.0 / 8.0 to 5.0 / 8.0) and 95% from
-  the middle half (2.0 / 4.0) of the interval (second and
-  third quartiles). The minimum parameter is 2.0 for
-  performance of the Box-Muller transform.
- 
+  
+   
+\setrandom n 1 10 exponential 3.0 is equivalent to
+\set n random_exponential(1, 10, 3.0) and uses an
+exponential distribution.
+   
+  
 
- 
-  For an exponential distribution, parameter
-  controls the distribution by truncating a quickly-decreasing
-  exponential distribution at parameter, and then
-  projecting onto integers between the bounds.
-  To be precise, with
-
-f(x) = exp(-parameter * (x - min) / (max - min + 1)) / (1.0 - exp(-parameter))
-
-  Then value i between min and
-  max inclusive is drawn with probability:
-  f(x) - f(x + 1).
-  Intuitively, the larger parameter, the more
-  frequently values close to min are accessed, and the
-  less frequently values close to max are accessed.
-  The closer to 0 parameter, the flatter (more uniform)
-  the access distribution.
-  A crude approximation of the distribution is that the most frequent 1%
-  values in the range, close to min, are drawn
-  parameter% of the time.
-  parameter value must be strictly positive.
+  
+   
+\setrandom n 1 10 gaussian 2.0 is equivalent to
+\set n random_gaussian(1, 10, 2.0), and uses a gaussian
+distribution.
+   
+  
+ 
+
+   See the documentation of these functions below for further information
+   about the precise shape of these distributions, depending on the value
+   of the para

Re: [HACKERS] extend pgbench expressions with functions

2016-02-10 Thread Michael Paquier

On Tue, Feb 9, 2016 at 5:06 AM, Alvaro Herrera  wrote:
> Fabien COELHO wrote:
>>
>> Hello,
>>
>> >>v23 attached, which does not change the message but does the other fixes.
>> >
>> >This doesn't apply anymore
>>
>> Indeed, but the latest version was really v25.
>>
>> >-- please rebase and submit to the next CF.
>>
>> I already provided it as v25 on Feb 1st.
>>
>> >I closed it here as returned with feedback.
>>
>> I turned it to "moved to next CF" as the patch is already on the queue.
>
> Ah, I didn't see v25 anywhere.  What you did should be fine, thanks.

I just had another look at this patch.

+  parameter%  of the time.
Nitpick: double space here.


+   switch (func)
{
[...]
+   }
+   default:
+   return false;
}
In evalFunc(), the default case in switch for the operator functions
should never be reached. Adding for example Assert(0) is something to
consider.

PGBT_NONE and PGBENCH_NONE are used nowhere. Why not removing them or
replace the code paths where they would be used by assertions to
trigger errors for future developments?

Other than that the patch looks in pretty good shape to me.
-- 
Michael


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] extend pgbench expressions with functions

2016-02-08 Thread Alvaro Herrera

Fabien COELHO wrote:
> 
> Hello,
> 
> >>v23 attached, which does not change the message but does the other fixes.
> >
> >This doesn't apply anymore
> 
> Indeed, but the latest version was really v25.
> 
> >-- please rebase and submit to the next CF.
> 
> I already provided it as v25 on Feb 1st.
> 
> >I closed it here as returned with feedback.
> 
> I turned it to "moved to next CF" as the patch is already on the queue.

Ah, I didn't see v25 anywhere.  What you did should be fine, thanks.

-- 
Álvaro Herrerahttp://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] extend pgbench expressions with functions

2016-02-08 Thread Fabien COELHO



Hello,


v23 attached, which does not change the message but does the other fixes.


This doesn't apply anymore


Indeed, but the latest version was really v25.


-- please rebase and submit to the next CF.


I already provided it as v25 on Feb 1st.


I closed it here as returned with feedback.


I turned it to "moved to next CF" as the patch is already on the queue.

--
Fabien.


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] extend pgbench expressions with functions

2016-02-08 Thread Alvaro Herrera

Fabien COELHO wrote:
> 
> Hello Michaël,
> 
> v23 attached, which does not change the message but does the other fixes.

This doesn't apply anymore -- please rebase and submit to the next CF.
I closed it here as returned with feedback.

Thanks!

-- 
Álvaro Herrerahttp://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] extend pgbench expressions with functions

2016-02-03 Thread Michael Paquier

On Wed, Feb 3, 2016 at 11:28 PM, Robert Haas  wrote:
> On Mon, Feb 1, 2016 at 9:46 PM, Michael Paquier
>  wrote:
>> OK, here are patches for 9.1~9.4. The main differences are that in
>> 9.3/9.4 int64 is used for the division operations, and in 9.2/9.1
>> that's int32. In the latter case pgbench blows up the same way with
>> that:
>> \set i -2147483648
>> \set i :i / -1
>> select :i;
>> In those patches INT32_MIN/INT64_MIN need to be explicitly set as well
>> at the top of pgbench.c. I thing that's fine.
>
> I thing so too.  Committed.

Thanks for thinging so.
-- 
Michael


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] extend pgbench expressions with functions

2016-02-03 Thread Robert Haas

On Mon, Feb 1, 2016 at 9:46 PM, Michael Paquier
 wrote:
> OK, here are patches for 9.1~9.4. The main differences are that in
> 9.3/9.4 int64 is used for the division operations, and in 9.2/9.1
> that's int32. In the latter case pgbench blows up the same way with
> that:
> \set i -2147483648
> \set i :i / -1
> select :i;
> In those patches INT32_MIN/INT64_MIN need to be explicitly set as well
> at the top of pgbench.c. I thing that's fine.

I thing so too.  Committed.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] extend pgbench expressions with functions

2016-02-01 Thread Michael Paquier

On Tue, Feb 2, 2016 at 1:35 PM, Michael Paquier
 wrote:
> And now there are patches. Well, nobody has complained about that until now 
> except me... So we could live without patching back-branches, but it don't 
> think it hurts much to fix those holes.

Meh, s/it don't/I don't/
-- 
Michael


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] extend pgbench expressions with functions

2016-02-01 Thread Michael Paquier

On Tue, Feb 2, 2016 at 1:24 PM, Robert Haas  wrote:

> On Mon, Feb 1, 2016 at 9:46 PM, Michael Paquier
>  wrote:
> > On Mon, Feb 1, 2016 at 10:34 PM, Robert Haas 
> wrote:
> >> On Sat, Jan 30, 2016 at 7:36 AM, Michael Paquier
> >>  wrote:
> >>> On Fri, Jan 29, 2016 at 11:21 PM, Fabien COELHO 
> wrote:
>  +/* overflow check (needed for INT64_MIN) */
>  +if (lval != 0 && (*retval < 0 == lval < 0))
> 
>  Why not use "if (lval == INT64_MIN)" instead of this complicated
> condition?
>  If it is really needed for some reason, I think that a comment could
> help.
> >>>
> >>> Checking for PG_INT64_MIN only would be fine as well, so let's do so.
> >>> I thought honestly that we had better check if the result and the left
> >>> argument are not of the same sign, but well.
> >>
> >> Committed and back-patched to 9.5.  Doesn't apply further back.
> >
> > OK, here are patches for 9.1~9.4. The main differences are that in
> > 9.3/9.4 int64 is used for the division operations, and in 9.2/9.1
> > that's int32. In the latter case pgbench blows up the same way with
> > that:
> > \set i -2147483648
> > \set i :i / -1
> > select :i;
> > In those patches INT32_MIN/INT64_MIN need to be explicitly set as well
> > at the top of pgbench.c. I thing that's fine.
>
> Oh, gosh, I should have said more clearly that I didn't really see a
> need to fix this all the way back.  But I guess we could.
>

And now there are patches. Well, nobody has complained about that until now
except me... So we could live without patching back-branches, but it don't
think it hurts much to fix those holes.
-- 
Michael

Re: [HACKERS] extend pgbench expressions with functions

2016-02-01 Thread Robert Haas

On Mon, Feb 1, 2016 at 9:46 PM, Michael Paquier
 wrote:
> On Mon, Feb 1, 2016 at 10:34 PM, Robert Haas  wrote:
>> On Sat, Jan 30, 2016 at 7:36 AM, Michael Paquier
>>  wrote:
>>> On Fri, Jan 29, 2016 at 11:21 PM, Fabien COELHO  wrote:
 +/* overflow check (needed for INT64_MIN) */
 +if (lval != 0 && (*retval < 0 == lval < 0))

 Why not use "if (lval == INT64_MIN)" instead of this complicated condition?
 If it is really needed for some reason, I think that a comment could help.
>>>
>>> Checking for PG_INT64_MIN only would be fine as well, so let's do so.
>>> I thought honestly that we had better check if the result and the left
>>> argument are not of the same sign, but well.
>>
>> Committed and back-patched to 9.5.  Doesn't apply further back.
>
> OK, here are patches for 9.1~9.4. The main differences are that in
> 9.3/9.4 int64 is used for the division operations, and in 9.2/9.1
> that's int32. In the latter case pgbench blows up the same way with
> that:
> \set i -2147483648
> \set i :i / -1
> select :i;
> In those patches INT32_MIN/INT64_MIN need to be explicitly set as well
> at the top of pgbench.c. I thing that's fine.

Oh, gosh, I should have said more clearly that I didn't really see a
need to fix this all the way back.  But I guess we could.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] extend pgbench expressions with functions

2016-02-01 Thread Michael Paquier

On Mon, Feb 1, 2016 at 10:34 PM, Robert Haas  wrote:
> On Sat, Jan 30, 2016 at 7:36 AM, Michael Paquier
>  wrote:
>> On Fri, Jan 29, 2016 at 11:21 PM, Fabien COELHO  wrote:
>>> +/* overflow check (needed for INT64_MIN) */
>>> +if (lval != 0 && (*retval < 0 == lval < 0))
>>>
>>> Why not use "if (lval == INT64_MIN)" instead of this complicated condition?
>>> If it is really needed for some reason, I think that a comment could help.
>>
>> Checking for PG_INT64_MIN only would be fine as well, so let's do so.
>> I thought honestly that we had better check if the result and the left
>> argument are not of the same sign, but well.
>
> Committed and back-patched to 9.5.  Doesn't apply further back.

OK, here are patches for 9.1~9.4. The main differences are that in
9.3/9.4 int64 is used for the division operations, and in 9.2/9.1
that's int32. In the latter case pgbench blows up the same way with
that:
\set i -2147483648
\set i :i / -1
select :i;
In those patches INT32_MIN/INT64_MIN need to be explicitly set as well
at the top of pgbench.c. I thing that's fine.
-- 
Michael
diff --git a/contrib/pgbench/pgbench.c b/contrib/pgbench/pgbench.c
index 2111f16..84b6303 100644
--- a/contrib/pgbench/pgbench.c
+++ b/contrib/pgbench/pgbench.c
@@ -56,6 +56,10 @@
 #ifndef INT64_MAX
 #define INT64_MAX	INT64CONST(0x7FFF)
 #endif
+#ifndef INT32_MIN
+#define INT32_MIN	(-0x7FFF-1)
+#endif
+
 
 /*
  * Multi-platform pthread implementations
@@ -1152,13 +1156,37 @@ top:
 	snprintf(res, sizeof(res), "%d", ope1 * ope2);
 else if (strcmp(argv[3], "/") == 0)
 {
+	int		operes;
+
 	if (ope2 == 0)
 	{
 		fprintf(stderr, "%s: division by zero\n", argv[0]);
 		st->ecnt++;
 		return true;
 	}
-	snprintf(res, sizeof(res), "%d", ope1 / ope2);
+	/*
+	 * INT32_MIN / -1 is problematic, since the result can't
+	 * be represented on a two's-complement machine. Some
+	 * machines produce INT32_MIN, some produce zero, some
+	 * throw an exception. We can dodge the problem by
+	 * recognizing that division by -1 is the same as
+	 * negation.
+	 */
+	if (ope2 == -1)
+	{
+		operes = -ope1;
+
+		/* overflow check (needed for INT32_MIN) */
+		if (ope1 == INT32_MIN)
+		{
+			fprintf(stderr, "integer out of range\n");
+			st->ecnt++;
+			return true;
+		}
+	}
+	else
+		operes = ope1 / ope2;
+	snprintf(res, sizeof(res), "%d", operes);
 }
 else
 {
diff --git a/contrib/pgbench/pgbench.c b/contrib/pgbench/pgbench.c
index 4e22695..062a32f 100644
--- a/contrib/pgbench/pgbench.c
+++ b/contrib/pgbench/pgbench.c
@@ -57,6 +57,10 @@
 #ifndef INT64_MAX
 #define INT64_MAX	INT64CONST(0x7FFF)
 #endif
+#ifndef INT64_MIN
+#define INT64_MIN	(-INT64CONST(0x7FFF) - 1)
+#endif
+
 
 /*
  * Multi-platform pthread implementations
@@ -1331,13 +1335,37 @@ top:
 	snprintf(res, sizeof(res), INT64_FORMAT, ope1 * ope2);
 else if (strcmp(argv[3], "/") == 0)
 {
+	int64	operes;
+
 	if (ope2 == 0)
 	{
 		fprintf(stderr, "%s: division by zero\n", argv[0]);
 		st->ecnt++;
 		return true;
 	}
-	snprintf(res, sizeof(res), INT64_FORMAT, ope1 / ope2);
+	/*
+	 * INT64_MIN / -1 is problematic, since the result can't
+	 * be represented on a two's-complement machine. Some
+	 * machines produce INT64_MIN, some produce zero, some
+	 * throw an exception. We can dodge the problem by
+	 * recognizing that division by -1 is the same as
+	 * negation.
+	 */
+	if (ope2 == -1)
+	{
+		operes = -ope1;
+
+		/* overflow check (needed for INT64_MIN) */
+		if (ope1 == INT64_MIN)
+		{
+			fprintf(stderr, "bigint out of range\n");
+			st->ecnt++;
+			return true;
+		}
+	}
+	else
+		operes = ope1 / ope2;
+	snprintf(res, sizeof(res), INT64_FORMAT, operes);
 }
 else
 {

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] extend pgbench expressions with functions

2016-02-01 Thread Fabien COELHO



Yet another rebase, so as to propagate the same special case checks int 
for / and %.


--
Fabien.diff --git a/doc/src/sgml/ref/pgbench.sgml b/doc/src/sgml/ref/pgbench.sgml
index ade1b53..ebaf4e5 100644
--- a/doc/src/sgml/ref/pgbench.sgml
+++ b/doc/src/sgml/ref/pgbench.sgml
@@ -796,17 +796,21 @@ pgbench  options  dbname
   Sets variable varname to an integer value calculated
   from expression.
   The expression may contain integer constants such as 5432,
-  references to variables :variablename,
+  double constants such as 3.14159,
+  references to integer variables :variablename,
   and expressions composed of unary (-) or binary operators
-  (+, -, *, /, %)
-  with their usual associativity, and parentheses.
+  (+, -, *, /,
+  %) with their usual associativity, function calls and
+  parentheses.
+   shows the available
+  functions.
  
 
  
   Examples:
 
 \set ntellers 10 * :scale
-\set aid (1021 * :aid) % (10 * :scale) + 1
+\set aid (1021 * random(1, 10 * :scale)) % (10 * :scale) + 1
 
 

@@ -826,66 +830,35 @@ pgbench  options  dbname
  
 
  
-  By default, or when uniform is specified, all values in the
-  range are drawn with equal probability.  Specifying gaussian
-  or  exponential options modifies this behavior; each
-  requires a mandatory parameter which determines the precise shape of the
-  distribution.
- 
+  
+   
+
+ \setrandom n 1 10 or \setrandom n 1 10 uniform
+ is equivalent to \set n random(1, 10) and uses a uniform
+ distribution.
+
+   
 
- 
-  For a Gaussian distribution, the interval is mapped onto a standard
-  normal distribution (the classical bell-shaped Gaussian curve) truncated
-  at -parameter on the left and +parameter
-  on the right.
-  Values in the middle of the interval are more likely to be drawn.
-  To be precise, if PHI(x) is the cumulative distribution
-  function of the standard normal distribution, with mean mu
-  defined as (max + min) / 2.0, with
-
- f(x) = PHI(2.0 * parameter * (x - mu) / (max - min + 1)) /
-(2.0 * PHI(parameter) - 1.0)
-
-  then value i between min and
-  max inclusive is drawn with probability:
-  f(i + 0.5) - f(i - 0.5).
-  Intuitively, the larger parameter, the more
-  frequently values close to the middle of the interval are drawn, and the
-  less frequently values close to the min and
-  max bounds. About 67% of values are drawn from the
-  middle 1.0 / parameter, that is a relative
-  0.5 / parameter around the mean, and 95% in the middle
-  2.0 / parameter, that is a relative
-  1.0 / parameter around the mean; for instance, if
-  parameter is 4.0, 67% of values are drawn from the
-  middle quarter (1.0 / 4.0) of the interval (i.e. from
-  3.0 / 8.0 to 5.0 / 8.0) and 95% from
-  the middle half (2.0 / 4.0) of the interval (second and
-  third quartiles). The minimum parameter is 2.0 for
-  performance of the Box-Muller transform.
- 
+  
+   
+\setrandom n 1 10 exponential 3.0 is equivalent to
+\set n random_exponential(1, 10, 3.0) and uses an
+exponential distribution.
+   
+  
 
- 
-  For an exponential distribution, parameter
-  controls the distribution by truncating a quickly-decreasing
-  exponential distribution at parameter, and then
-  projecting onto integers between the bounds.
-  To be precise, with
-
-f(x) = exp(-parameter * (x - min) / (max - min + 1)) / (1.0 - exp(-parameter))
-
-  Then value i between min and
-  max inclusive is drawn with probability:
-  f(x) - f(x + 1).
-  Intuitively, the larger parameter, the more
-  frequently values close to min are accessed, and the
-  less frequently values close to max are accessed.
-  The closer to 0 parameter, the flatter (more uniform)
-  the access distribution.
-  A crude approximation of the distribution is that the most frequent 1%
-  values in the range, close to min, are drawn
-  parameter% of the time.
-  parameter value must be strictly positive.
+  
+   
+\setrandom n 1 10 gaussian 2.0 is equivalent to
+\set n random_gaussian(1, 10, 2.0), and uses a gaussian
+distribution.
+   
+  
+ 
+
+   See the documentation of these functions below for further information
+   about the precise shape of these distributions, depending on the value
+   of the parameter.
  
 
  
@@ -965,18 +938,184 @@ f(x) = exp(-parameter * (x - min) / (max - min + 1)) / (1.0 - exp(-parameter))

   
 
+   
+   
+PgBench Functions
+
+ 
+  
+   Function
+   Return Type
+   Description
+   Example
+   Result
+  
+ 
+ 
+  
+   abs(a)
+   same as a
+   integer or double absolute

Re: [HACKERS] extend pgbench expressions with functions

2016-02-01 Thread Robert Haas

On Sat, Jan 30, 2016 at 7:36 AM, Michael Paquier
 wrote:
> On Fri, Jan 29, 2016 at 11:21 PM, Fabien COELHO  wrote:
>> +/* overflow check (needed for INT64_MIN) */
>> +if (lval != 0 && (*retval < 0 == lval < 0))
>>
>> Why not use "if (lval == INT64_MIN)" instead of this complicated condition?
>> If it is really needed for some reason, I think that a comment could help.
>
> Checking for PG_INT64_MIN only would be fine as well, so let's do so.
> I thought honestly that we had better check if the result and the left
> argument are not of the same sign, but well.

Committed and back-patched to 9.5.  Doesn't apply further back.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] extend pgbench expressions with functions

2016-02-01 Thread Fabien COELHO



Hello Michaël,


 - remove the short macros (although IMO it is a code degradation)


FWIW, I like this suggestion from Robert.


I'm not especially found of macros, my main reserve is that because of the 
length of the function names this necessarily creates lines longer than 80 
columns or awkward and repeated new lines within expressions, which are 
both ugly.



+make_op(const char *operator, PgBenchExpr *lexpr, PgBenchExpr *rexpr)
+{
+   return make_func(find_func(operator),
+/* beware that the list is
reversed in make_func */
+make_elist(rexpr,
make_elist(lexpr, NULL)));
+}
I think that this should use as argument the function ID, aka PGBENCH_ADD
or similar instead of find_func() with an operator character. This saves a
couple of instructions.


Not that simply: The number is the index in the array of functions, not 
the enum value, they are currently off by one, and there may be the same 
function with different names for some reason some day, so I do not think 
it would be a good idea to enforce the order so that they are identical.

Also this minor search is only executed when parsing the script.


+ * - nargs: number of arguments (-1 is a special value for min & max)
My fault perhaps, it may be better to mention here that -1 means that min
and max need at least one argument, and that the number of arguments is not
fixed.


Ok for a better comment.


For mod() there is no need to have an error, returning 0 is fine. You can
actually do it unconditionally when rval == -1.


Ok.


+   setDoubleValue(retval, d < 0.0? -d: d);
Nitpick: lack of spaces between the question mark.


Ok.


NONE is used nowhere, but I think that you could use it for an assertion
check here: [...]
Just replace the "none" message by Assert(type != PGBT_NONE) for example.


I've added a use of the macro.


Another remaining item: should support for \setrandom be dropped? As the
patch is presented this remains intact.


As you know my opinion is "yes", but I have not receive a clear go about 
that from a committer and I'm not motivated by removing and then re adding 
code to the patch.


If nothing clear is said, I'll do a patch just for that one functions are 
added and submit it to the next CF.


--
Fabien.diff --git a/doc/src/sgml/ref/pgbench.sgml b/doc/src/sgml/ref/pgbench.sgml
index 42d0667..d42208a 100644
--- a/doc/src/sgml/ref/pgbench.sgml
+++ b/doc/src/sgml/ref/pgbench.sgml
@@ -796,17 +796,21 @@ pgbench  options  dbname
   Sets variable varname to an integer value calculated
   from expression.
   The expression may contain integer constants such as 5432,
-  references to variables :variablename,
+  double constants such as 3.14159,
+  references to integer variables :variablename,
   and expressions composed of unary (-) or binary operators
-  (+, -, *, /, %)
-  with their usual associativity, and parentheses.
+  (+, -, *, /,
+  %) with their usual associativity, function calls and
+  parentheses.
+   shows the available
+  functions.
  
 
  
   Examples:
 
 \set ntellers 10 * :scale
-\set aid (1021 * :aid) % (10 * :scale) + 1
+\set aid (1021 * random(1, 10 * :scale)) % (10 * :scale) + 1
 
 

@@ -826,66 +830,35 @@ pgbench  options  dbname
  
 
  
-  By default, or when uniform is specified, all values in the
-  range are drawn with equal probability.  Specifying gaussian
-  or  exponential options modifies this behavior; each
-  requires a mandatory parameter which determines the precise shape of the
-  distribution.
- 
+  
+   
+
+ \setrandom n 1 10 or \setrandom n 1 10 uniform
+ is equivalent to \set n random(1, 10) and uses a uniform
+ distribution.
+
+   
 
- 
-  For a Gaussian distribution, the interval is mapped onto a standard
-  normal distribution (the classical bell-shaped Gaussian curve) truncated
-  at -parameter on the left and +parameter
-  on the right.
-  Values in the middle of the interval are more likely to be drawn.
-  To be precise, if PHI(x) is the cumulative distribution
-  function of the standard normal distribution, with mean mu
-  defined as (max + min) / 2.0, with
-
- f(x) = PHI(2.0 * parameter * (x - mu) / (max - min + 1)) /
-(2.0 * PHI(parameter) - 1.0)
-
-  then value i between min and
-  max inclusive is drawn with probability:
-  f(i + 0.5) - f(i - 0.5).
-  Intuitively, the larger parameter, the more
-  frequently values close to the middle of the interval are drawn, and the
-  less frequently values close to the min and
-  max bounds. About 67% of values are drawn from the
-  middle 1.0 / parameter, that is a relative
-  0.5 / parameter around the mean, and 95% in the middle
-  2.0 / parameter, that is a relative
-  1.0 / parameter arou

Re: [HACKERS] extend pgbench expressions with functions

2016-01-31 Thread Michael Paquier

On Fri, Jan 29, 2016 at 5:16 AM, Fabien COELHO <
fabien.coe...@mines-paristech.fr> wrote:
> v22 compared to previous:

Thanks for the new patch!

>  - remove the short macros (although IMO it is a code degradation)

FWIW, I like this suggestion from Robert.

>  - check for INT64_MIN / -1 (although I think it is useless)
>  - try not to remove/add blanks lines
>  - let some assert "as is"
>  - still exit on float to int overflow, see arguments in other mails

+make_op(const char *operator, PgBenchExpr *lexpr, PgBenchExpr *rexpr)
+{
+   return make_func(find_func(operator),
+/* beware that the list is
reversed in make_func */
+make_elist(rexpr,
make_elist(lexpr, NULL)));
+}
I think that this should use as argument the function ID, aka PGBENCH_ADD
or similar instead of find_func() with an operator character. This saves a
couple of instructions.

+ * - nargs: number of arguments (-1 is a special value for min & max)
My fault perhaps, it may be better to mention here that -1 means that min
and max need at least one argument, and that the number of arguments is not
fixed.

+   case PGBENCH_MOD:
+   if (coerceToInt(&rval) == 0)
{
fprintf(stderr,
"division by zero\n");
return false;
}
-   *retval = lval % rval;
+   if (coerceToInt(&lval) ==
INT64_MIN && coerceToInt(&rval) == -1)
+   {
+   fprintf(stderr,
"cannot divide INT64_MIN by -1\n");
+   return false;
+   }
For mod() there is no need to have an error, returning 0 is fine. You can
actually do it unconditionally when rval == -1.

+   setDoubleValue(retval, d < 0.0? -d: d);
Nitpick: lack of spaces between the question mark.

+typedef enum
+{
+   PGBT_NONE,
+   PGBT_INT,
+   PGBT_DOUBLE
+} PgBenchValueType;
NONE is used nowhere, but I think that you could use it for an assertion
check here:
+   if (retval->type == PGBT_INT)
+   fprintf(stderr, "int " INT64_FORMAT "\n",
retval->u.ival);
+   else if (retval->type == PGBT_DOUBLE)
+   fprintf(stderr, "double %f\n",
retval->u.dval);
+   else
+   fprintf(stderr, "none\n");
Just replace the "none" message by Assert(type != PGBT_NONE) for example.

Another remaining item: should support for \setrandom be dropped? As the
patch is presented this remains intact.
-- 
Michael

Re: [HACKERS] extend pgbench expressions with functions

2016-01-31 Thread Fabien COELHO



v22 compared to previous:
 - remove the short macros (although IMO it is a code degradation)
 - try not to remove/add blanks lines
 - let some assert "as is"
 - still exit on float to int overflow, see arguments in other mails
 - check for INT64_MIN / -1 (although I think it is useless)

--
Fabien.diff --git a/doc/src/sgml/ref/pgbench.sgml b/doc/src/sgml/ref/pgbench.sgml
index 42d0667..d42208a 100644
--- a/doc/src/sgml/ref/pgbench.sgml
+++ b/doc/src/sgml/ref/pgbench.sgml
@@ -796,17 +796,21 @@ pgbench  options  dbname
   Sets variable varname to an integer value calculated
   from expression.
   The expression may contain integer constants such as 5432,
-  references to variables :variablename,
+  double constants such as 3.14159,
+  references to integer variables :variablename,
   and expressions composed of unary (-) or binary operators
-  (+, -, *, /, %)
-  with their usual associativity, and parentheses.
+  (+, -, *, /,
+  %) with their usual associativity, function calls and
+  parentheses.
+   shows the available
+  functions.
  
 
  
   Examples:
 
 \set ntellers 10 * :scale
-\set aid (1021 * :aid) % (10 * :scale) + 1
+\set aid (1021 * random(1, 10 * :scale)) % (10 * :scale) + 1
 
 

@@ -826,66 +830,35 @@ pgbench  options  dbname
  
 
  
-  By default, or when uniform is specified, all values in the
-  range are drawn with equal probability.  Specifying gaussian
-  or  exponential options modifies this behavior; each
-  requires a mandatory parameter which determines the precise shape of the
-  distribution.
- 
+  
+   
+
+ \setrandom n 1 10 or \setrandom n 1 10 uniform
+ is equivalent to \set n random(1, 10) and uses a uniform
+ distribution.
+
+   
 
- 
-  For a Gaussian distribution, the interval is mapped onto a standard
-  normal distribution (the classical bell-shaped Gaussian curve) truncated
-  at -parameter on the left and +parameter
-  on the right.
-  Values in the middle of the interval are more likely to be drawn.
-  To be precise, if PHI(x) is the cumulative distribution
-  function of the standard normal distribution, with mean mu
-  defined as (max + min) / 2.0, with
-
- f(x) = PHI(2.0 * parameter * (x - mu) / (max - min + 1)) /
-(2.0 * PHI(parameter) - 1.0)
-
-  then value i between min and
-  max inclusive is drawn with probability:
-  f(i + 0.5) - f(i - 0.5).
-  Intuitively, the larger parameter, the more
-  frequently values close to the middle of the interval are drawn, and the
-  less frequently values close to the min and
-  max bounds. About 67% of values are drawn from the
-  middle 1.0 / parameter, that is a relative
-  0.5 / parameter around the mean, and 95% in the middle
-  2.0 / parameter, that is a relative
-  1.0 / parameter around the mean; for instance, if
-  parameter is 4.0, 67% of values are drawn from the
-  middle quarter (1.0 / 4.0) of the interval (i.e. from
-  3.0 / 8.0 to 5.0 / 8.0) and 95% from
-  the middle half (2.0 / 4.0) of the interval (second and
-  third quartiles). The minimum parameter is 2.0 for
-  performance of the Box-Muller transform.
- 
+  
+   
+\setrandom n 1 10 exponential 3.0 is equivalent to
+\set n random_exponential(1, 10, 3.0) and uses an
+exponential distribution.
+   
+  
 
- 
-  For an exponential distribution, parameter
-  controls the distribution by truncating a quickly-decreasing
-  exponential distribution at parameter, and then
-  projecting onto integers between the bounds.
-  To be precise, with
-
-f(x) = exp(-parameter * (x - min) / (max - min + 1)) / (1.0 - exp(-parameter))
-
-  Then value i between min and
-  max inclusive is drawn with probability:
-  f(x) - f(x + 1).
-  Intuitively, the larger parameter, the more
-  frequently values close to min are accessed, and the
-  less frequently values close to max are accessed.
-  The closer to 0 parameter, the flatter (more uniform)
-  the access distribution.
-  A crude approximation of the distribution is that the most frequent 1%
-  values in the range, close to min, are drawn
-  parameter% of the time.
-  parameter value must be strictly positive.
+  
+   
+\setrandom n 1 10 gaussian 2.0 is equivalent to
+\set n random_gaussian(1, 10, 2.0), and uses a gaussian
+distribution.
+   
+  
+ 
+
+   See the documentation of these functions below for further information
+   about the precise shape of these distributions, depending on the value
+   of the parameter.
  
 
  
@@ -965,18 +938,184 @@ f(x) = exp(-parameter * (x - min) / (max - min + 1)) / (1.0 - exp(-parameter))

   
 
+   
+   
+PgBench Functions
+

Re: [HACKERS] extend pgbench expressions with functions

2016-01-30 Thread Michael Paquier

On Fri, Jan 29, 2016 at 11:21 PM, Fabien COELHO  wrote:
> +/* overflow check (needed for INT64_MIN) */
> +if (lval != 0 && (*retval < 0 == lval < 0))
>
> Why not use "if (lval == INT64_MIN)" instead of this complicated condition?
> If it is really needed for some reason, I think that a comment could help.

Checking for PG_INT64_MIN only would be fine as well, so let's do so.
I thought honestly that we had better check if the result and the left
argument are not of the same sign, but well.
-- 
Michael
diff --git a/src/bin/pgbench/pgbench.c b/src/bin/pgbench/pgbench.c
index d5f242c..6a17990 100644
--- a/src/bin/pgbench/pgbench.c
+++ b/src/bin/pgbench/pgbench.c
@@ -967,7 +967,28 @@ evaluateExpr(CState *st, PgBenchExpr *expr, int64 *retval)
 			fprintf(stderr, "division by zero\n");
 			return false;
 		}
-		*retval = lval / rval;
+		/*
+		 * INT64_MIN / -1 is problematic, since the result
+		 * can't be represented on a two's-complement
+		 * machine. Some machines produce INT64_MIN, some
+		 * produce zero, some throw an exception. We can
+		 * dodge the problem by recognizing that division
+		 * by -1 is the same as negation.
+		 */
+		if (rval == -1)
+		{
+			*retval = -lval;
+
+			/* overflow check (needed for INT64_MIN) */
+			if (lval == PG_INT64_MIN)
+			{
+fprintf(stderr, "bigint out of range\n");
+return false;
+			}
+		}
+		else
+			*retval = lval / rval;
+
 		return true;
 
 	case '%':
@@ -976,7 +997,15 @@ evaluateExpr(CState *st, PgBenchExpr *expr, int64 *retval)
 			fprintf(stderr, "division by zero\n");
 			return false;
 		}
-		*retval = lval % rval;
+		/*
+		 * Some machines throw a floating-point exception
+		 * for INT64_MIN % -1, the correct answer being
+		 * zero in any case.
+		 */
+		if (rval == -1)
+			*retval = 0;
+		else
+			*retval = lval % rval;
 		return true;
 }
 

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] extend pgbench expressions with functions

2016-01-29 Thread Fabien COELHO




Also, a comment is needed to explain why such a bizarre
condition is used/needed for just the INT64_MIN case.


The last patch I sent has this bit:
+  /*
+   * Some machines throw a floating-point exception
+   * for INT64_MIN % -1, the correct answer being
+   * zero in any case.
+   */
How would you reformulate that à-la-Fabien?


This one about modulo is fine.

I was refering to this other one in the division case:

+/* overflow check (needed for INT64_MIN) */
+if (lval != 0 && (*retval < 0 == lval < 0))

Why not use "if (lval == INT64_MIN)" instead of this complicated 
condition? If it is really needed for some reason, I think that a comment 
could help.


--
Fabien.
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] extend pgbench expressions with functions

2016-01-29 Thread Michael Paquier

On Fri, Jan 29, 2016 at 6:05 PM, Fabien COELHO  wrote:
> (instead of < in my previous suggestion, if some processors return 0 on
> -INT64_MIN). Also, a comment is needed to explain why such a bizarre
> condition is used/needed for just the INT64_MIN case.

The last patch I sent has this bit:
+  /*
+   * Some machines throw a floating-point exception
+   * for INT64_MIN % -1, the correct answer being
+   * zero in any case.
+   */
How would you reformulate that à-la-Fabien?
-- 
Michael


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] extend pgbench expressions with functions

2016-01-29 Thread Fabien COELHO




Hi Michaël,


I think it is overkill, but do as you feel.


Perhaps we could have Robert decide on this one first? That's a bug after
all that had better be backpatched.


Fine with me.

[modulo...] Right, forgot this one, we just need to check if rval is -1 
here, and return 0 as result. I am updating the fix as attached.


This looks to me like it works.

I still feel that the condition should be simplified, probably with:

  if (lval < 0 && *resval <= 0) ...

(instead of < in my previous suggestion, if some processors return 0 on 
-INT64_MIN). Also, a comment is needed to explain why such a bizarre 
condition is used/needed for just the INT64_MIN case.


--
Fabien.
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] extend pgbench expressions with functions

2016-01-28 Thread Michael Paquier

On Fri, Jan 29, 2016 at 3:40 PM, Fabien COELHO  wrote:

>
> I would as well suggest fixing first the (INT64_MAX / -1) crash on HEAD
>> and back-branches with something like the patch attached, inspired from
>> int8.c.
>>
>
> I think it is overkill, but do as you feel.
>

Perhaps we could have Robert decide on this one first? That's a bug after
all that had better be backpatched.


> Note that it must also handle modulo, but the code you suggest cannot be
> used for that.
>
>   #include 
>   int main(int argc, char* argv[])
>   {
> int64_t l = INT64_MIN;
> int64_t r = -1;
> int64_t d = l % r;
> return 0;
>   }
>   // => Floating point exception (core dumped)
>

Right, forgot this one, we just need to check if rval is -1 here, and
return 0 as result. I am updating the fix as attached.
-- 
Michael
diff --git a/src/bin/pgbench/pgbench.c b/src/bin/pgbench/pgbench.c
index d5f242c..25b349d 100644
--- a/src/bin/pgbench/pgbench.c
+++ b/src/bin/pgbench/pgbench.c
@@ -967,7 +967,28 @@ evaluateExpr(CState *st, PgBenchExpr *expr, int64 *retval)
 			fprintf(stderr, "division by zero\n");
 			return false;
 		}
-		*retval = lval / rval;
+		/*
+		 * INT64_MIN / -1 is problematic, since the result
+		 * can't be represented on a two's-complement
+		 * machine. Some machines produce INT64_MIN, some
+		 * produce zero, some throw an exception. We can
+		 * dodge the problem by recognizing that division
+		 * by -1 is the same as negation.
+		 */
+		if (rval == -1)
+		{
+			*retval = -lval;
+
+			/* overflow check (needed for INT64_MIN) */
+			if (lval != 0 && (*retval < 0 == lval < 0))
+			{
+fprintf(stderr, "bigint out of range\n");
+return false;
+			}
+		}
+		else
+			*retval = lval / rval;
+
 		return true;
 
 	case '%':
@@ -976,7 +997,15 @@ evaluateExpr(CState *st, PgBenchExpr *expr, int64 *retval)
 			fprintf(stderr, "division by zero\n");
 			return false;
 		}
-		*retval = lval % rval;
+		/*
+		 * Some machines throw a floating-point exception
+		 * for INT64_MIN % -1, the correct answer being
+		 * zero in any case.
+		 */
+		if (rval == -1)
+			*retval = 0;
+		else
+			*retval = lval % rval;
 		return true;
 }
 

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] extend pgbench expressions with functions

2016-01-28 Thread Fabien COELHO



Hello Michaël,

v23 attached, which does not change the message but does the other fixes.


+if (coerceToInt(&lval) == INT64_MIN && coerceToInt(&rval) == -1)
+{
+   fprintf(stderr, "cannot divide INT64_MIN by -1\n");
+   return false;
+}
Bike-shedding: "bigint out of range" to match what is done in the backend?


ISTM that it is clearer for the user to say that the issue is with the 
division? Otherwise the message does not help much. Well, not that it 
would be printed often...



+/*
+ * Recursive evaluation of an expression in a pgbench script
+ * using the current state of variables.
+ * Returns whether the evaluation was ok,
+ * the value itself is returned through the retval pointer.
+ */
Could you reformat this comment?


I can.


fprintf(stderr,
-"exponential parameter must be
greater than zero (not \"%s\")\n",
-argv[5]);
+ "exponential parameter must be greater than
zero (not \"%s\")\n",
+  argv[5]);
This is some unnecessary diff noise.


Indeed.


+setIntValue(retval, getGaussianRand(thread, arg1, arg2,
 dparam));
Some portions of the code are using tabs instead of spaces between
function arguments.


Indeed.

I would as well suggest fixing first the (INT64_MAX / -1) crash on HEAD 
and back-branches with something like the patch attached, inspired from 
int8.c.


I think it is overkill, but do as you feel.

Note that it must also handle modulo, but the code you suggest cannot be 
used for that.


  #include 
  int main(int argc, char* argv[])
  {
int64_t l = INT64_MIN;
int64_t r = -1;
int64_t d = l % r;
return 0;
  }
  // => Floating point exception (core dumped)

ISTM that the second condition can be simplified, as there is no possible 
issue if lval is positive?


   if (lval < 0 && *resval < 0) { ... }

--
Fabien.diff --git a/doc/src/sgml/ref/pgbench.sgml b/doc/src/sgml/ref/pgbench.sgml
index 42d0667..d42208a 100644
--- a/doc/src/sgml/ref/pgbench.sgml
+++ b/doc/src/sgml/ref/pgbench.sgml
@@ -796,17 +796,21 @@ pgbench  options  dbname
   Sets variable varname to an integer value calculated
   from expression.
   The expression may contain integer constants such as 5432,
-  references to variables :variablename,
+  double constants such as 3.14159,
+  references to integer variables :variablename,
   and expressions composed of unary (-) or binary operators
-  (+, -, *, /, %)
-  with their usual associativity, and parentheses.
+  (+, -, *, /,
+  %) with their usual associativity, function calls and
+  parentheses.
+   shows the available
+  functions.
  
 
  
   Examples:
 
 \set ntellers 10 * :scale
-\set aid (1021 * :aid) % (10 * :scale) + 1
+\set aid (1021 * random(1, 10 * :scale)) % (10 * :scale) + 1
 
 

@@ -826,66 +830,35 @@ pgbench  options  dbname
  
 
  
-  By default, or when uniform is specified, all values in the
-  range are drawn with equal probability.  Specifying gaussian
-  or  exponential options modifies this behavior; each
-  requires a mandatory parameter which determines the precise shape of the
-  distribution.
- 
+  
+   
+
+ \setrandom n 1 10 or \setrandom n 1 10 uniform
+ is equivalent to \set n random(1, 10) and uses a uniform
+ distribution.
+
+   
 
- 
-  For a Gaussian distribution, the interval is mapped onto a standard
-  normal distribution (the classical bell-shaped Gaussian curve) truncated
-  at -parameter on the left and +parameter
-  on the right.
-  Values in the middle of the interval are more likely to be drawn.
-  To be precise, if PHI(x) is the cumulative distribution
-  function of the standard normal distribution, with mean mu
-  defined as (max + min) / 2.0, with
-
- f(x) = PHI(2.0 * parameter * (x - mu) / (max - min + 1)) /
-(2.0 * PHI(parameter) - 1.0)
-
-  then value i between min and
-  max inclusive is drawn with probability:
-  f(i + 0.5) - f(i - 0.5).
-  Intuitively, the larger parameter, the more
-  frequently values close to the middle of the interval are drawn, and the
-  less frequently values close to the min and
-  max bounds. About 67% of values are drawn from the
-  middle 1.0 / parameter, that is a relative
-  0.5 / parameter around the mean, and 95% in the middle
-  2.0 / parameter, that is a relative
-  1.0 / parameter around the mean; for instance, if
-  parameter is 4.0, 67% of values are drawn from the
-  middle quarter (1.0 / 4.0) of the interval (i.e. from
-  3.0 / 8.0 to 5.0 / 8.0) and 95% from
-  the middle half (2.0 / 4.0) of the interval (second and
-  third quartiles). The minimum parameter is 2.0 for
-  performance of the Box-Muller trans

Re: [HACKERS] extend pgbench expressions with functions

2016-01-28 Thread Fabien COELHO





v22 compared to previous:
 - remove the short macros (although IMO it is a code degradation)
 - try not to remove/add blanks lines
 - let some assert "as is"
 - still exit on float to int overflow, see arguments in other mails
 - check for INT64_MIN / -1 (although I think it is useless)

--
Fabien.diff --git a/doc/src/sgml/ref/pgbench.sgml b/doc/src/sgml/ref/pgbench.sgml
index 42d0667..d42208a 100644
--- a/doc/src/sgml/ref/pgbench.sgml
+++ b/doc/src/sgml/ref/pgbench.sgml
@@ -796,17 +796,21 @@ pgbench  options  dbname
   Sets variable varname to an integer value calculated
   from expression.
   The expression may contain integer constants such as 5432,
-  references to variables :variablename,
+  double constants such as 3.14159,
+  references to integer variables :variablename,
   and expressions composed of unary (-) or binary operators
-  (+, -, *, /, %)
-  with their usual associativity, and parentheses.
+  (+, -, *, /,
+  %) with their usual associativity, function calls and
+  parentheses.
+   shows the available
+  functions.
  
 
  
   Examples:
 
 \set ntellers 10 * :scale
-\set aid (1021 * :aid) % (10 * :scale) + 1
+\set aid (1021 * random(1, 10 * :scale)) % (10 * :scale) + 1
 
 

@@ -826,66 +830,35 @@ pgbench  options  dbname
  
 
  
-  By default, or when uniform is specified, all values in the
-  range are drawn with equal probability.  Specifying gaussian
-  or  exponential options modifies this behavior; each
-  requires a mandatory parameter which determines the precise shape of the
-  distribution.
- 
+  
+   
+
+ \setrandom n 1 10 or \setrandom n 1 10 uniform
+ is equivalent to \set n random(1, 10) and uses a uniform
+ distribution.
+
+   
 
- 
-  For a Gaussian distribution, the interval is mapped onto a standard
-  normal distribution (the classical bell-shaped Gaussian curve) truncated
-  at -parameter on the left and +parameter
-  on the right.
-  Values in the middle of the interval are more likely to be drawn.
-  To be precise, if PHI(x) is the cumulative distribution
-  function of the standard normal distribution, with mean mu
-  defined as (max + min) / 2.0, with
-
- f(x) = PHI(2.0 * parameter * (x - mu) / (max - min + 1)) /
-(2.0 * PHI(parameter) - 1.0)
-
-  then value i between min and
-  max inclusive is drawn with probability:
-  f(i + 0.5) - f(i - 0.5).
-  Intuitively, the larger parameter, the more
-  frequently values close to the middle of the interval are drawn, and the
-  less frequently values close to the min and
-  max bounds. About 67% of values are drawn from the
-  middle 1.0 / parameter, that is a relative
-  0.5 / parameter around the mean, and 95% in the middle
-  2.0 / parameter, that is a relative
-  1.0 / parameter around the mean; for instance, if
-  parameter is 4.0, 67% of values are drawn from the
-  middle quarter (1.0 / 4.0) of the interval (i.e. from
-  3.0 / 8.0 to 5.0 / 8.0) and 95% from
-  the middle half (2.0 / 4.0) of the interval (second and
-  third quartiles). The minimum parameter is 2.0 for
-  performance of the Box-Muller transform.
- 
+  
+   
+\setrandom n 1 10 exponential 3.0 is equivalent to
+\set n random_exponential(1, 10, 3.0) and uses an
+exponential distribution.
+   
+  
 
- 
-  For an exponential distribution, parameter
-  controls the distribution by truncating a quickly-decreasing
-  exponential distribution at parameter, and then
-  projecting onto integers between the bounds.
-  To be precise, with
-
-f(x) = exp(-parameter * (x - min) / (max - min + 1)) / (1.0 - exp(-parameter))
-
-  Then value i between min and
-  max inclusive is drawn with probability:
-  f(x) - f(x + 1).
-  Intuitively, the larger parameter, the more
-  frequently values close to min are accessed, and the
-  less frequently values close to max are accessed.
-  The closer to 0 parameter, the flatter (more uniform)
-  the access distribution.
-  A crude approximation of the distribution is that the most frequent 1%
-  values in the range, close to min, are drawn
-  parameter% of the time.
-  parameter value must be strictly positive.
+  
+   
+\setrandom n 1 10 gaussian 2.0 is equivalent to
+\set n random_gaussian(1, 10, 2.0), and uses a gaussian
+distribution.
+   
+  
+ 
+
+   See the documentation of these functions below for further information
+   about the precise shape of these distributions, depending on the value
+   of the parameter.
  
 
  
@@ -965,18 +938,184 @@ f(x) = exp(-parameter * (x - min) / (max - min + 1)) / (1.0 - exp(-parameter))

   
 
+   
+   
+PgBench Functions
+

Re: [HACKERS] extend pgbench expressions with functions

2016-01-28 Thread Fabien COELHO




So I'm arguing that exiting, with an error message, is better than handling
user errors.


I'm not objecting to exiting with an error message, but I think
letting ourselves be killed by a signal is no good.


Ok, I understand this point for this purpose.

--
Fabien.


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] extend pgbench expressions with functions

2016-01-28 Thread Robert Haas

On Thu, Jan 28, 2016 at 1:36 AM, Fabien COELHO  wrote:
> So I'm arguing that exiting, with an error message, is better than handling
> user errors.

I'm not objecting to exiting with an error message, but I think
letting ourselves be killed by a signal is no good.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] extend pgbench expressions with functions

2016-01-28 Thread Fabien COELHO




I do not think that it is really worth fixing, but I will not prevent anyone
to fix it.


I still think it does. Well, if there is consensus to address this one
and optionally the other integer overflows even on back branches, I'll
write a patch and let's call that a deal. This is not a problem from
my side.


My point is just about the cost-benefit of fixing a low probability issue 
that you can only encounter if you are looking for it, and not with any 
reasonable bench script.


Now adding somewhere a test might just help closing the subject and 
do more useful things, so that would also be a win.


  /* these would raise an arithmetic error */
  if (lval == INT64_MIN && rval == -1)
  {
 fprintf(stderr, "cannot divide or modulo INT64_MIN by -1\n");
 return false;
  }

This may be backpatched to old supported versions.

--
Fabien.


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] extend pgbench expressions with functions

2016-01-27 Thread Fabien COELHO



Hello Robert,


Pgbench is a bench tool, not a production tool.


I don't really see how that's relevant.


My point is that I do not see any added value for pgbench to keep on 
executing a performance bench if some clients die due to script errors: it 
is more useful to stop the whole bench and report the issue, so the user 
can fix their script, than to keep going on with the remaining clients, 
from a benchmarking perspective.


So I'm arguing that exiting, with an error message, is better than 
handling user errors.


When I run a program and it dies after causing the operating system to 
send it a fatal signal, I say to myself "self, that program has a bug". 
Other people may have different reactions, but that's mine.


I was talking about an exit call generated on a float to int conversion 
error when there is an error in the user script. The bug is in the user 
script and this is clearly reported by pgbench.


However, your argument may be relevant for avoiding fatal signal such as 
generated by INT64_MAX / -1, because on this one the message error is 
terse so how to fix the issue is not clear to the user.


--
Fabien.


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] extend pgbench expressions with functions

2016-01-27 Thread Robert Haas

On Wed, Jan 27, 2016 at 5:43 PM, Fabien COELHO  wrote:
> Pgbench is a bench tool, not a production tool.

I don't really see how that's relevant.  When I run a program and it
dies after causing the operating system to send it a fatal signal, I
say to myself "self, that program has a bug".  Other people may have
different reactions, but that's mine.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] extend pgbench expressions with functions

2016-01-27 Thread Michael Paquier

On Thu, Jan 28, 2016 at 7:07 AM, Fabien COELHO  wrote:
> I do not think that it is really worth fixing, but I will not prevent anyone
> to fix it.

I still think it does. Well, if there is consensus to address this one
and optionally the other integer overflows even on back branches, I'll
write a patch and let's call that a deal. This is not a problem from
my side.
-- 
Michael


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] extend pgbench expressions with functions

2016-01-27 Thread Fabien COELHO



Hello Robert,


Attached is a rebase after recent changes in pgbench code & doc.


+/* use short names in the evaluator */
+#define INT(v) coerceToInt(&v)
+#define DOUBLE(v) coerceToDouble(&v)
+#define SET_INT(pv, ival) setIntValue(pv, ival)
+#define SET_DOUBLE(pv, dval) setDoubleValue(pv, dval)

I don't like this at all.  It seems to me that this really obscures
the code.  The few extra characters are a small price to pay for not
having to go look up the macro definition to understand what the code
is doing.


Hmmm. Postgres indentation rules for "switch" are peculiar to say the 
least and make it hard to write code that stay under 80 columns. The 
coerceToInt function name looks pretty long (I would rather have 
toInt/toDbl/setInt/setDbl) but I was "told" to use that, so I'm trying to 
find a tradeoff with a macro. Obviously I can substitude and have rather 
long lines that I personally find much uglier.



The third hunk in pgbench.c unnecessary deletes a blank line.


Yep, that is possible.


   /*
* inner expresion in (cut, 1] (if parameter > 0), rand in [0, 1)
+* Assert((1.0 - cut) != 0.0);
*/
-   Assert((1.0 - cut) != 0.0);
   rand = -log(cut + (1.0 - cut) * uniform) / parameter;
+

Moving the Assert() into the comment seems like a bad plan.  If the
Assert is true, it shouldn't be commented out.  If it's not, it
shouldn't be there at all.


I put this assertion when I initially wrote this code, but I think that it 
is proven so I moved it in comment just as a reminder for someone who 
might touch anything that this must hold.



Commit e41beea0ddb74ef975f08b917a354ec33cb60830, which you wrote, went
to some trouble to display good context for error messages.  What you
have here seems like a huge step backwards:

+   fprintf(stderr, "double to int overflow for
%f\n", dval);
+   exit(1);

So, we're just going to give up on all of that error context reporting
that we added back then?  That would be sad.


Well, I'm a lazy programmer, so I'm trying to measure the benefit. IMO 
there is no benefit to better manage this case, especially as the various 
solution I thought of where either ugly and/or had a significant impact on 
the code.


Note that in the best case the error would be detected and reported and 
the client is stopped, and other clients go on... But then, if you started 
a bench and some clients die while running probably your results are 
meaningless, so my opinion is that you are better off with an exit than
with some message that you may miss and performance results computed with 
much less clients than you asked for.


Pgbench is a bench tool, not a production tool.

--
Fabien.


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] extend pgbench expressions with functions

2016-01-27 Thread Fabien COELHO



Hello Robert,


And this one generates a core dump:
\set cid debug(-9223372036854775808 / -1)
Floating point exception: 8 (core dumped)


That does not seem acceptable to me. I don't want pgbench to die a 
horrible death if a floating point exception occurs any more than I want 
the server to do the same.  I want the error to be properly caught and 
reported.


Please note that this issue/bug/feature/whatever already exists just for 
these two values (1/2**128 probability), it is not related to this patch 
about functions.


  sh> pgbench --version
  pgbench (PostgreSQL) 9.5.0

  sh> cat div.sql
  \set i -9223372036854775807
  \set i :i - 1
  \set i :i / -1

  sh> pgbench -f div.sql
  starting vacuum...end.
  Floating point exception (core dumped)

I do not think that it is really worth fixing, but I will not prevent 
anyone to fix it.


--
Fabien.


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] extend pgbench expressions with functions

2016-01-27 Thread Robert Haas

On Wed, Jan 27, 2016 at 3:39 AM, Fabien COELHO  wrote:
> Attached is a rebase after recent changes in pgbench code & doc.

+/* use short names in the evaluator */
+#define INT(v) coerceToInt(&v)
+#define DOUBLE(v) coerceToDouble(&v)
+#define SET_INT(pv, ival) setIntValue(pv, ival)
+#define SET_DOUBLE(pv, dval) setDoubleValue(pv, dval)

I don't like this at all.  It seems to me that this really obscures
the code.  The few extra characters are a small price to pay for not
having to go look up the macro definition to understand what the code
is doing.

The third hunk in pgbench.c unnecessary deletes a blank line.

/*
 * inner expresion in (cut, 1] (if parameter > 0), rand in [0, 1)
+* Assert((1.0 - cut) != 0.0);
 */
-   Assert((1.0 - cut) != 0.0);
rand = -log(cut + (1.0 - cut) * uniform) / parameter;
+

Moving the Assert() into the comment seems like a bad plan.  If the
Assert is true, it shouldn't be commented out.  If it's not, it
shouldn't be there at all.

Commit e41beea0ddb74ef975f08b917a354ec33cb60830, which you wrote, went
to some trouble to display good context for error messages.  What you
have here seems like a huge step backwards:

+   fprintf(stderr, "double to int overflow for
%f\n", dval);
+   exit(1);

So, we're just going to give up on all of that error context reporting
that we added back then?  That would be sad.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] extend pgbench expressions with functions

2016-01-27 Thread Robert Haas

On Sat, Jan 16, 2016 at 1:10 PM, Fabien COELHO  wrote:
> All these results are fine from my point of view.
>
>> And this one generates a core dump:
>> \set cid debug(-9223372036854775808 / -1)
>> Floating point exception: 8 (core dumped)

That does not seem acceptable to me.  I don't want pgbench to die a
horrible death if a floating point exception occurs any more than I
want the server to do the same.  I want the error to be properly
caught and reported.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] extend pgbench expressions with functions

2016-01-27 Thread Fabien COELHO




OK, so I had an extra look at this patch and I am marking it as ready
for committer.


Ok.


Attached is a rebase after recent changes in pgbench code & doc.

--
Fabien.diff --git a/doc/src/sgml/ref/pgbench.sgml b/doc/src/sgml/ref/pgbench.sgml
index 42d0667..d42208a 100644
--- a/doc/src/sgml/ref/pgbench.sgml
+++ b/doc/src/sgml/ref/pgbench.sgml
@@ -796,17 +796,21 @@ pgbench  options  dbname
   Sets variable varname to an integer value calculated
   from expression.
   The expression may contain integer constants such as 5432,
-  references to variables :variablename,
+  double constants such as 3.14159,
+  references to integer variables :variablename,
   and expressions composed of unary (-) or binary operators
-  (+, -, *, /, %)
-  with their usual associativity, and parentheses.
+  (+, -, *, /,
+  %) with their usual associativity, function calls and
+  parentheses.
+   shows the available
+  functions.
  
 
  
   Examples:
 
 \set ntellers 10 * :scale
-\set aid (1021 * :aid) % (10 * :scale) + 1
+\set aid (1021 * random(1, 10 * :scale)) % (10 * :scale) + 1
 
 

@@ -826,66 +830,35 @@ pgbench  options  dbname
  
 
  
-  By default, or when uniform is specified, all values in the
-  range are drawn with equal probability.  Specifying gaussian
-  or  exponential options modifies this behavior; each
-  requires a mandatory parameter which determines the precise shape of the
-  distribution.
- 
+  
+   
+
+ \setrandom n 1 10 or \setrandom n 1 10 uniform
+ is equivalent to \set n random(1, 10) and uses a uniform
+ distribution.
+
+   
 
- 
-  For a Gaussian distribution, the interval is mapped onto a standard
-  normal distribution (the classical bell-shaped Gaussian curve) truncated
-  at -parameter on the left and +parameter
-  on the right.
-  Values in the middle of the interval are more likely to be drawn.
-  To be precise, if PHI(x) is the cumulative distribution
-  function of the standard normal distribution, with mean mu
-  defined as (max + min) / 2.0, with
-
- f(x) = PHI(2.0 * parameter * (x - mu) / (max - min + 1)) /
-(2.0 * PHI(parameter) - 1.0)
-
-  then value i between min and
-  max inclusive is drawn with probability:
-  f(i + 0.5) - f(i - 0.5).
-  Intuitively, the larger parameter, the more
-  frequently values close to the middle of the interval are drawn, and the
-  less frequently values close to the min and
-  max bounds. About 67% of values are drawn from the
-  middle 1.0 / parameter, that is a relative
-  0.5 / parameter around the mean, and 95% in the middle
-  2.0 / parameter, that is a relative
-  1.0 / parameter around the mean; for instance, if
-  parameter is 4.0, 67% of values are drawn from the
-  middle quarter (1.0 / 4.0) of the interval (i.e. from
-  3.0 / 8.0 to 5.0 / 8.0) and 95% from
-  the middle half (2.0 / 4.0) of the interval (second and
-  third quartiles). The minimum parameter is 2.0 for
-  performance of the Box-Muller transform.
- 
+  
+   
+\setrandom n 1 10 exponential 3.0 is equivalent to
+\set n random_exponential(1, 10, 3.0) and uses an
+exponential distribution.
+   
+  
 
- 
-  For an exponential distribution, parameter
-  controls the distribution by truncating a quickly-decreasing
-  exponential distribution at parameter, and then
-  projecting onto integers between the bounds.
-  To be precise, with
-
-f(x) = exp(-parameter * (x - min) / (max - min + 1)) / (1.0 - exp(-parameter))
-
-  Then value i between min and
-  max inclusive is drawn with probability:
-  f(x) - f(x + 1).
-  Intuitively, the larger parameter, the more
-  frequently values close to min are accessed, and the
-  less frequently values close to max are accessed.
-  The closer to 0 parameter, the flatter (more uniform)
-  the access distribution.
-  A crude approximation of the distribution is that the most frequent 1%
-  values in the range, close to min, are drawn
-  parameter% of the time.
-  parameter value must be strictly positive.
+  
+   
+\setrandom n 1 10 gaussian 2.0 is equivalent to
+\set n random_gaussian(1, 10, 2.0), and uses a gaussian
+distribution.
+   
+  
+ 
+
+   See the documentation of these functions below for further information
+   about the precise shape of these distributions, depending on the value
+   of the parameter.
  
 
  
@@ -965,18 +938,184 @@ f(x) = exp(-parameter * (x - min) / (max - min + 1)) / (1.0 - exp(-parameter))

   
 
+   
+   
+PgBench Functions
+
+ 
+  
+   Function
+   Return Type
+   Description
+   Example
+   Result
+  
+ 
+ 
+

Re: [HACKERS] extend pgbench expressions with functions

2016-01-18 Thread Fabien COELHO




OK, so I had an extra look at this patch and I am marking it as ready
for committer.


Ok.


- INT64_MIN / -1 throws a core dump, and errors on HEAD. I think this
should be fixed, Fabien does not.


Yep. Another point about this one is that it is not related to this patch 
about functions.


--
Fabien.


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] extend pgbench expressions with functions

2016-01-18 Thread Fabien COELHO




The basic operator functions also do not check for integer overflows.


This is a feature. I think that they should not check for overflow, as in C,
this is just int64_t arithmetic "as is".


(int64_t is not an available type on Windows btw.)


Possibly. I really meant "64 bits signed integers", whatever its name. 
"int64_t" is the standard C99 name.


Finally I can think of good reason to use overflows deliberately, so I 
think it would argue against such a change.



Could you show up an example? I am curious about that.


The one I can think of is the use of "SUM" to aggregate hashes for 
computing a hash on a table. If SUM would overflow and stop this would 
break the usage. Now there could be a case for having an overflow 
detection on SUM, but that would be another SUM, preferably with a 
distinct name.



\set cid debug(9223372036854775807 * 9223372036854775807)
debug(script=0,command=3): int 1

All these results are fine from my point of view.


On HEAD we are getting similar strange results,


Yep, this is not new.


so I am fine to withdraw but that's still very strange to me.


Arithmetic operator modulo are pretty strange, I can agree with that:-)

The first case is generating -9223372036854775808, the second one 
compiles 9223372036854775807 and the third one generates 1.


This should be the "real" result modulo 2**64, if I'm not mistaken.

Or we make the error checks even more consistent in back-branches, 
perhaps that 's indeed not worth it for a client though.


Yep, that would be another patch.


And this one generates a core dump:
\set cid debug(-9223372036854775808 / -1)
Floating point exception: 8 (core dumped)


This one is funny, but it is a fact of int64_t life: you cannot divide
INT64_MIN by -1 because the result cannot be represented as an int64_t.
This is propably hardcoded in the processor. I do not think it is worth
doing anything about it for pgbench.


This one, on the contrary, is generating an error on HEAD, and your 
patch is causing a crash: value "9223372036854775808" is out of range 
for type bigint That's a regression, no?


Hmmm, yes, somehow, but just for this one value, if you try the next:

pgbench 9.4.5: value "-9223372036854775809" is out of range for type bigint

I guess that the implementation before 9.5 converted 
"-9223372036854775808" as an int, which is INT64_MIN, so it was fine. Now 
it is parsed as "operator uminus" applied to "9223372036854775808", which 
is not fine because this would be INT64_MAX+1, which is not possible.


I would prefer just to neglect that as a very small (1/2**64) feature 
change rather than a meaningful regression, especially as the coding 
effort to fix this is significant and the value of handling it differently 
is nought.


I am uncomfortable with the fact of letting such holes in the code, even 
if that's a client application.


This is a 2**128 probability case which stops pgbench. It is obviously 
possible to add a check to catch it, and then generate an error message, 
but I would rather just ignore it and let pgbench stop on that.


--
Fabien.


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] extend pgbench expressions with functions

2016-01-17 Thread Michael Paquier

On Mon, Jan 18, 2016 at 10:07 AM, Michael Paquier
 wrote:
> On Sun, Jan 17, 2016 at 3:10 AM, Fabien COELHO  wrote:
>> I put the function evaluation in a function in the attached version.
>
> Thanks, this makes the code a bit clearer.

OK, so I had an extra look at this patch and I am marking it as ready
for committer. A couple of things to be aware of, and the result of
this thread with the patch in its current state:
- The patch is keeping \setrandom. Fabien and I are agreeing to purely
remove it, though it is kept in the patch because it is easier to
remove existing code rather than add it again per Fabien's concerns.
- INT64_MIN / -1 throws a core dump, and errors on HEAD. I think this
should be fixed, Fabien does not.
- There are not many overflow checks for the exiting int64 operators
and functions. HEAD doesn't do much, this patch makes it the situation
a bit worse even if there are a couple of checks for int() for
example. We do not do any checks for sqrt(-N) (N > 0) for example.
- It may be more interesting to have the function execution code into
a separate file for clarity. Not mandatory though.
Except those comments, all the other issues have been addressed. I
think this is a great patch, and greatly improves the extensibility of
pgbench.
-- 
Michael


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] extend pgbench expressions with functions

2016-01-17 Thread Michael Paquier

On Sun, Jan 17, 2016 at 3:10 AM, Fabien COELHO  wrote:
>> With this example:
>> \set cid debug(sqrt(-1))
>> I get that:
>> debug(script=0,command=1): double nan
>> An error would be more logical, no?
>
>
> If "sqrt(-1)" as a double is Nan for the computer, I'm fine with that. It
> makes the code simpler to just let the math library do its stuff and not
> bother.

Hm. OK..

>> The basic operator functions also do not check for integer overflows.
>
>
> This is a feature. I think that they should not check for overflow, as in C,
> this is just int64_t arithmetic "as is".

(int64_t is not an available type on Windows btw.)

> Moreover, it would be a new feature to add such a check if desirable, so it
> would belong to another patch, it is not related to adding functions.
> The addition already overflows in the current code.
>
> Finally I can think of good reason to use overflows deliberately, so I think
> it would argue against such a change.

Could you show up an example? I am curious about that.

>> Those three ones are just overflowing:
>> \set cid debug(9223372036854775807 + 1)
>> \set cid debug(-9223372036854775808 - 1)
>> \set cid debug(9223372036854775807 * 9223372036854775807)
>> debug(script=0,command=1): int -9223372036854775807
>> debug(script=0,command=2): int 9223372036854775807
>> debug(script=0,command=3): int 1
> All these results are fine from my point of view.

On HEAD we are getting similar strange results, so I am fine to
withdraw but that's still very strange to me. The first case is
generating -9223372036854775808, the second one compiles
9223372036854775807 and the third one generates 1. Or we make the
error checks even more consistent in back-branches, perhaps that 's
indeed not worth it for a client though.

>> And this one generates a core dump:
>> \set cid debug(-9223372036854775808 / -1)
>> Floating point exception: 8 (core dumped)
>
> This one is funny, but it is a fact of int64_t life: you cannot divide
> INT64_MIN by -1 because the result cannot be represented as an int64_t.
> This is propably hardcoded in the processor. I do not think it is worth
> doing anything about it for pgbench.

This one, on the contrary, is generating an error on HEAD, and your
patch is causing a crash:
value "9223372036854775808" is out of range for type bigint
That's a regression, no? I am uncomfortable with the fact of letting
such holes in the code, even if that's a client application.

>> A more general comment: what about splitting all the execution
>> functions into a separate file exprexec.c? evaluateExpr (renamed as
>> execExpr) is the root function, but then we have a set of static
>> sub-functions for each node, like execExprFunc, execExprVar,
>> execExprConst, etc?
>
> I do not see a strong case for renaming. The function part could be split
> because of the indentation, though.

The split makes sense to me regarding the fact that we have function
parsing, execution and the main code paths in different files. That's
not far from the backend that does similar split actually.

>> This way we would save a bit of tab-indentation, this patch making the new
>> code lines becoming larger than 80 characters because of all the switch/case
>> stuff that gets more complicated.
>
> I agree that the code is pretty ugly, but this is partly due to postgres
> indentation rules for switch which are *NOT* reasonnable, IMO.
>
> I put the function evaluation in a function in the attached version.

Thanks, this makes the code a bit clearer.
-- 
Michael


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] extend pgbench expressions with functions

2016-01-16 Thread Fabien COELHO



Hello Michaël,


+   uniformly-distributed random integer in [lb,ub]



Nitpick: when defining an interval like that, you may want to add a
space after the comma.


Why not.


+  /* beware that the list is reverse in make_func */
s/reverse/reversed/?


Indeed.


+
#ifdef DEBUG
Some noise.


Ok.


With this example:
\set cid debug(sqrt(-1))
I get that:
debug(script=0,command=1): double nan
An error would be more logical, no?


If "sqrt(-1)" as a double is Nan for the computer, I'm fine with that. It 
makes the code simpler to just let the math library do its stuff and not 
bother.



You want to emulate with complex numbers instead?


Nope.


The basic operator functions also do not check for integer overflows.


This is a feature. I think that they should not check for overflow, as in 
C, this is just int64_t arithmetic "as is".


Moreover, it would be a new feature to add such a check if desirable, so 
it would belong to another patch, it is not related to adding functions.

The addition already overflows in the current code.

Finally I can think of good reason to use overflows deliberately, so I 
think it would argue against such a change.



Those three ones are just overflowing:
\set cid debug(9223372036854775807 + 1)
\set cid debug(-9223372036854775808 - 1)
\set cid debug(9223372036854775807 * 9223372036854775807)
debug(script=0,command=1): int -9223372036854775807
debug(script=0,command=2): int 9223372036854775807
debug(script=0,command=3): int 1


All these results are fine from my point of view.


And this one generates a core dump:
\set cid debug(-9223372036854775808 / -1)
Floating point exception: 8 (core dumped)


This one is funny, but it is a fact of int64_t life: you cannot divide 
INT64_MIN by -1 because the result cannot be represented as an int64_t.
This is propably hardcoded in the processor. I do not think it is worth 
doing anything about it for pgbench.



A more general comment: what about splitting all the execution
functions into a separate file exprexec.c? evaluateExpr (renamed as
execExpr) is the root function, but then we have a set of static
sub-functions for each node, like execExprFunc, execExprVar,
execExprConst, etc?


I do not see a strong case for renaming. The function part could be split 
because of the indentation, though.


This way we would save a bit of tab-indentation, this patch making the 
new code lines becoming larger than 80 characters because of all the 
switch/case stuff that gets more complicated.


I agree that the code is pretty ugly, but this is partly due to postgres 
indentation rules for switch which are *NOT* reasonnable, IMO.


I put the function evaluation in a function in the attached version.

--
Fabien.diff --git a/doc/src/sgml/ref/pgbench.sgml b/doc/src/sgml/ref/pgbench.sgml
index 541d17b..0767b46 100644
--- a/doc/src/sgml/ref/pgbench.sgml
+++ b/doc/src/sgml/ref/pgbench.sgml
@@ -771,17 +771,21 @@ pgbench  options  dbname
   Sets variable varname to an integer value calculated
   from expression.
   The expression may contain integer constants such as 5432,
-  references to variables :variablename,
+  double constants such as 3.14159,
+  references to integer variables :variablename,
   and expressions composed of unary (-) or binary operators
-  (+, -, *, /, %)
-  with their usual associativity, and parentheses.
+  (+, -, *, /,
+  %) with their usual associativity, function calls and
+  parentheses.
+   shows the available
+  functions.
  
 
  
   Examples:
 
 \set ntellers 10 * :scale
-\set aid (1021 * :aid) % (10 * :scale) + 1
+\set aid (1021 * random(1, 10 * :scale)) % (10 * :scale) + 1
 
 

@@ -801,66 +805,35 @@ pgbench  options  dbname
  
 
  
-  By default, or when uniform is specified, all values in the
-  range are drawn with equal probability.  Specifying gaussian
-  or  exponential options modifies this behavior; each
-  requires a mandatory parameter which determines the precise shape of the
-  distribution.
- 
+  
+   
+
+ \setrandom n 1 10 or \setrandom n 1 10 uniform
+ is equivalent to \set n random(1, 10) and uses a uniform
+ distribution.
+
+   
 
- 
-  For a Gaussian distribution, the interval is mapped onto a standard
-  normal distribution (the classical bell-shaped Gaussian curve) truncated
-  at -parameter on the left and +parameter
-  on the right.
-  Values in the middle of the interval are more likely to be drawn.
-  To be precise, if PHI(x) is the cumulative distribution
-  function of the standard normal distribution, with mean mu
-  defined as (max + min) / 2.0, with
-
- f(x) = PHI(2.0 * parameter * (x - mu) / (max - min + 1)) /
-(2.0 * PHI(parameter) - 1.0)
-
-  then value i between min and
-  max inclusive is drawn with probability:
-  f(i + 0.5) - f(i - 0.5).
-  Intuitively, the

Re: [HACKERS] extend pgbench expressions with functions

2016-01-16 Thread Michael Paquier

On Fri, Jan 15, 2016 at 11:53 PM, Fabien COELHO  wrote:
> Here is a v19 :
>  - avoid noisy changes
>  - abort on double->int overflow
>  - implement operators as functions
>
> There is still \setrandom, that I can remove easily with a green light.

(I am not sure why *$%"# gmail broke the thread in my last email)

Thanks for the new patch and replacing the operator stuff by functions.

+   uniformly-distributed random integer in [lb,ub]
Nitpick: when defining an interval like that, you may want to add a
space after the comma. For example seg.sgml does that. It would be
good to be consistent even here. And actually you wrote [ub, lb] in
two places, this should have been reversed.

+  /* beware that the list is reverse in make_func */
s/reverse/reversed/?

}
+
 #ifdef DEBUG
Some noise.

With this example:
\set cid debug(sqrt(-1))
I get that:
debug(script=0,command=1): double nan
An error would be more logical, no? You want to emulate with complex
numbers instead?

The basic operator functions also do not check for integer overflows.
Those three ones are just overflowing:
\set cid debug(9223372036854775807 + 1)
\set cid debug(-9223372036854775808 - 1)
\set cid debug(9223372036854775807 * 9223372036854775807)
debug(script=0,command=1): int -9223372036854775807
debug(script=0,command=2): int 9223372036854775807
debug(script=0,command=3): int 1
And this one generates a core dump:
\set cid debug(-9223372036854775808 / -1)
Floating point exception: 8 (core dumped)

A more general comment: what about splitting all the execution
functions into a separate file exprexec.c? evaluateExpr (renamed as
execExpr) is the root function, but then we have a set of static
sub-functions for each node, like execExprFunc, execExprVar,
execExprConst, etc? This way we would save a bit of tab-indentation,
this patch making the new code lines becoming larger than 80
characters because of all the switch/case stuff that gets more
complicated.
-- 
Michael

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] extend pgbench expressions with functions

2016-01-15 Thread Fabien COELHO



Hello Michaël,

Here is a v19 :
 - avoid noisy changes
 - abort on double->int overflow
 - implement operators as functions

There is still \setrandom, that I can remove easily with a green light.

--
Fabien.diff --git a/doc/src/sgml/ref/pgbench.sgml b/doc/src/sgml/ref/pgbench.sgml
index 541d17b..a8bfc79 100644
--- a/doc/src/sgml/ref/pgbench.sgml
+++ b/doc/src/sgml/ref/pgbench.sgml
@@ -771,17 +771,21 @@ pgbench  options  dbname
   Sets variable varname to an integer value calculated
   from expression.
   The expression may contain integer constants such as 5432,
-  references to variables :variablename,
+  double constants such as 3.14159,
+  references to integer variables :variablename,
   and expressions composed of unary (-) or binary operators
-  (+, -, *, /, %)
-  with their usual associativity, and parentheses.
+  (+, -, *, /,
+  %) with their usual associativity, function calls and
+  parentheses.
+   shows the available
+  functions.
  
 
  
   Examples:
 
 \set ntellers 10 * :scale
-\set aid (1021 * :aid) % (10 * :scale) + 1
+\set aid (1021 * random(1, 10 * :scale)) % (10 * :scale) + 1
 
 

@@ -801,66 +805,35 @@ pgbench  options  dbname
  
 
  
-  By default, or when uniform is specified, all values in the
-  range are drawn with equal probability.  Specifying gaussian
-  or  exponential options modifies this behavior; each
-  requires a mandatory parameter which determines the precise shape of the
-  distribution.
- 
+  
+   
+
+ \setrandom n 1 10 or \setrandom n 1 10 uniform
+ is equivalent to \set n random(1, 10) and uses a uniform
+ distribution.
+
+   
 
- 
-  For a Gaussian distribution, the interval is mapped onto a standard
-  normal distribution (the classical bell-shaped Gaussian curve) truncated
-  at -parameter on the left and +parameter
-  on the right.
-  Values in the middle of the interval are more likely to be drawn.
-  To be precise, if PHI(x) is the cumulative distribution
-  function of the standard normal distribution, with mean mu
-  defined as (max + min) / 2.0, with
-
- f(x) = PHI(2.0 * parameter * (x - mu) / (max - min + 1)) /
-(2.0 * PHI(parameter) - 1.0)
-
-  then value i between min and
-  max inclusive is drawn with probability:
-  f(i + 0.5) - f(i - 0.5).
-  Intuitively, the larger parameter, the more
-  frequently values close to the middle of the interval are drawn, and the
-  less frequently values close to the min and
-  max bounds. About 67% of values are drawn from the
-  middle 1.0 / parameter, that is a relative
-  0.5 / parameter around the mean, and 95% in the middle
-  2.0 / parameter, that is a relative
-  1.0 / parameter around the mean; for instance, if
-  parameter is 4.0, 67% of values are drawn from the
-  middle quarter (1.0 / 4.0) of the interval (i.e. from
-  3.0 / 8.0 to 5.0 / 8.0) and 95% from
-  the middle half (2.0 / 4.0) of the interval (second and
-  third quartiles). The minimum parameter is 2.0 for
-  performance of the Box-Muller transform.
- 
+  
+   
+\setrandom n 1 10 exponential 3.0 is equivalent to
+\set n random_exponential(1, 10, 3.0) and uses an
+exponential distribution.
+   
+  
 
- 
-  For an exponential distribution, parameter
-  controls the distribution by truncating a quickly-decreasing
-  exponential distribution at parameter, and then
-  projecting onto integers between the bounds.
-  To be precise, with
-
-f(x) = exp(-parameter * (x - min) / (max - min + 1)) / (1.0 - exp(-parameter))
-
-  Then value i between min and
-  max inclusive is drawn with probability:
-  f(x) - f(x + 1).
-  Intuitively, the larger parameter, the more
-  frequently values close to min are accessed, and the
-  less frequently values close to max are accessed.
-  The closer to 0 parameter, the flatter (more uniform)
-  the access distribution.
-  A crude approximation of the distribution is that the most frequent 1%
-  values in the range, close to min, are drawn
-  parameter% of the time.
-  parameter value must be strictly positive.
+  
+   
+\setrandom n 1 10 gaussian 2.0 is equivalent to
+\set n random_gaussian(1, 10, 2.0), and uses a gaussian
+distribution.
+   
+  
+ 
+
+   See the documentation of these functions below for further information
+   about the precise shape of these distributions, depending on the value
+   of the parameter.
  
 
  
@@ -940,18 +913,184 @@ f(x) = exp(-parameter * (x - min) / (max - min + 1)) / (1.0 - exp(-parameter))

   
 
+   
+   
+PgBench Functions
+
+ 
+  
+   Function
+   Return Type
+   Description
+   Example
+

Re: [HACKERS] extend pgbench expressions with functions

2016-01-14 Thread Michael Paquier

On Thu, Jan 14, 2016 at 5:54 PM, Fabien COELHO  wrote:
> I wrote:
>> - fprintf(stderr, "gaussian parameter must be at least
>> %f (not \"%s\")\n", MIN_GAUSSIAN_PARAM, argv[5]);
>> +fprintf(stderr,
>> +   "random gaussian parameter must be greater
>> than %f "
>> +"(got %f)\n", MIN_GAUSSIAN_PARAM, parameter);
>> This looks like useless noise to me. Why changing those messages?
>
>
> Because the message was not saying that it was about random, and I think
> that it is the important.
>>  fprintf(stderr,
>> -   "exponential parameter must be greater than
>> zero (not \"%s\")\n",
>> -argv[5]);
>> +  "random exponential parameter must be greater than
>> 0.0 "
>> +   "(got %f)\n", parameter);
>>  st->ecnt++;
>> -return true;
>> +   return false;
>> This diff is noise as well, and should be removed.
>
> Ok, I can but "zero" and "not" back, but same remark as above, why not tell
> that it is about random? This information is missing.

Those things should be a separate patch then, committed separately as
they provide more verbose messages.

>> +   /*
>> +* Note: this section could be removed, as the same
>> functionnality
>> +* is available through \set xxx random_gaussian(...)
>> +*/
>> I think that you are right to do that. That's no fun to break existing
>> scripts, even if people doing that with pgbench are surely experienced
>> hackers.
>
> Ok, but I would like a clear go or vote before doing this.

For now, I am seeing opinions on those matters from nobody else than
me and you, and we got toward the same direction. If you think that
there is a possibility that somebody has a different opinion on those
matters, and it is likely so let's keep the patch as-is then and wait
for more input: it is easier to remove code than add it back. I am not
sure what a committer would say, and it surely would be a waste of
time to just move back and worth for everybody.

>> int() should be strengthened regarding bounds. For example:
>> \set cid debug(int(int(9223372036854775807) +
>> double(9223372036854775807)))
>> debug(script=0,command=1): int -9223372036854775808
>
>
> Hmmm. You mean just to check the double -> int conversion for overflow,
> as in:
>
>   SELECT (9223372036854775807::INT8 +
>   9223372036854775807::DOUBLE PRECISION)::INT8;
>
> Ok.

Yes, that's what I mean. The job running into that should definitely
fail with a proper out-of-bound error message.
-- 
Michael


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] extend pgbench expressions with functions

2016-01-14 Thread Fabien COELHO



Hello Michaël,


Hmmm, this is subjective:-)


And dependent on personal opinions and views.


I've decided to stay with the current behavior (\setrandom), that is to
abort the current transaction on errors but not to abort pgbench itself. The
check is done before calling the functions, and I let an assert in the
functions just to be sure. It is going to loop on errors anyway, but this is
what it already does anyway.


OK, yes I see now I missed that during my last review. This has the
disadvantage to double the amount of code dedicated to parameter
checks though :(


Yep, I noticed that obviously, but I envision that "\setrandom" pretty 
unelegant code could go away soon, so this is really just temporary.



But based on the feedback perhaps other folks would feel that it would
be actually worth simply dropping the existing \setrandom command.


Yep, exactly my thoughts. I did not do it because there are two ways: 
actually remove it and be done, or build an equivalent \set at parse time, 
so that would just remove the execution part, but keep some ugly stuff in 
parsing.


I would be in favor of just dropping it.

I won't object to that personally, such pgbench features are something 
for hackers and devs mainly, so I guess that we could survive without a 
deprecation period here.


Yep. I can remove it, but I would like a clear go/vote before doing that.


I can understand that, things like that happen all the time here and
that's not a straight-forward patch that we have here. I am sure that
additional opinions here would be good to have before taking one
decision or another. With the current statu-quo, let's just do what
you think is best.



I let the operators alone and just adds functions management next to it.
I'll merge operators as functions only if it is a blocker.


I think that's a blocker, but I am not the only one here and not a committer.


Ok, I can remove that easily anyway.


- fprintf(stderr, "gaussian parameter must be at least
%f (not \"%s\")\n", MIN_GAUSSIAN_PARAM, argv[5]);
+fprintf(stderr,
+   "random gaussian parameter must be greater than %f "
+"(got %f)\n", MIN_GAUSSIAN_PARAM, parameter);
This looks like useless noise to me. Why changing those messages?


Because the message was not saying that it was about random, and I think
that it is the important.


- if (parameter <= 0.0)
+if (parameter < 0.0)
 {

This bit is false, and triggers an assertion failure when the
exponential parameter is 0.


Oops:-( Sorry.


 fprintf(stderr,
-   "exponential parameter must be greater than
zero (not \"%s\")\n",
-argv[5]);
+  "random exponential parameter must be greater than 0.0 "
+   "(got %f)\n", parameter);
 st->ecnt++;
-return true;
+   return false;
This diff is noise as well, and should be removed.


Ok, I can but "zero" and "not" back, but same remark as above, why not 
tell that it is about random? This information is missing.



+   /*
+* Note: this section could be removed, as the same
functionnality
+* is available through \set xxx random_gaussian(...)
+*/
I think that you are right to do that. That's no fun to break existing
scripts, even if people doing that with pgbench are surely experienced
hackers.


Ok, but I would like a clear go or vote before doing this.


-
   case ENODE_VARIABLE:
Such diffs are noise as well.


Yep.


int() should be strengthened regarding bounds. For example:
\set cid debug(int(int(9223372036854775807) +
double(9223372036854775807)))
debug(script=0,command=1): int -9223372036854775808


Hmmm. You mean just to check the double -> int conversion for overflow,
as in:

  SELECT (9223372036854775807::INT8 +
  9223372036854775807::DOUBLE PRECISION)::INT8;

Ok.

--
Fabien.
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

1 2 >

1 - 100 of 167 matches

Mail list logo