Hello Robert,

1. I think there should really be two patches here, the first adding
functions, and the second adding doubles.  Those seem like separate
changes.  And offhand, the double stuff looks a lot less useful that
the function call syntax.

I first submitted the infrastructure part, but I was asked to show how more functions could be included, especially random variants. As random gaussian/exponential functions require a double argument, there must be some support for double values.

Now, it could certainly be split in two patches, but this is rather artificial, IMO.

2. ddebug and idebug seem like a lousy idea to me.

It was really useful to me for debugging and testing.

If we really need to be able to print stuff from pgbench, which I kind of doubt, then maybe we should have a string data type and a print() function, or maybe a string data type and a toplevel \echo command.

The *debug functions allow to intercept the value computed within an expression. If you rely on variables and some echo (which does not exist) this means that there should be double variables as well, which is not currently the case, and which I do not see as useful for the kind of script written for pgbench. Adding the string type is more work, and
I do not see a good use case for those.

So the *debug functions are really just a lightweight solution for debugging type related issues in expressions. I can drop them if this is a blocker, but the are really useful for testing quickly a script.

3. I'm perplexed by why you've replaced evaluateExpr() with evalInt()
and evalDouble().

As explained above in the thread (I think), the reason is that having one overloaded expression evaluation which handles types conversion would produce pretty heavy code, and the two functions with the descending typing allows to have a much smaller code with the same effect.

The issue is that with two types all functions must handle argument type conversion explicitely.

For instance for "random_gaussian(int, int, double)", it may be called with any combination of 3 int/double arguments, each one must be tested and possibly converted to the target type before calling the actual function. For overloaded operators or functions (arithmetics, abs...) there is also the decision about which operator is called and then what conversions are necessary.

With the descending typing and two functions cross recursion all these explicit tests and conversion disappear because the function evaluation calls evalInt or evalDouble depending on the expected types.

Basically, the code is significantly shorter and elegant with this option.

That doesn't seem similar to what I've seen in other expression-evaluation engines.

Probably. This is because I choose a descending typing to simplify the implementation. Changing this would bring no real practical benefit from the usage point of view, but would add significant more verbose and ugly code to test and handle type conversions everywhere, so I'm not keen to do that.

Perhaps I could find out by reading the comments, but actually not, because this entire patch seems to add only one comment:

+       /* reset column count for this scan */

There are a few others, really:-)

While I'm not a fan of excessive commenting, I think a little more
explanation here would be good.

I can certainly add more comments to the code, especially around the eval cross recursion functions.

4. The table of functions in pgbench.sgml seems to leave something to
be desired.  We added a pretty detailed write-up on the Gaussian and
exponential options to \setrandom, but exporand() has only this
description:

Yep. The idea was *not* to replicate the (painful) explanations about random functions, but that it should be shared between the function and the \set variants.

+      <row>
+       <entry><literal><function>exporand(<replaceable>i</>,
<replaceable>j</>, <replaceable>t</>)</></></>
+       <entry>integer</>
+       <entry>exponentially distributed random integer in the bounds,
see below</>
+       <entry><literal>exporand(1, 10, 3.0)</></>
+       <entry>int between <literal>1</> and <literal>10</></>
+      </row>

That's not very helpful.

The table explanation must be kept short for the table format...

Without looking at the example, there's no way to guess what i and j mean, and even with looking at the example, there's no way to guess what t means. If, as I'm guessing, exporand() and guassrand() behave like \setrandom with the exponential and/or Gaussian options, then the documentation for one of those things should contain all of the detailed information and the documentation for the other should refer to it.

Indeed, that was the idea, but it seems that I forgot the pointer:-)

More than likely, exporand() and gaussrand() should get the detailed explanation, and \setrandom should be document as a deprecated alternative to \set ... {gauss,expo,}rand(...)

Ok, the description can be moved to the function part and the \set version reference the other.

5. I liked Heikki's proposed function names random_gaussian(min, max,
threshold) and random_exponential(min, max, threshold) better than the
ones you've picked here.  I think random() would be OK instead of his
suggestion of random_uniform(), though.

Ok.

I'll submit an updated version of the patch, which addresses points 4 & 5 and documents 3, and wait for feedback on the explanations I gave before doing anything for about 1 & 2, as I think that the implied changes are not desirable. I'm not keen at all on changing the cross recursion implementation (3), this would just be pretty ugly code without actual user benefit.

--
Fabien.


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Reply via email to