On 21.07.22 at 10:41, Dean Rasheed wrote:

A couple of quick comments on the current patch:

Thank you for your feedback!

It's important to mark these new functions as VOLATILE, not IMMUTABLE,
otherwise they won't work as expected in queries. See
https://www.postgresql.org/docs/current/xfunc-volatility.html

CREATE FUNCTION marks functions as VOLATILE by default if not explicitly marked otherwise. I assumed function definitions in pg_proc.dat have the same behavior. I will fix that.
https://www.postgresql.org/docs/current/sql-createfunction.html

It would be better to use pg_prng_uint64_range() rather than rand() to
pick elements. Partly, that's because it uses a higher quality PRNG,
with a larger internal state, and it ensures that the results are
unbiased across the range. But more importantly, it interoperates with
setseed(), allowing predictable sequences of "random" numbers to be
generated -- something that's useful in writing repeatable regression
tests.

I agree that we should use pg_prng_uint64_range(). However, to interoperate with setseed() we would have to use drandom_seed (rather than pg_global_prng_state) as the PRNG state. drandom_seed is declared static in float.c and is used exclusively by random(). Do we want to expose drandom_seed to other functions?
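
To make the two points about unbiased ranges and repeatability concrete, here is a rough standalone sketch. It is not taken from the patch and does not call PostgreSQL's pg_prng API: demo_prng, demo_seed, demo_next and demo_range are hypothetical names, and splitmix64 merely stands in for a seedable generator. It only illustrates that seeding a state explicitly yields a repeatable sequence, and that rejection sampling avoids the modulo bias of rand() % n.

```c
#include <stdint.h>

/* Hypothetical seedable PRNG state; a stand-in, not pg_prng_state. */
typedef struct
{
    uint64_t s;
} demo_prng;

/* Setting the state explicitly is what makes a sequence repeatable. */
void
demo_seed(demo_prng *st, uint64_t seed)
{
    st->s = seed;
}

/* One splitmix64 step: advance the state and scramble it. */
uint64_t
demo_next(demo_prng *st)
{
    uint64_t z = (st->s += UINT64_C(0x9E3779B97F4A7C15));

    z = (z ^ (z >> 30)) * UINT64_C(0xBF58476D1CE4E5B9);
    z = (z ^ (z >> 27)) * UINT64_C(0x94D049BB133111EB);
    return z ^ (z >> 31);
}

/* Uniform value in [lo, hi]; returns lo if the range is empty. */
uint64_t
demo_range(demo_prng *st, uint64_t lo, uint64_t hi)
{
    uint64_t span, limit, r;

    if (hi <= lo)
        return lo;
    span = hi - lo + 1;         /* wraps to 0 for the full 64-bit range */
    if (span == 0)
        return demo_next(st);

    /*
     * Accept only r <= limit, where limit + 1 is the largest multiple of
     * span not exceeding 2^64. Each residue is then equally likely.
     */
    limit = UINT64_MAX - (UINT64_MAX % span + 1) % span;
    do
    {
        r = demo_next(st);
    } while (r > limit);

    return lo + r % span;
}
```

Two states seeded with the same value produce identical sequences from demo_range(), which is the property a regression test relying on setseed() would need.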

Assuming these new functions are made to interoperate with setseed(),
which I think they should be, then they also need to be marked as
PARALLEL RESTRICTED, rather than PARALLEL SAFE. See
https://www.postgresql.org/docs/current/parallel-safety.html, which
explains why setseed() and random() are parallel restricted.

As mentioned above, I assumed the default here is PARALLEL UNSAFE. I'll fix that.

In my experience, the requirement for sampling with replacement is
about as common as the requirement for sampling without replacement,
so it seems odd to provide one but not the other. Of course, we can
always add a with-replacement function later, and give it a different
name. But maybe array_sample() could be used for both, with a
"with_replacement" boolean parameter?

We could also add a "with_replacement" boolean parameter, defaulting to false, to array_sample() later. I do not have a strong opinion on this and will implement it if desired. Any opinions on whether the parameter should have a default?
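
For discussion, a minimal standalone sketch of how a single function could cover both modes. This is an assumption about one possible shape, not the patch's code: sketch_next, sketch_index and sketch_array_sample are hypothetical names, it works on plain int arrays instead of PostgreSQL arrays, and it uses a small splitmix64 generator in place of the backend PRNG. Without replacement it does a partial Fisher-Yates shuffle; with replacement it draws independent indexes.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical seedable generator (splitmix64), for illustration only. */
uint64_t
sketch_next(uint64_t *state)
{
    uint64_t z = (*state += UINT64_C(0x9E3779B97F4A7C15));

    z = (z ^ (z >> 30)) * UINT64_C(0xBF58476D1CE4E5B9);
    z = (z ^ (z >> 27)) * UINT64_C(0x94D049BB133111EB);
    return z ^ (z >> 31);
}

/* Unbiased index in [0, n); the caller must pass n > 0. */
size_t
sketch_index(uint64_t *state, size_t n)
{
    uint64_t span = (uint64_t) n;
    uint64_t limit = UINT64_MAX - (UINT64_MAX % span + 1) % span;
    uint64_t r;

    do
    {
        r = sketch_next(state);
    } while (r > limit);        /* reject values that would bias r % span */

    return (size_t) (r % span);
}

/*
 * Write up to k sampled elements of src[0..n-1] into dst and return the
 * number written. Without replacement, k is clamped to n and src is
 * partially shuffled in place. With replacement, src is left untouched.
 */
size_t
sketch_array_sample(int *src, size_t n, size_t k, bool with_replacement,
                    uint64_t *state, int *dst)
{
    size_t i;

    if (n == 0)
        return 0;

    if (with_replacement)
    {
        for (i = 0; i < k; i++)
            dst[i] = src[sketch_index(state, n)];
        return k;
    }

    if (k > n)
        k = n;
    for (i = 0; i < k; i++)
    {
        /* Pick one of the not-yet-chosen elements and swap it to slot i. */
        size_t j = i + sketch_index(state, n - i);
        int    tmp = src[i];

        src[i] = src[j];
        src[j] = tmp;
        dst[i] = src[i];
    }
    return k;
}
```

One design question this makes visible: with replacement, a sample size larger than the array is meaningful, whereas without replacement it has to be clamped or rejected.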

Martin

