Version one is "k' = 1 + (a * k + b) modulo n" with "a" prime with
respect to "n", "n" being the number of keys. This is nearly possible,
but for the modulo operator which is currently missing, and that I'm
planning to submit for this very reason, but probably another time.
That's pretty crude,
Yep. It is very simple, it is much better than nothing, and for a database
test is may be "good enough".
although I don't object to a modulo operator. It would be nice to be
able to use a truly random permutation, which is not hard to generate
but probably requires O(n) storage, likely a problem for large scale
That is indeed the actual issue in my mind. I was thinking of permutations
with a formula, which are not so easy to find and may end-up looking like
"(a*k+b)%n" anyway. I had the same issue for generating random data for a
schema (see http://www.coelho.net/datafiller.html).
Maybe somebody who knows more math than I do (like you, probably!) can
come up with something more clever.
I can certainly suggest other formula, but that does not mean beautiful
code, thus would probably be rejected. I'll see.
An alternative to this whole process may be to hash/modulo a non uniform
id = 1 + hash(some-random()) % n
But the hashing changes the distribution as it adds collisions, so I have
to think about how to be able to control the distribution in that case,
and what hash function to use.
Sent via pgsql-hackers mailing list (email@example.com)
To make changes to your subscription: