Version one is "k' = 1 + (a * k + b) modulo n" with "a" prime with respect to "n", "n" being the number of keys. This is nearly possible, but for the modulo operator which is currently missing, and that I'm planning to submit for this very reason, but probably another time.

That's pretty crude,

Yep. It is very simple, it is much better than nothing, and for a database test is may be "good enough".

although I don't object to a modulo operator. It would be nice to be able to use a truly random permutation, which is not hard to generate but probably requires O(n) storage, likely a problem for large scale factors.

That is indeed the actual issue in my mind. I was thinking of permutations with a formula, which are not so easy to find and may end-up looking like "(a*k+b)%n" anyway. I had the same issue for generating random data for a schema (see http://www.coelho.net/datafiller.html).

Maybe somebody who knows more math than I do (like you, probably!) can come up with something more clever.

I can certainly suggest other formula, but that does not mean beautiful code, thus would probably be rejected. I'll see.

An alternative to this whole process may be to hash/modulo a non uniform random value.

      id = 1 + hash(some-random()) % n

But the hashing changes the distribution as it adds collisions, so I have to think about how to be able to control the distribution in that case, and what hash function to use.


Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:

Reply via email to