Re: pgbench - add pseudo-random permutation function

Dean Rasheed Tue, 30 Mar 2021 02:55:43 -0700

On Mon, 22 Mar 2021 at 13:43, Dean Rasheed <dean.a.rash...@gmail.com> wrote:
>
> On Sun, 14 Mar 2021 at 16:08, Fabien COELHO <coe...@cri.ensmp.fr> wrote:
> >
> > > My main question on this now is, do you have a scholar reference for
> > > this algorithm?
> >
> > Nope, otherwise I would have put a reference. I'm a scholar though, if
> > it helps:-)
> >
> > I could not find any algorithm that fitted the bill. The usual approach
> > (eg benchmarking designs) is too use some hash function and assume that it
> > is not a permutation, too bad.
> >
> > Basically the algorithm mimics a few rounds of cryptographic encryption
> > adapted to any size and simple operators, whereas encryption function are
> > restricted to power of two blocks (eg the Feistel network). The structure
> > is the same AES with its AddRoundKey the xor-ing stage (adapted to non
> > power of two in whiten above), MixColumns which does the scattering, and
> > for key expansion, I used Donald Knuth generator. Basically one could say
> > that permute is an inexpensive and insecure AES:-)
> >
>
> I spent a little time playing around with this, trying to come up with
> a reasonable way to test it.


I spent more time testing this -- the previous testing was mostly for
large values of size, so I decided to also test it for small sizes.
The theoretical number of possible permutations grows rapidly with
size (size!), and the actual number of permutations observed grew
almost as quickly:

 size  size!    distinct perms
  2     2        2
  3     6        6
  4     24       24
  5     120      120
  6     720      720
  7     5040     5040
  8     40320    24382
  9     362880   181440
  10    3628800  3627355

My test script stopped once the first permutation had been seen 100
times, so it might have seen more permutations had it kept going for
longer.

However, looking at the actual output, there is a very significant
non-uniformity in the probability of particular permutations being
chosen, especially at certain sizes like 8 and 10. For example, at
size = 8, the identity permutation is significantly more likely than
almost any other permutation (roughly 13 times more likely than it
would be by random chance). Additionally, the signs are that this
non-uniformity tends to increase with size. At size = 10, the average
number of occurrences of each permutation in the test was 148, but
there were some that didn't appear at all, many that only appeared 2
or 3 times, and then some that appeared nearly 5000 times.

Also, there appears to be an unfortunate dependence on the seed -- if
the seed is even and the size is a power of 2, it looks like the
number of distinct permutations produced is just size/2, which is a
tiny fraction of size!.

This, together with the results from the previous K-S uniformity tests
at larger sizes, suggests that the current algorithm may not be random
enough to scatter values and remove correlations from a non-uniform
distribution.

In an attempt to do better, I tweaked the algorithm in the attached
patch, which makes use of erand48() to inject more randomness into the
permutation choice. Repeating the tests with the updated algorithm
shows that it has a number of advantages:

1). For small sizes (2-10), each of the size! possible permutations is
now produced with roughly equal probability. No single permutation was
much more likely than expected.

2). The loss of randomness for even seeds is gone.

3). For any given input, the output is more uniformly randomly
distributed for different seeds.

4). For any pair of nearby inputs, the corresponding outputs are more
uniformly randomly separated from one another.

5). The updated algorithm no longer uses modular_multiply(), so it
works the same on all platforms.

I still feel slightly uneasy about inventing our own algorithm here,
but I wasn't able to find anything else suitable, and I've now tested
this quite extensively. All the indications are that this updated
algorithm works well at all sizes and produces sufficiently random
results, though if anyone else can think of other approaches to
testing the core algorithm, that would be useful. For reference, I
attach 2 standalone test C programs I used for testing, which include
copies of the old and new algorithms.

I also did a quick copy editing pass over the docs, and I think the
patch is in pretty good shape. Barring objections, I'll see about
committing it later this week.

Regards,
Dean

diff --git a/doc/src/sgml/ref/pgbench.sgml b/doc/src/sgml/ref/pgbench.sgml
new file mode 100644
index 50cf22b..84d9566
--- a/doc/src/sgml/ref/pgbench.sgml
+++ b/doc/src/sgml/ref/pgbench.sgml
@@ -1057,7 +1057,7 @@ pgbench <optional> <replaceable>options<
 
       <row>
        <entry> <literal>default_seed</literal> </entry>
-       <entry>seed used in hash functions by default</entry>
+       <entry>seed used in hash and pseudorandom permutation functions by default</entry>
       </row>
 
       <row>
@@ -1866,6 +1866,24 @@ SELECT 4 AS four \; SELECT 5 AS five \as
 
       <row>
        <entry role="func_table_entry"><para role="func_signature">
+        <function>permute</function> ( <parameter>i</parameter>, <parameter>size</parameter> [, <parameter>seed</parameter> ] )
+        <returnvalue>integer</returnvalue>
+       </para>
+       <para>
+        Permuted value of <parameter>i</parameter>, in the range
+        <literal>[0, size)</literal>.  This is the new position of
+        <parameter>i</parameter> (modulo <parameter>size</parameter>) in a
+        pseudorandom permutation of the integers <literal>0...size-1</literal>,
+        parameterized by <parameter>seed</parameter>.
+       </para>
+       <para>
+        <literal>permute(0, 4)</literal>
+        <returnvalue>an integer between 0 and 3</returnvalue>
+       </para></entry>
+      </row>
+
+      <row>
+       <entry role="func_table_entry"><para role="func_signature">
         <function>pi</function> ()
         <returnvalue>double</returnvalue>
        </para>
@@ -2071,29 +2089,70 @@ f(x) = PHI(2.0 * parameter * (x - mu) /
     </listitem>
    </itemizedlist>
 
+   <note>
+    <para>
+      When designing a benchmark which selects rows non-uniformly, be aware
+      that the rows chosen may be correlated with other data such as IDs from
+      a sequence or the physical row ordering, which may skew performance
+      measurements.
+    </para>
+    <para>
+      To avoid this, you may wish to use the <function>permute</function>
+      function, or some other additional step with similar effect, to shuffle
+      the selected rows and remove such correlations.
+    </para>
+   </note>
+
   <para>
     Hash functions <literal>hash</literal>, <literal>hash_murmur2</literal> and
     <literal>hash_fnv1a</literal> accept an input value and an optional seed parameter.
     In case the seed isn't provided the value of <literal>:default_seed</literal>
     is used, which is initialized randomly unless set by the command-line
-    <literal>-D</literal> option. Hash functions can be used to scatter the
-    distribution of random functions such as <literal>random_zipfian</literal> or
-    <literal>random_exponential</literal>. For instance, the following pgbench
-    script simulates possible real world workload typical for social media and
-    blogging platforms where few accounts generate excessive load:
+    <literal>-D</literal> option.
+  </para>
+
+  <para>
+    <literal>permute</literal> accepts an input value, a size, and an optional
+    seed parameter.  It generates a pseudorandom permutation of integers in
+    the range <literal>[0, size)</literal>, and returns the index of the input
+    value in the permuted values.  The permutation chosen is parameterized by
+    the seed, which defaults to <literal>:default_seed</literal>, if not
+    specified.  Unlike the hash functions, <literal>permute</literal> ensures
+    that there are no collisions or holes in the output values.  Input values
+    outside the interval are interpreted modulo the size.  The function raises
+    an error if the size is not positive.  <function>permute</function> can be
+    used to scatter the distribution of non-uniform random functions such as
+    <literal>random_zipfian</literal> or <literal>random_exponential</literal>
+    so that values drawn more often are not trivially correlated.  For
+    instance, the following <application>pgbench</application> script
+    simulates a possible real world workload typical for social media and
+    blogging platforms where a few accounts generate excessive load:
 
 <programlisting>
-\set r random_zipfian(0, 100000000, 1.07)
-\set k abs(hash(:r)) % 1000000
+\set size 1000000
+\set r random_zipfian(1, :size, 1.07)
+\set k 1 + permute(:r, :size)
 </programlisting>
 
     In some cases several distinct distributions are needed which don't correlate
-    with each other and this is when implicit seed parameter comes in handy:
+    with each other and this is when the optional seed parameter comes in handy:
 
 <programlisting>
-\set k1 abs(hash(:r, :default_seed + 123)) % 1000000
-\set k2 abs(hash(:r, :default_seed + 321)) % 1000000
+\set k1 1 + permute(:r, :size, :default_seed + 123)
+\set k2 1 + permute(:r, :size, :default_seed + 321)
+</programlisting>
+
+    A similar behavior can also be approximated with <function>hash</function>:
+
+<programlisting>
+\set size 1000000
+\set r random_zipfian(1, 100 * :size, 1.07)
+\set k 1 + abs(hash(:r)) % :size
 </programlisting>
+
+    However, since <function>hash</function> generates collisions, some values
+    will not be reachable and others will be more frequent than expected from
+    the original distribution.
   </para>
 
   <para>
diff --git a/src/bin/pgbench/exprparse.y b/src/bin/pgbench/exprparse.y
new file mode 100644
index 4d529ea..56f75cc
--- a/src/bin/pgbench/exprparse.y
+++ b/src/bin/pgbench/exprparse.y
@@ -19,6 +19,7 @@
 #define PGBENCH_NARGS_VARIABLE	(-1)
 #define PGBENCH_NARGS_CASE		(-2)
 #define PGBENCH_NARGS_HASH		(-3)
+#define PGBENCH_NARGS_PERMUTE	(-4)
 
 PgBenchExpr *expr_parse_result;
 
@@ -370,6 +371,9 @@ static const struct
 	{
 		"hash_fnv1a", PGBENCH_NARGS_HASH, PGBENCH_HASH_FNV1A
 	},
+	{
+		"permute", PGBENCH_NARGS_PERMUTE, PGBENCH_PERMUTE
+	},
 	/* keep as last array element */
 	{
 		NULL, 0, 0
@@ -479,6 +483,19 @@ make_func(yyscan_t yyscanner, int fnumbe
 			{
 				PgBenchExpr *var = make_variable("default_seed");
 				args = make_elist(var, args);
+			}
+			break;
+
+		/* pseudorandom permutation function with optional seed argument */
+		case PGBENCH_NARGS_PERMUTE:
+			if (len < 2 || len > 3)
+				expr_yyerror_more(yyscanner, "unexpected number of arguments",
+								  PGBENCH_FUNCTIONS[fnumber].fname);
+
+			if (len == 2)
+			{
+				PgBenchExpr *var = make_variable("default_seed");
+				args = make_elist(var, args);
 			}
 			break;
 
diff --git a/src/bin/pgbench/pgbench.c b/src/bin/pgbench/pgbench.c
new file mode 100644
index 48ce171..e1ac78e
--- a/src/bin/pgbench/pgbench.c
+++ b/src/bin/pgbench/pgbench.c
@@ -1127,6 +1127,127 @@ getHashMurmur2(int64 val, uint64 seed)
 	return (int64) result;
 }
 
+/* return smallest mask holding n  */
+static uint64
+compute_mask(uint64 n)
+{
+	n |= n >> 1;
+	n |= n >> 2;
+	n |= n >> 4;
+	n |= n >> 8;
+	n |= n >> 16;
+	n |= n >> 32;
+	return n;
+}
+
+/*
+ * Pseudorandom permutation function
+ *
+ * For small sizes, this generates each of the (size!) possible permutations
+ * of integers in the range [0, size) with roughly equal probability.  Once
+ * the size is larger than 16, the number of possible permutations exceeds the
+ * number of distinct states of the internal pseudorandom number generator,
+ * and so not all possible permutations can be generated, but the permutations
+ * chosen should continue to give the appearance of being random.
+ *
+ * THIS FUNCTION IS NOT CRYPTOGRAPHICALLY SECURE.
+ * DO NOT USE FOR SUCH PURPOSE.
+ */
+static int64
+permute(const int64 val, const int64 isize, const int64 seed)
+{
+	RandomState random_state;
+	uint64		size,
+				v,
+				mask,
+				top_bit;
+	int			i;
+
+	if (isize < 2)
+		return 0;				/* nothing to permute */
+
+	/* Initialize random state with low-order bits of seed */
+	random_state.xseed[0] = seed & 0xFFFF;
+	random_state.xseed[1] = (seed >> 16) & 0xFFFF;
+	random_state.xseed[2] = (seed >> 32) & 0xFFFF;
+
+	/* Computations are performed on unsigned values */
+	size = (uint64) isize;
+	v = (uint64) val % size;
+
+	/* Mask to work modulo largest power of 2 less than or equal to size */
+	mask = compute_mask(size - 1);
+	if (mask >= size)
+		mask >>= 1;
+
+	/* Topmost bit of mask */
+	top_bit = (mask + 1) >> 1;
+
+	/*
+	 * Permute the input value by applying 4 rounds of pseudorandom bijective
+	 * transformations.  The intention here is to distribute each input
+	 * uniformly randomly across the range, and separate adjacent inputs
+	 * approximately uniformly randomly from each other, leading to a fairly
+	 * random overall choice of permutation.
+	 *
+	 * To separate adjacent inputs, we multiply by a random number modulo
+	 * (mask + 1), which is a power of 2.  For this to be a bijection, the
+	 * multiplier must be odd.  Since this is known to lead to less randomness
+	 * in the lower bits, we also apply a rotation that shifts the topmost bit
+	 * into the least significant bit.  In the special cases where size <= 3,
+	 * mask = 1 and each of these operations is actually a no-op, so we also
+	 * XOR with a different random number to inject additional randomness.
+	 * Since the size is generally not a power of 2, we apply this bijection
+	 * on overlapping upper and lower halves of the input.
+	 *
+	 * To distribute the inputs uniformly across the range, we then also apply
+	 * a random offset modulo the full range.
+	 *
+	 * Taken together, these operations resemble a modified linear
+	 * congruential generator, as is commonly used in pseudorandom number
+	 * generators.  Empirically, it is found that for small sizes it selects
+	 * each of the (size!) possible permutations with roughly equal
+	 * probability.  For larger sizes, not all permutations can be generated,
+	 * but the intended random spread is still produced.
+	 */
+	for (i = 0; i < 4; i++)
+	{
+		uint64		m,
+					r,
+					t;
+
+		/* Random multiply (by an odd number), XOR and rotate of lower half */
+		m = (uint64) (pg_erand48(random_state.xseed) * (mask + 1)) | 1;
+		r = (uint64) (pg_erand48(random_state.xseed) * (mask + 1));
+		if (v <= mask)
+		{
+			v = ((v * m) ^ r) & mask;
+			v = ((v << 1) & mask) | (v & top_bit ? 1 : 0);
+		}
+
+		/* Random offset */
+		r = (uint64) (pg_erand48(random_state.xseed) * size);
+		v = (v + r) % size;
+
+		/* Random multiply (by an odd number), XOR and rotate of upper half */
+		m = (uint64) (pg_erand48(random_state.xseed) * (mask + 1)) | 1;
+		r = (uint64) (pg_erand48(random_state.xseed) * (mask + 1));
+		t = size - 1 - v;
+		if (t <= mask)
+		{
+			t = ((t * m) ^ r) & mask;
+			t = ((t << 1) & mask) | (t & top_bit ? 1 : 0);
+			v = size - 1 - t;
+		}
+
+		/* Random offset */
+		r = (uint64) (pg_erand48(random_state.xseed) * size);
+		v = (v + r) % size;
+	}
+
+	return (int64) v;
+}
+
 /*
  * Initialize the given SimpleStats struct to all zeroes
  */
@@ -2475,6 +2596,29 @@ evalStandardFunc(CState *st,
 				return true;
 			}
 
+		case PGBENCH_PERMUTE:
+			{
+				int64		val,
+							size,
+							seed;
+
+				Assert(nargs == 3);
+
+				if (!coerceToInt(&vargs[0], &val) ||
+					!coerceToInt(&vargs[1], &size) ||
+					!coerceToInt(&vargs[2], &seed))
+					return false;
+
+				if (size <= 0)
+				{
+					pg_log_error("permute size parameter must be greater than zero");
+					return false;
+				}
+
+				setIntValue(retval, permute(val, size, seed));
+				return true;
+			}
+
 		default:
 			/* cannot get here */
 			Assert(0);
diff --git a/src/bin/pgbench/pgbench.h b/src/bin/pgbench/pgbench.h
new file mode 100644
index 3a9d89e..6ce1c98
--- a/src/bin/pgbench/pgbench.h
+++ b/src/bin/pgbench/pgbench.h
@@ -99,7 +99,8 @@ typedef enum PgBenchFunction
 	PGBENCH_IS,
 	PGBENCH_CASE,
 	PGBENCH_HASH_FNV1A,
-	PGBENCH_HASH_MURMUR2
+	PGBENCH_HASH_MURMUR2,
+	PGBENCH_PERMUTE
 } PgBenchFunction;
 
 typedef struct PgBenchExpr PgBenchExpr;
diff --git a/src/bin/pgbench/t/001_pgbench_with_server.pl b/src/bin/pgbench/t/001_pgbench_with_server.pl
new file mode 100644
index 82a46c7..a86a62d
--- a/src/bin/pgbench/t/001_pgbench_with_server.pl
+++ b/src/bin/pgbench/t/001_pgbench_with_server.pl
@@ -4,6 +4,7 @@ use warnings;
 use PostgresNode;
 use TestLib;
 use Test::More;
+use Config;
 
 # start a pgbench specific server
 my $node = get_new_node('main');
@@ -483,6 +484,17 @@ pgbench(
 		qr{command=98.: int 5432\b},                    # :random_seed
 		qr{command=99.: int -9223372036854775808\b},    # min int
 		qr{command=100.: int 9223372036854775807\b},    # max int
+		# pseudorandom permutation tests
+		qr{command=101.: boolean true\b},
+		qr{command=102.: boolean true\b},
+		qr{command=103.: boolean true\b},
+		qr{command=104.: boolean true\b},
+		qr{command=105.: boolean true\b},
+		qr{command=109.: boolean true\b},
+		qr{command=110.: boolean true\b},
+		qr{command=111.: boolean true\b},
+		qr{command=112.: int 9223372036854775797\b},
+		qr{command=113.: boolean true\b},
 	],
 	'pgbench expressions',
 	{
@@ -610,6 +622,33 @@ SELECT :v0, :v1, :v2, :v3;
 -- minint constant parsing
 \set min debug(-9223372036854775808)
 \set max debug(-(:min + 1))
+-- parametric pseudorandom permutation function
+\set t debug(permute(0, 2) + permute(1, 2) = 1)
+\set t debug(permute(0, 3) + permute(1, 3) + permute(2, 3) = 3)
+\set t debug(permute(0, 4) + permute(1, 4) + permute(2, 4) + permute(3, 4) = 6)
+\set t debug(permute(0, 5) + permute(1, 5) + permute(2, 5) + permute(3, 5) + permute(4, 5) = 10)
+\set t debug(permute(0, 16) + permute(1, 16) + permute(2, 16) + permute(3, 16) + \
+             permute(4, 16) + permute(5, 16) + permute(6, 16) + permute(7, 16) + \
+             permute(8, 16) + permute(9, 16) + permute(10, 16) + permute(11, 16) + \
+             permute(12, 16) + permute(13, 16) + permute(14, 16) + permute(15, 16) = 120)
+-- random sanity checks
+\set size random(2, 1000)
+\set v random(0, :size - 1)
+\set p permute(:v, :size)
+\set t debug(0 <= :p and :p < :size and :p = permute(:v + :size, :size) and :p <> permute(:v + 1, :size))
+-- actual values
+\set t debug(permute(:v, 1) = 0)
+\set t debug(permute(0, 2, 123456) = 0 and permute(1, 2, 123456) = 1 and \
+             permute(0, 2, 123457) = 1 and permute(1, 2, 123457) = 0)
+-- 63 bits tests
+\set size debug(:max - 10)
+\set t debug(permute(:size-1, :size, 5432) = 7012042409040117905 and \
+             permute(:size-2, :size, 5432) = 1103575893259321449 and \
+             permute(:size-3, :size, 5432) =  222497678256930915 and \
+             permute(:size-4, :size, 5432) = 8505436630458204228 and \
+             permute(:size-5, :size, 5432) = 1126572291920756793 and \
+             permute(:size-6, :size, 5432) = 7300129793825407053 and \
+             permute(:size-7, :size, 5432) = 2511815335807189023)
 }
 	});
 
@@ -1048,6 +1087,10 @@ SELECT LEAST(} . join(', ', (':i') x 256
 		'bad boolean',                     2,
 		[qr{malformed variable.*trueXXX}], q{\set b :badtrue or true}
 	],
+	[
+		'invalid permute size',				2,
+		[qr{permute size parameter must be greater than zero}], q{\set i permute(0, 0)}
+	],
 
 	# GSET
 	[
diff --git a/src/bin/pgbench/t/002_pgbench_no_server.pl b/src/bin/pgbench/t/002_pgbench_no_server.pl
new file mode 100644
index e38c7d7..4027e68
--- a/src/bin/pgbench/t/002_pgbench_no_server.pl
+++ b/src/bin/pgbench/t/002_pgbench_no_server.pl
@@ -341,6 +341,16 @@ my @script_tests = (
 		'set i',
 		[ qr{set i 1 }, qr{\^ error found here} ],
 		{ 'set_i_op' => "\\set i 1 +\n" }
+	],
+	[
+		'not enough arguments to permute',
+		[qr{unexpected number of arguments \(permute\)}],
+		{ 'bad-permute-1.sql' => "\\set i permute(1)\n" }
+	],
+	[
+		'too many arguments to permute',
+		[qr{unexpected number of arguments \(permute\)}],
+		{ 'bad-permute-2.sql' => "\\set i permute(1, 2, 3, 4)\n" }
 	],);
 
 for my $t (@script_tests)

#include <math.h>
#include <stdio.h>
#include <stdlib.h>

typedef unsigned char uint8;
typedef long int64;
typedef unsigned long uint64;
typedef unsigned __int128 uint128;
typedef int64 (*permute_fn)(const int64, const int64, const int64);

#define PRP_PRIMES 16

static uint64 primes[PRP_PRIMES] = {
    8388617,
    8912921,
    9437189,
    9961487,
    10485767,
    11010059,
    11534351,
    12058679,
    12582917,
    13107229,
    13631489,
    14155777,
    14680067,
    15204391,
    15728681,
    16252967
};

#define PRP_ROUNDS 4

static uint64
compute_mask(uint64 n)
{
    n |= n >> 1;
    n |= n >> 2;
    n |= n >> 4;
    n |= n >> 8;
    n |= n >> 16;
    n |= n >> 32;
    return n;
}

static uint64
modular_multiply(uint64 x, uint64 y, const uint64 m)
{
    return (uint128) x * (uint128) y % (uint128) m;
}

#define DK_LCG_MUL 6364136223846793005L
#define DK_LCG_INC 1442695040888963407L

#define LCG_SHIFT 13

static int64
permute(const int64 data, const int64 isize, const int64 seed)
{
    uint64      size = (uint64) isize;
    uint64      v = (uint64) data % size;
    uint64      key = (uint64) seed;
    uint64      mask = compute_mask(size - 1) >> 1;

    if (isize == 1)
        return 0;

    for (unsigned int i = 0, p = key % PRP_PRIMES;
         i < PRP_ROUNDS; i++, p = (p + 1) % PRP_PRIMES)
    {
        uint64      t;

        key = key * DK_LCG_MUL + DK_LCG_INC;
        if (v <= mask)
            v ^= (key >> LCG_SHIFT) & mask;

        key = key * DK_LCG_MUL + DK_LCG_INC;
        t = size - 1 - v;
        if (t <= mask)
        {
            t ^= (key >> LCG_SHIFT) & mask;
            v = size - 1 - t;
        }

        while (size % primes[p] == 0)
            p = (p + 1) % PRP_PRIMES;

        key = key * DK_LCG_MUL + DK_LCG_INC;

        if ((v & 0xffffffffffL) == v)
            v = (primes[p] * v + (key >> LCG_SHIFT)) % size;
        else
            v = (modular_multiply(primes[p], v, size) +
                (key >> LCG_SHIFT)) % size;
    }

    return (int64) v;
}

static int64
permute2(const int64 data, const int64 isize, const int64 seed)
{
    unsigned short eseed[] = { (seed >> 32) & 0xffff,
                               (seed >> 16) &0xffff,
                               seed & 0xffff };
    uint64      size = (uint64) isize;
    uint64      v = (uint64) data % size;
    uint64      mask;
    uint64      top_bit;
    int         i;

    if (isize == 1)
        return 0;

    // This choice of mask satisfies size/2 <= mask <= size-1
    mask = compute_mask(size - 1);
    if (mask >= size) mask >>= 1;

    // Most significant bit of mask
    top_bit = (mask + 1) >> 1;

    for (i = 0; i < 4; i++)
    {
        uint64 m;
        uint64 r;
        uint64 t;

        m = (uint64) (erand48(eseed) * (mask + 1)) | 1;
        r = (uint64) (erand48(eseed) * (mask + 1));
        if (v <= mask)
        {
          v = ((v * m) ^ r) & mask;
          v = ((v << 1) & mask) | (v & top_bit ? 1 : 0);
        }

        r = (uint64) (erand48(eseed) * size);
        v = (v + r) % size;

        m = (uint64) (erand48(eseed) * (mask + 1)) | 1;
        r = (uint64) (erand48(eseed) * (mask + 1));
        t = size - 1 - v;
        if (t <= mask)
        {
          t = ((t * m) ^ r) & mask;
          t = ((t << 1) & mask) | (t & top_bit ? 1 : 0);
          v = size - 1 - t;
        }

        r = (uint64) (erand48(eseed) * size);
        v = (v + r) % size;
    }

    return (int64) v;
}

static int int64_cmp(const void *a, const void *b)
{
  int64 x = *((int64 *) a);
  int64 y = *((int64 *) b);
  return x - y;
}

int main()
{
  permute_fn permute_fn = &permute2;
  unsigned short seed[3] = { 1234, 5678, 9012 };
  int s[] = { 1000, 1001, 1002, 1020, 1021, 1022, 1023, 1024, 1025, 1026,
              2000, 3000, 3900, 3950, 4000, 4090, 4092, 4094, 4095, 4096,
              4097, 4098, 4100, 4200, 4300, 4400, 4500, 5000, 6000, 7000,
              8000, 9000, 9973, 10000, 10001, 10005, 10006, 10007,
              (1<<14)-3, (1<<14)-2, (1<<14)-1, 1<<14, (1<<14)+1, (1<<14)+2,
              (1<<15)-(1<<14)-1, (1<<15)-(1<<14), (1<<15)-1, 1<<15,
              (1<<15)+1, (1<<15)+(1<<14)-1, (1<<15)+(1<<14), (1<<15)+(1<<14)+1,
              1<<16, 1<<17, 1<<18, 1<<19, 1<<20, 1<<21, 1<<22, -1 };
  int x;
  int y;
  int64 size;
  int64 seed64;
  int i;

  for (x = 0, size = s[x]; size > 0; size = s[++x])
  {
    for (y = 1; y <= 4; y++)
    {
      uint64 v1 = (uint64) (erand48(seed) * size);
      uint64 v2 = (v1 + y) % size;
      int N = 10000 + size * 2;
      int64 *rvals = malloc(N * sizeof(int64));
      int64 *pvals = malloc(N * sizeof(int64));
      int64 *dvals = malloc(N * sizeof(int64));
      double Dr = 0;
      double Dp = 0;
      double Dd = 0;
      double Kr;
      double Kp;
      double Kd;
      double D_alpha;

      // Check that we really get a permuation
      seed64 = erand48(seed) * (1L<<48);
      for (i = 0; i < size; i++)
      {
        pvals[i] = (*permute_fn)(i, size, seed64);
      }
      qsort(pvals, size, sizeof(int64), int64_cmp);
      for (i = 0; i < size; i++)
      {
        if (pvals[i] != i)
        {
          printf("permute() failed (size=%ld, seed=%ld)\n", size, seed64);
          for (int j = 0; j < size && j < 100; j++) printf(" %ld", pvals[j]);
          printf("\npvals[%ld] = %ld\n", i, pvals[i]);
          return 1;
        }
      }

      // Test how uniformly random it is with a bunch of different seeds
      for (i = 0; i < N; i++)
      {
        rvals[i] = (int64) (erand48(seed) * size);

        seed64 = erand48(seed) * (1L<<48);
        pvals[i] = (*permute_fn)(v1, size, seed64);
        dvals[i] = pvals[i] - (*permute_fn)(v2, size, seed64);
        if (dvals[i] < 0) dvals[i] += size;
      }
      qsort(rvals, N, sizeof(int64), int64_cmp);
      qsort(pvals, N, sizeof(int64), int64_cmp);
      qsort(dvals, N, sizeof(int64), int64_cmp);
/*
if (size == (1<<10) && y == 1) {
  int64 last=pvals[0];
  int c=1;
  printf("dvals:\n");
  for (i = 1; i < N; i++)
    if (dvals[i] != last) {
      printf("%ld,%d\n", last, c);
      last = dvals[i];
      c=1;
    } else c++;
  printf("%ld,%d\n", last, c);
}
*/
      // K-S test for uniformity
      for (i = 1; i <= N; i++)
      {
        double D;

        D = (double) i / N - (double) rvals[i-1] / size;
        if (D > Dr) Dr = D;
        D = (double) rvals[i-1] / size - (double) (i-1) / N;
        if (D > Dr) Dr = D;

        D = (double) i / N - (double) pvals[i-1] / size;
        if (D > Dp) Dp = D;
        D = (double) pvals[i-1] / size - (double) (i-1) / N;
        if (D > Dp) Dp = D;

        D = (double) i / N - (double) dvals[i-1] / size;
        if (D > Dd) Dd = D;
        D = (double) dvals[i-1] / size - (double) (i-1) / N;
        if (D > Dd) Dd = D;
      }
      free(rvals);
      free(pvals);
      free(dvals);

      Kr = Dr * sqrt(N);
      Kp = Dp * sqrt(N);
      Kd = Dd * sqrt(N);

      // Critical value by confidence level
      //   0.001  1.94947
      //   0.01   1.62762
      //   0.02   1.51743
      //   0.05   1.35810
      //   0.1    1.22385
      //   0.15   1.13795
      //   0.2    1.07275
      D_alpha = 1.94947 / sqrt(N);

      printf("size=%ld, v1=%ld, v2=%ld, N=%d:\n"
             "  Dr=%f, Kr=%f %s\n"
             "  Dp=%f, Kp=%f %s\n"
             "  Dd=%f, Kd=%f %s\n",
             size, v1, v2, N,
             Dr, Kr, Dr > D_alpha ? "non-uniform" : "uniform",
             Dp, Kp, Dp > D_alpha ? "non-uniform" : "uniform",
             Dd, Kd, Dd > D_alpha ? "non-uniform" : "uniform");
    }
  }

  return 0;
}

#include <math.h>
#include <stdio.h>
#include <stdlib.h>

typedef unsigned char uint8;
typedef long int64;
typedef unsigned long uint64;
typedef unsigned __int128 uint128;
typedef int64 (*permute_fn)(const int64, const int64, const int64);

#define PRP_PRIMES 16

static uint64 primes[PRP_PRIMES] = {
    8388617,
    8912921,
    9437189,
    9961487,
    10485767,
    11010059,
    11534351,
    12058679,
    12582917,
    13107229,
    13631489,
    14155777,
    14680067,
    15204391,
    15728681,
    16252967
};

#define PRP_ROUNDS 4

static uint64
compute_mask(uint64 n)
{
    n |= n >> 1;
    n |= n >> 2;
    n |= n >> 4;
    n |= n >> 8;
    n |= n >> 16;
    n |= n >> 32;
    return n;
}

static uint64
modular_multiply(uint64 x, uint64 y, const uint64 m)
{
    return (uint128) x * (uint128) y % (uint128) m;
}

#define DK_LCG_MUL 6364136223846793005L
#define DK_LCG_INC 1442695040888963407L

#define LCG_SHIFT 13

static int64
permute(const int64 data, const int64 isize, const int64 seed)
{
    uint64      size = (uint64) isize;
    uint64      v = (uint64) data % size;
    uint64      key = (uint64) seed;
    uint64      mask = compute_mask(size - 1) >> 1;

    if (isize == 1)
        return 0;

    for (unsigned int i = 0, p = key % PRP_PRIMES;
         i < PRP_ROUNDS; i++, p = (p + 1) % PRP_PRIMES)
    {
        uint64      t;

        key = key * DK_LCG_MUL + DK_LCG_INC;
        if (v <= mask)
            v ^= (key >> LCG_SHIFT) & mask;

        key = key * DK_LCG_MUL + DK_LCG_INC;
        t = size - 1 - v;
        if (t <= mask)
        {
            t ^= (key >> LCG_SHIFT) & mask;
            v = size - 1 - t;
        }

        while (size % primes[p] == 0)
            p = (p + 1) % PRP_PRIMES;

        key = key * DK_LCG_MUL + DK_LCG_INC;

        if ((v & 0xffffffffffL) == v)
            v = (primes[p] * v + (key >> LCG_SHIFT)) % size;
        else
            v = (modular_multiply(primes[p], v, size) +
                (key >> LCG_SHIFT)) % size;
    }

    return (int64) v;
}

static int64
permute2(const int64 data, const int64 isize, const int64 seed)
{
    unsigned short eseed[] = { (seed >> 32) & 0xffff,
                               (seed >> 16) &0xffff,
                               seed & 0xffff };
    uint64      size = (uint64) isize;
    uint64      v = (uint64) data % size;
    uint64      mask;
    uint64      top_bit;
    int         i;

    if (isize == 1)
        return 0;

    // This choice of mask satisfies size/2 <= mask <= size-1
    mask = compute_mask(size - 1);
    if (mask >= size) mask >>= 1;

    // Most significant bit of mask
    top_bit = (mask + 1) >> 1;

    for (i = 0; i < 4; i++)
    {
        uint64 m;
        uint64 r;
        uint64 t;

        m = (uint64) (erand48(eseed) * (mask + 1)) | 1;
        r = (uint64) (erand48(eseed) * (mask + 1));
        if (v <= mask)
        {
          v = ((v * m) ^ r) & mask;
          v = ((v << 1) & mask) | (v & top_bit ? 1 : 0);
        }

        r = (uint64) (erand48(eseed) * size);
        v = (v + r) % size;

        m = (uint64) (erand48(eseed) * (mask + 1)) | 1;
        r = (uint64) (erand48(eseed) * (mask + 1));
        t = size - 1 - v;
        if (t <= mask)
        {
          t = ((t * m) ^ r) & mask;
          t = ((t << 1) & mask) | (t & top_bit ? 1 : 0);
          v = size - 1 - t;
        }

        r = (uint64) (erand48(eseed) * size);
        v = (v + r) % size;
    }

    return (int64) v;
}

static uint64 perm_to_int(int64 *pvals, int64 size)
{
  int64 f = 1;
  int64 N = 0;
  int64 i;

  for (i = 1; i < size; i++)
  {
    N += pvals[i] * f;
    f *= size;
  }
  return N;
}

int main()
{
  permute_fn permute_fn = &permute2;
  unsigned short seed[3] = { 1234, 5678, 9012 };
  int s[] = { 2, 3, 4, 5, 6, 7, 8, 9, -1 };
  int x;
  int y;
  int64 size;
  int64 seed64;
  int i;

  for (x = 0, size = s[x]; size > 0; size = s[++x])
  {
    int64 *pvals = malloc(size * sizeof(int64));
    uint64 id = -1;
    int c = 0;

    for (y = 0; ; y++)
    {
      seed64 = erand48(seed) * (1L<<48);
      for (i = 0; i < size; i++)
        pvals[i] = (*permute_fn)(i, size, seed64);
      printf("size=%d, p=[", size);
      for (i = 0; i < size; i++)
        printf(" %ld", pvals[i]);
      printf(" ]\n");

      if (id == -1) id = perm_to_int(pvals, size);
      else if (perm_to_int(pvals, size) == id && ++c == 100) break;
    }
  }

  return 0;
}

Re: pgbench - add pseudo-random permutation function

Reply via email to