The diff below removes some of the unnecessary complications in the
calculation of the stoeplitz_cache and brings them into a form more
suitable for mathematical reasoning. I added a somewhat dense comment
which explains the full construction and which will help justifying
upcoming diffs.

The observations for the code changes are quite simple:

First, scache->bytes[val] is a uint16_t, and it's easy to see that we
only need the lower 16 bits of res in the second nested pair of for
loops.  The values of key[b] are only xored together, to compute res,
so we only need the lower 16 bits of those, too.

Next, looking at the first nested for loop, we see that the values
0..15 of j only touch the top 16 bits of key[b], so we can skip them.
For b = 0, the inner loop for j in 16..31 scans backwards through skey
and sets the corresponding bits of key[b], so we can see key[0] = skey.
A bit of pondering then leads to the general expression:

                key[b] = skey << b | skey >> (NBSK - b);

I renamed the key array into toeplitz_column since it stores columns of
the Toeplitz matrix.  If key is considered better, I won't insist.

It's not very expensive to brute-force verify that scache->bytes[val]
remains the same for all values of val and all values of skey. I did
this on amd64, sparc64 and powerpc.

Index: sys/net/toeplitz.c
===================================================================
RCS file: /var/cvs/src/sys/net/toeplitz.c,v
retrieving revision 1.1
diff -u -p -r1.1 toeplitz.c
--- sys/net/toeplitz.c  16 Jun 2020 04:46:49 -0000      1.1
+++ sys/net/toeplitz.c  16 Jun 2020 15:08:29 -0000
@@ -76,40 +76,37 @@ stoeplitz_init(void)
 
 #define NBSK (NBBY * sizeof(stoeplitz_key))
 
+/*
+ * The Toeplitz hash of a 16-bit number considered as a column vector over
+ * the field with two elements is calculated as a matrix multiplication with
+ * a 16x16 circulant Toeplitz matrix T generated by skey.
+ *
+ * The first eight columns H of T generate the remaining eight columns using
+ * the byteswap operation J = swap16:  T = [H JH].  Thus, the Toeplitz hash of
+ * n = [hi lo] is computed via the formula T * n = (H * hi) ^ swap16(H * lo).
+ *
+ * Therefore the results H * val for all values of a byte are cached in scache.
+ */
 void
 stoeplitz_cache_init(struct stoeplitz_cache *scache, stoeplitz_key skey)
 {
-       uint32_t key[NBBY];
-       unsigned int j, b, shift, val;
+       uint16_t toeplitz_column[NBBY];
+       unsigned int b, shift, val;
 
-       bzero(key, sizeof(key));
+       bzero(toeplitz_column, sizeof(toeplitz_column));
 
-       /*
-        * Calculate 32bit keys for one byte; one key for each bit.
-        */
-       for (b = 0; b < NBBY; ++b) {
-               for (j = 0; j < 32; ++j) {
-                       unsigned int bit;
+       /* Calculate the first eight columns H of the Toeplitz matrix T. */
+       for (b = 0; b < NBBY; ++b)
+               toeplitz_column[b] = skey << b | skey >> (NBSK - b);
 
-                       bit = b + j;
-
-                       shift = NBSK - (bit % NBSK) - 1;
-                       if (skey & (1 << shift))
-                               key[b] |= 1 << (31 - j);
-               }
-       }
-
-       /*
-        * Cache the results of all possible bit combination of
-        * one byte.
-        */
+       /* Cache the results of H * val for all possible values of a byte. */
        for (val = 0; val < 256; ++val) {
-               uint32_t res = 0;
+               uint16_t res = 0;
 
                for (b = 0; b < NBBY; ++b) {
                        shift = NBBY - b - 1;
                        if (val & (1 << shift))
-                               res ^= key[b];
+                               res ^= toeplitz_column[b];
                }
                scache->bytes[val] = res;
        }

Reply via email to