> At the end of this loop, key[b] contains two copies of the cyclically
> permuted skey next to each other. When building the cache, you scan
> through the bits of val, xor the corresponding keys in if they're set
> and then throw away half of the 32 bits when assigning
> scache->bytes[val] = res;
> 
> So I think you can use "uint16_t keys[NBBY];" and "uint16_t res = 0;",
> replace j < 32 by j < 16 and 31 - j by 15 - j and you'll get the exact
> same result.

In other words, the first nested loop can be simplified to this:

        for (b = 0; b < NBBY; ++b)
                key[b] = skey << b | skey >> (NBSK - b);

and instead of populating the key[] array up front, you could do:

void
stoeplitz_cache_init(struct stoeplitz_cache *scache, stoeplitz_key skey)
{
        unsigned int b, shift, val;

        /*
         * Cache the results of all possible bit combinations of
         * one byte.
         */
        for (val = 0; val < 256; ++val) {
                uint16_t res = 0;

                for (b = 0; b < NBBY; ++b) {
                        shift = NBBY - b - 1;
                        if (val & (1 << shift))
                                res ^= skey << b | skey >> (NBSK - b);
                }
                scache->bytes[val] = res;
        }
}
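
As a sanity check on the claim that the 16-bit form gives the same result,
here is a minimal sketch that compares the 16-bit rotation against rotating
a doubled 32-bit key and keeping the low half. The helper names rotl16(),
rotl32_low16() and rotations_agree() are made up for this illustration, the
NBBY and NBSK values are assumed to be 8 and 16, and the 32-bit variant is
only my reading of what the original loop computes, not a copy of it:

```c
#include <stdint.h>

#define NBBY 8   /* bits per byte (assumed) */
#define NBSK 16  /* bits in the 16-bit key (assumed to match NBSK above) */

/* Rotate a 16-bit key left by b bits, 0 <= b < NBSK. */
static uint16_t
rotl16(uint16_t skey, unsigned int b)
{
        /* skey is promoted to int, so a shift by NBSK - 0 is well defined. */
        return (uint16_t)(skey << b | skey >> (NBSK - b));
}

/* Put two copies of skey side by side, rotate left by b, keep the low half. */
static uint16_t
rotl32_low16(uint16_t skey, unsigned int b)
{
        uint32_t k = (uint32_t)skey << 16 | skey;

        if (b != 0)     /* avoid an undefined 32-bit shift by 32 */
                k = k << b | k >> (32 - b);
        return (uint16_t)k;
}

/* Returns 1 if both forms agree for every shift the cache uses. */
static int
rotations_agree(uint16_t skey)
{
        unsigned int b;

        for (b = 0; b < NBBY; ++b)
                if (rotl16(skey, b) != rotl32_low16(skey, b))
                        return 0;
        return 1;
}
```

Since xor works on each bit independently, agreement of the rotated keys
means the xor of any subset of them agrees as well, so the cached values
come out the same either way.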
