+ * For This Function:
+ * Copyright 2015 The Chromium Authors

I went and looked at the Chromium source, and found the following snippet that uses the same technique, but only requires 128-bit CLMUL and has a minimum input size of 64 bytes, rather than 256. This seems like it might be better suited for shorter inputs. It also seems much easier than trying to get the AVX-512 hippo to dance. It uses the IEEE polynomial, so it would need new constants calculated for ours, but that had to be done for the shared patch, too.
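
To give a rough idea of what recalculating the constants involves: the folding constants are remainders of powers of x modulo the CRC polynomial. Below is a minimal sketch, not the Chromium code and not the patch. The helper name xpow_mod is hypothetical, it works in the plain (non-reflected) bit order, and it leaves out the choice of exponents and the bit reflection that a real CLMUL implementation needs.

```c
#include <stdint.h>

/*
 * Hypothetical helper (an assumption for illustration, not taken from
 * Chromium or from the patch): return x^n mod P(x) over GF(2).
 *
 * "poly" holds the low 32 bits of the degree-32 polynomial P; the x^32
 * term is implicit. Bits are in non-reflected order, so bit i is the
 * coefficient of x^i. For the IEEE polynomial, poly is 0x04C11DB7.
 */
static uint32_t
xpow_mod(unsigned n, uint32_t poly)
{
    uint32_t    r = 1;          /* start from x^0 */

    while (n-- > 0)
    {
        /* multiplying by x overflows into x^32 if the top bit is set */
        uint32_t    carry = r & 0x80000000u;

        r <<= 1;
        if (carry)
            r ^= poly;          /* reduce: x^32 is congruent to poly */
    }
    return r;
}
```

Passing a different value for poly gives the remainders for another polynomial. Which exponents to use, and how the results are reflected and packed into 64-bit lanes, depends on the fold width and would have to be matched against the snippet itself.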
https://github.com/chromium/chromium/blob/main/third_party/zlib/crc32_simd.c#L215

--
John Naylor
Amazon Web Services