Hi John,

Thanks for your summary. My responses are inline below:

> #1 - The choice of AVX-512. There is no such thing as a "CRC instruction
> operating on 8 bytes", and the proposed algorithm is a multistep process
> using carryless multiplication and requiring at least 256 bytes of input.
> The Chromium sources cited as the source for this patch also contain an
> implementation using 128-bit instructions, and which only requires at
> least 64 bytes of input. Is there a reason that not tested or proposed as
> well? That would be much easier to read/maintain, work on more systems,
> and might give a speed boost on smaller inputs. These are useful
> properties to have.
>
> https://github.com/chromium/chromium/blob/main/third_party/zlib/crc32_simd.c#L215

Agreed. Postgres already has the SSE42 version, pg_comp_crc32c_sse42, but I didn’t realize that it uses the crc32 instruction, which processes only 8 bytes at a time. It can certainly be upgraded to process 64 bytes at a time, which should be faster. Since most of the AVX-512 work is almost ready, I propose doing this in a follow-up patch immediately afterwards. Let me know if you disagree. The AVX-512 version processes 256 bytes at a time and will most certainly be faster than the improved SSE42 version, which is why the Chromium library has both AVX-512 and SSE42 implementations.

> #2 - The legal status of the algorithm from following Intel white paper,
> which is missing from its original location, archived here:
>
> https://web.archive.org/web/20220802143127/https://www.intel.com/content/dam/www/public/us/en/documents/white-papers/crc-iscsi-polynomial-crc32-instruction-paper.pdf
>
> https://github.com/torvalds/linux/blob/master/arch/x86/crypto/crc32c-pcl-intel-asm_64.S
>
> ...so I'm unclear if these patents are applicable to software implementations.
> They also seem to be expired, but I am not a lawyer.
> Could you look into this please? Even if we do end up with AVX-512, this
> would be a good fallback.

Given that SSE42 is available on pretty much all x86 processors at this point, do we need a fallback C version, especially after we improve the SSE42 version?

Raghuveer
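
For reference, here is a minimal sketch of what "8 bytes at a time" means for a crc32-instruction loop like the one discussed in #1. This is not the PostgreSQL code: the names crc32c_bitwise and crc32c_sketch are hypothetical, the fast path is only compiled when SSE4.2 is enabled at build time (e.g. -msse4.2), and the bitwise loop stands in for both the tail handling and a plain C fallback.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>
#if defined(__SSE4_2__) && defined(__x86_64__)
#include <nmmintrin.h>          /* _mm_crc32_u64 */
#endif

/*
 * Bit-at-a-time CRC-32C update (reflected Castagnoli polynomial 0x82F63B78).
 * No initial or final inversion is applied here; the caller handles that.
 */
static uint32_t
crc32c_bitwise(uint32_t crc, const unsigned char *p, size_t len)
{
    while (len--)
    {
        crc ^= *p++;
        for (int k = 0; k < 8; k++)
            crc = (crc >> 1) ^ (0x82F63B78u & (0u - (crc & 1u)));
    }
    return crc;
}

/*
 * Hypothetical helper: CRC-32C of a buffer. When SSE4.2 is available at
 * compile time, each crc32 instruction consumes one 8-byte chunk; the
 * remaining 0..7 bytes (or the whole buffer otherwise) go through the
 * bitwise loop.
 */
static uint32_t
crc32c_sketch(const void *data, size_t len)
{
    const unsigned char *p = data;
    uint32_t    crc = 0xFFFFFFFFu;      /* standard initial value */

#if defined(__SSE4_2__) && defined(__x86_64__)
    while (len >= 8)
    {
        uint64_t    chunk;

        memcpy(&chunk, p, 8);           /* avoids unaligned-access issues */
        crc = (uint32_t) _mm_crc32_u64(crc, chunk);
        p += 8;
        len -= 8;
    }
#endif
    crc = crc32c_bitwise(crc, p, len);
    return crc ^ 0xFFFFFFFFu;           /* standard final inversion */
}
```

With either path, crc32c_sketch("123456789", 9) should return 0xE3069283, the standard CRC-32C check value. The carryless-multiplication variants mentioned above instead fold much larger blocks per step, which is why they need a minimum input length.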