Hi John,

> Further, we verified upthread that Intel's current and near-future product line
> includes server chips (some with over 100 cores, so not exactly low-end) that
> don't support AVX-512 at all. I have no idea how common they will be, but they
> will certainly be found in cloud datacenters somewhere. Shouldn't we have an
> answer for them as well?

Just submitted a patch to improve the SSE4.2 version using the source you
referenced. See
https://www.postgresql.org/message-id/PH8PR11MB82869FF741DFA4E9A029FF13FBF72%40PH8PR11MB8286.namprd11.prod.outlook.com

> I know you had extended time off work, but I've already shared my findings and
> explained my reasoning [2]. The title of the paper is "Fast CRC Computation for
> iSCSI Polynomial Using CRC32 Instruction", so unsurprisingly it does improve the
> SSE42 version. With a few dozen lines of code, I can get ~3x speedup on
> page-sized inputs. At the very least we want to use this technique on Arm [3],
> and the only blocker now is the question regarding the patents. I'm interested
> to hear the response on this.

Still figuring this out. Will respond as soon as I can.

Thanks,
Raghuveer