From: Linuxppc-dev Daniel Axtens
> Sent: 15 March 2017 12:38
> The core nuts and bolts of the crc32c vpmsum algorithm will
> also work for a number of other CRC algorithms with different
> polynomials. Factor out the function into a new asm file.
> To handle multiple users of the function, a user simply
> provides constants, defines the name of their CRC function,
> and then #includes the core algorithm file.
While not part of this change, the unrolled loops look as though
they just destroy the cpu cache.
I'd like to be convinced that anything does CRC over long enough buffers
to make it a gain at all.
With modern (not that modern now) superscalar cpus you can often
get the loop instructions 'for free'.
Sometimes pipelining the loop is needed to get full throughput.
Unlike the IP checksum, you don't even have to 'loop carry' the
cpu carry flag.
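
To illustrate that last point, here is a rough sketch in plain C. It is
not the powerpc vpmsum code, and the function names are made up for the
example. The only value the CRC loop carries between iterations is the
crc itself. The checksum loop also has to keep the carries, which this
sketch does by accumulating them in the top half of a wider register;
an asm version using add-with-carry has to keep the carry flag alive
across the loop instead.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Bitwise CRC32c (Castagnoli, reflected polynomial 0x82F63B78).
 * Deliberately not unrolled: the only loop-carried state is 'crc'.
 */
uint32_t crc32c_simple(const uint8_t *buf, size_t len)
{
    uint32_t crc = 0xFFFFFFFFu;

    for (size_t i = 0; i < len; i++) {
        crc ^= buf[i];
        for (int bit = 0; bit < 8; bit++) {
            /* Mask is all-ones when the low bit is set, else zero. */
            uint32_t mask = 0u - (crc & 1u);
            crc = (crc >> 1) ^ (0x82F63B78u & mask);
        }
    }
    return ~crc;
}

/*
 * Internet checksum in the style of RFC 1071, over big-endian 16-bit words.
 * The carries out of bit 15 pile up in the top half of 'sum' and are
 * folded back in at the end (end-around carry).
 * Assumption: the buffer is short enough (under about 128KiB) that the
 * 32-bit accumulator cannot overflow.
 */
uint16_t ip_checksum_simple(const uint8_t *buf, size_t len)
{
    uint32_t sum = 0;
    size_t i;

    for (i = 0; i + 1 < len; i += 2)
        sum += ((uint32_t)buf[i] << 8) | buf[i + 1];
    if (i < len)                    /* odd trailing byte, zero padded */
        sum += (uint32_t)buf[i] << 8;

    while (sum >> 16)               /* fold the carries back in */
        sum = (sum & 0xFFFFu) + (sum >> 16);

    return (uint16_t)~sum;
}
```

Neither loop is tuned. They only show which state has to survive from
one iteration to the next.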