Public bug reported:

Calculate the checksum of data that is 16-byte aligned and a multiple of 16 bytes in length.

The first step is to reduce it to 1024 bits. We do this in 8 parallel
chunks in order to mask the latency of the vpmsum instructions. If we
have more than 32 kB of data to checksum, we repeat this step multiple
times, passing in the previous 1024 bits.

The next step is to reduce the 1024 bits to 64 bits. This step adds 32
bits of 0s to the end - this matches what a CRC does. We just calculate
constants that land the data in these 32 bits.

We then use fixed point Barrett reduction to compute a mod n over GF(2),
where n is the CRC polynomial, using POWER8 instructions. We use x = 32.

http://en.wikipedia.org/wiki/Barrett_reduction

** Affects: zlib (Ubuntu)
     Importance: Undecided
     Assignee: Ubuntu on IBM Power Systems Bug Triage (ubuntu-power-triage)
         Status: New

** Tags: architecture-ppc64le bugnameltc-136495 severity-low targetmilestone-inin1804

https://bugs.launchpad.net/bugs/1742941

Title:
  zlib: improve crc32 performance on P8
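
For readers unfamiliar with the final step in the description, here is a
minimal sketch of Barrett reduction over GF(2) in plain Python. It is not
the code from the proposed change: it uses ordinary integers instead of
vpmsum instructions, works on the non-reflected CRC-32 polynomial
0x104C11DB7, and leaves out the bit reflection and the initial/final XOR
that zlib's crc32 applies. The names clmul, poly_divmod, BARRETT_MU and
barrett_reduce are hypothetical, chosen only for illustration.

```python
CRC32_POLY = 0x104C11DB7  # n(x): the CRC-32 polynomial, degree 32 (non-reflected form)


def clmul(a, b):
    """Carry-less multiply: product of two polynomials over GF(2)."""
    result = 0
    while b:
        if b & 1:
            result ^= a
        a <<= 1
        b >>= 1
    return result


def poly_divmod(a, n):
    """Reference long division over GF(2); returns (quotient, remainder)."""
    quotient = 0
    n_bits = n.bit_length()
    while a.bit_length() >= n_bits:
        shift = a.bit_length() - n_bits
        quotient ^= 1 << shift
        a ^= n << shift
    return quotient, a


# Precomputed once: mu = floor(x^64 / n(x)).
BARRETT_MU, _ = poly_divmod(1 << 64, CRC32_POLY)


def barrett_reduce(a):
    """Compute a mod n for a 64-bit polynomial a, using two carry-less multiplies."""
    assert 0 <= a < (1 << 64)
    # Estimate the quotient from the top 32 bits; over GF(2) the estimate is exact.
    q = clmul(a >> 32, BARRETT_MU) >> 32
    # Subtracting (XOR) q * n leaves the 32-bit remainder in the low bits.
    return (a ^ clmul(q, CRC32_POLY)) & 0xFFFFFFFF


if __name__ == "__main__":
    value = 0x123456789ABCDEF0
    print(hex(barrett_reduce(value)))
    print(hex(poly_divmod(value, CRC32_POLY)[1]))  # should match the line above
```

The point of the technique is that the division by n is replaced by two
multiplications with precomputed constants, which is what makes it a good
fit for a carry-less multiply instruction.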
