Public bug reported:

Calculate the checksum of data that is 16-byte aligned and whose length
is a multiple of 16 bytes.

The first step is to reduce it to 1024 bits. We do this in 8 parallel
 chunks in order to mask the latency of the vpmsum instructions. If we
 have more than 32 kB of data to checksum we repeat this step multiple
 times, passing in the previous 1024 bits.
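
The reduction step relies on a congruence: the high part of a long message can be multiplied by a precomputed constant and XORed into the low part without changing the remainder. The sketch below illustrates only that principle, using Python integers as polynomials over GF(2). The polynomial value 0x104C11DB7 and the helper names `clmul`, `poly_mod` and `fold` are assumptions made for illustration; it does not reproduce the 8-lane layout or the actual constants used by the POWER8 code.

```python
POLY = 0x104C11DB7  # assumed CRC-32 polynomial; bit i is the coefficient of x^i

def clmul(a: int, b: int) -> int:
    """Carry-less (GF(2)) multiply of two polynomials."""
    result = 0
    while b:
        if b & 1:
            result ^= a
        a <<= 1
        b >>= 1
    return result

def poly_mod(a: int, n: int) -> int:
    """Remainder of a divided by n over GF(2), by plain long division."""
    while a.bit_length() >= n.bit_length():
        a ^= n << (a.bit_length() - n.bit_length())
    return a

def fold(a: int, keep_bits: int) -> int:
    """Fold everything above keep_bits into the low part.

    The result is congruent to a modulo POLY but shorter than a.
    """
    high = a >> keep_bits
    low = a & ((1 << keep_bits) - 1)
    k = poly_mod(1 << keep_bits, POLY)  # constant: x^keep_bits mod POLY
    return clmul(high, k) ^ low

# 512 bytes of sample data, folded once toward a 1024-bit window.
data = int.from_bytes(bytes(range(256)) * 2, "big")
folded = fold(data, 1024)
```

Because the constant depends only on the fold distance, it can be computed once ahead of time, which is what makes independent chunks practical.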

 The next step is to reduce the 1024 bits to 64 bits. This step adds
 32 bits of 0s to the end - this matches what a CRC does. We just
 calculate constants that land the data in these 32 bits.
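
To see why appending 32 zero bits matches what a CRC does, the following sketch compares a textbook bit-at-a-time CRC register with a direct polynomial remainder of the message shifted left by 32 bits. It assumes the non-reflected polynomial 0x104C11DB7, a zero initial value and no final XOR, so it is not the exact crc32 variant zlib exposes; `crc_raw` and `poly_mod` are hypothetical helper names.

```python
POLY = 0x104C11DB7  # assumed CRC-32 polynomial, non-reflected form

def poly_mod(a: int, n: int) -> int:
    """Remainder of a divided by n over GF(2), by plain long division."""
    while a.bit_length() >= n.bit_length():
        a ^= n << (a.bit_length() - n.bit_length())
    return a

def crc_raw(data: bytes) -> int:
    """MSB-first bitwise CRC with zero initial value and no final XOR."""
    crc = 0
    for byte in data:
        crc ^= byte << 24          # feed the next 8 message bits into the top
        for _ in range(8):
            crc <<= 1
            if crc & (1 << 32):    # bit shifted out of the 32-bit register
                crc ^= POLY
    return crc

msg = b"123456789"
as_poly = int.from_bytes(msg, "big")

# The CRC register equals (message * x^32) mod POLY, i.e. the message
# with 32 zero bits appended, reduced by the polynomial.
shifted_remainder = poly_mod(as_poly << 32, POLY)
```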

 We then use fixed point Barrett reduction to compute a mod n over GF(2)
 for n = CRC (the CRC polynomial) using POWER8 instructions. We use x = 32.

 http://en.wikipedia.org/wiki/Barrett_reduction
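
A minimal sketch of Barrett reduction over GF(2) with x = 32 follows. It reduces a value of fewer than 64 bits to a 32-bit remainder using two carry-less multiplies instead of a division. The polynomial 0x104C11DB7, the non-reflected bit order and the names `clmul`, `poly_divmod`, `MU` and `barrett_mod` are assumptions for illustration; the POWER8 code may use a bit-reflected variant and different constants.

```python
POLY = 0x104C11DB7  # assumed n: the CRC-32 polynomial, degree 32

def clmul(a: int, b: int) -> int:
    """Carry-less (GF(2)) multiply of two polynomials."""
    result = 0
    while b:
        if b & 1:
            result ^= a
        a <<= 1
        b >>= 1
    return result

def poly_divmod(a: int, n: int) -> tuple[int, int]:
    """Quotient and remainder of a divided by n over GF(2)."""
    q = 0
    while a.bit_length() >= n.bit_length():
        shift = a.bit_length() - n.bit_length()
        q |= 1 << shift
        a ^= n << shift
    return q, a

# Precomputed once: mu = floor(x^64 / n).
MU, _ = poly_divmod(1 << 64, POLY)

def barrett_mod(a: int) -> int:
    """Compute a mod POLY for a below 2^64 without dividing."""
    if a >> 64:
        raise ValueError("input must fit in 64 bits")
    q = clmul(a >> 32, MU) >> 32               # quotient estimate, exact over GF(2)
    return (a ^ clmul(q, POLY)) & 0xFFFFFFFF   # a - q*n, keeping the low 32 bits

samples = [0, 1, 0xFFFFFFFF, 1 << 32, 0x0123456789ABCDEF, (1 << 64) - 1]
results = [barrett_mod(a) for a in samples]
```

Over GF(2) there are no carries, so the quotient estimate needs no correction step, unlike integer Barrett reduction.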

** Affects: zlib (Ubuntu)
     Importance: Undecided
     Assignee: Ubuntu on IBM Power Systems Bug Triage (ubuntu-power-triage)
         Status: New


** Tags: architecture-ppc64le bugnameltc-136495 severity-low 
targetmilestone-inin1804

-- 
You received this bug notification because you are a member of Ubuntu
Bugs, which is subscribed to Ubuntu.
https://bugs.launchpad.net/bugs/1742941

Title:
  zlib: improve crc32 performance on P8

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/zlib/+bug/1742941/+subscriptions

-- 
ubuntu-bugs mailing list
[email protected]
https://lists.ubuntu.com/mailman/listinfo/ubuntu-bugs
