On Thu, Jun 18, 2020 at 6:58 PM Maamoun TK <[email protected]>
wrote:

> I added a PowerPC64LE optimized version of AES and GHASH to nettle.
> Patch summary:
>
>  GHASH Algorithm
>
> I drew on several references to arrive at a high-speed implementation of
> this algorithm. They describe a number of techniques for improving its
> performance; the most important ones used here are summarized below:
>
>    - The main equation: The main equation for 4 blocks (128 bits each) can
>    be seen in reference [1]:
>    Digest = (((((((Digest⊕C0)*H)⊕C1)*H)⊕C2)*H)⊕C3)*H
>           = ((Digest⊕C0)*H^4)⊕(C1*H^3)⊕(C2*H^2)⊕(C3*H)
>    To achieve more parallelism, this equation can be extended to process
>    8 blocks per loop iteration:
>    Digest = ((Digest⊕C0)*H^8)⊕(C1*H^7)⊕(C2*H^6)⊕(C3*H^5)
>             ⊕(C4*H^4)⊕(C5*H^3)⊕(C6*H^2)⊕(C7*H)
>    - Handling bit-reflection of the multiplication product [1]: This
>    technique moves part of the workload from inside the loop to the init
>    function, so it is executed only once.
>    - Karatsuba Algorithm: This algorithm performs three multiplication
>    instructions instead of four, in exchange for two additional XORs. This
>    technique is well explained with figures in reference [1].
>    - Deferred Recombination of partial products: This technique is well
>    explained with figures in reference [1].
>    - Multiplication-based reduction: I tested both classical shift-based
>    reduction and multiplication-based reduction. The multiplication-based
>    reduction achieved better performance with fewer instructions. Examples
>    of both reductions can be seen in reference [2].
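
A small Python sketch of two of the identities quoted above, for anyone who
wants to check them numerically. It is not the patch's code: it uses plain
polynomial arithmetic modulo x^128 + x^7 + x^2 + x + 1 and ignores GCM's
bit-reflected representation, and every function name in it (clmul,
gf_reduce, clmul128_karatsuba, ghash_horner, ghash_aggregated) is made up
for illustration. It checks that a Karatsuba split using three 64-bit
carry-less multiplies matches the schoolbook product, and that processing
n blocks per step with precomputed powers of H matches the block-at-a-time
recurrence. As a simplification chosen for the sketch, each group is reduced
only once, which is valid because reduction is linear over GF(2).

```python
import random

# Reduction polynomial of the GCM field: x^128 + x^7 + x^2 + x + 1.
# Assumption: plain (non-reflected) bit order, unlike real GCM.
POLY = (1 << 128) | 0x87
MASK64 = (1 << 64) - 1


def clmul(a, b):
    """Carry-less (GF(2) polynomial) product of two non-negative ints."""
    r = 0
    while b:
        if b & 1:
            r ^= a
        a <<= 1
        b >>= 1
    return r


def gf_reduce(p):
    """Reduce a polynomial modulo POLY, clearing bits from the top down."""
    for i in range(p.bit_length() - 1, 127, -1):
        if (p >> i) & 1:
            p ^= POLY << (i - 128)
    return p


def gf_mul(a, b):
    """Multiplication in GF(2^128) (non-reflected representation)."""
    return gf_reduce(clmul(a, b))


def clmul128_karatsuba(a, b):
    """128x128 carry-less product from three 64x64 products."""
    a1, a0 = a >> 64, a & MASK64
    b1, b0 = b >> 64, b & MASK64
    hi = clmul(a1, b1)
    lo = clmul(a0, b0)
    # (a1^a0)*(b1^b0) = hi ^ lo ^ (a1*b0 ^ a0*b1), so XOR out hi and lo.
    mid = clmul(a1 ^ a0, b1 ^ b0) ^ hi ^ lo
    return (hi << 128) ^ (mid << 64) ^ lo


def ghash_horner(digest, blocks, h):
    """Block-at-a-time recurrence: digest = (digest ^ C) * H."""
    for c in blocks:
        digest = gf_mul(digest ^ c, h)
    return digest


def ghash_aggregated(digest, blocks, h, n=4):
    """Process n blocks per step; len(blocks) must be a multiple of n."""
    powers = [h]
    for _ in range(n - 1):
        powers.append(gf_mul(powers[-1], h))
    powers.reverse()  # [H^n, H^(n-1), ..., H^1]
    for i in range(0, len(blocks), n):
        acc = 0
        for k, c in enumerate(blocks[i:i + n]):
            x = c ^ digest if k == 0 else c
            acc ^= clmul128_karatsuba(x, powers[k])  # left unreduced
        digest = gf_reduce(acc)  # one reduction per group
    return digest


if __name__ == "__main__":
    rng = random.Random(2020)
    h = rng.getrandbits(128)
    blocks = [rng.getrandbits(128) for _ in range(16)]
    ref = ghash_horner(0, blocks, h)
    print(ref == ghash_aggregated(0, blocks, h, n=4))
    print(ref == ghash_aggregated(0, blocks, h, n=8))
```

With n=8 the inner loop corresponds to the 8-block form of the equation; the
result should agree with the recurrence for any key and any input whose
length is a multiple of n blocks.
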
>
>  AES
>    Power ISA makes it easy to optimize AES by offering built-in AES
> instructions.
>
> AES-GCM performance (tested on POWER9):
>
>    - GCM_AES Encrypt: ~13.5x the speed of the nettle C implementation
>    - GCM_AES Decrypt: ~13.5x the speed of the nettle C implementation
>    - GCM_AES Update (only GHASH is called): ~26x the speed of the nettle C
>    implementation
>
> Notes:
>
>    - A 128-byte test is added to gcm-test in the testsuite to exercise the
>    8x loop in the optimized GHASH function.
>    - Since the functionality of gcm_set_key() is replaced with
>    gcm_init_key() for PowerPC64LE, two warnings will pop up: [‘gcm_gf_shift’
>    defined but not used] and [‘gcm_gf_add’ defined but not used]
>
>  References:
>  [1] https://www.intel.com/content/dam/www/public/us/en/documents/white-papers/communications-ia-galois-counter-mode-paper.pdf
>  [2] https://www.intel.com/content/dam/www/public/us/en/documents/software-support/enabling-high-performance-gcm.pdf
>  [3] https://software.intel.com/file/24918
>  [4] https://github.com/dot-asm/cryptogams
>
_______________________________________________
nettle-bugs mailing list
[email protected]
http://lists.lysator.liu.se/mailman/listinfo/nettle-bugs
