Re: [PATCH 0/2] Add optimized powerpc64 assembly for SHA2
Eric Richter writes: > This set introduces an optimized powerpc64 assembly implementation for > SHA256 and SHA512. This have been derived from BSD-2-Clause licensed > code authored by IBM, originally released in the IBM POWER > Cryptography Reference Implementation project[1], modified to work in > Nettle, contributed under the GPL license. > > Development of this new implementation targetted POWER 10, however > supports the POWER 8 ISA and above. The following commits provide the > performance data I recorded on POWER 10, though similar improvements can > be found on P8/P9. Thanks, I've had a first quick look. Nice speedup, and it looks pretty good. I wasn't aware of the vshasigma instructions. One comment on the Nettle ppc conventions: I prefer to use register names rather than just register numbers; that helps me avoid some confusion when some instructions take v1 registers and others take vs1 registers. Preferably by configuring with ASM_FLAGS=-mregnames during development. For assemblers that don't like register names (seems to be the default), machine.m4 arranges for translation from v1 --> 1, etc. > As an aside: I have tested this patch set on POWER 8 and POWER 10 > hardware running little-endian linux distributions, however I have not > yet been able to test on a big-endian distro. I can confirm however that > the original source in IPCRI does compile and pass tests for both little > and big endian via qemu-user, so spare human error in deriving the > version for Nettle, it is expected to be functional. There are big-endian tests in the ci pipeline (hosted on the mirror repo at https://gitlab.com/gnutls/nettle), using cross-compiling + qemu-user. And I also have a similar setup locally. Regards, /Niels -- Niels Möller. PGP key CB4962D070D77D7FCB8BA36271D8F1FF368C6677. Internet email is subject to wholesale government surveillance. ___ nettle-bugs mailing list -- nettle-bugs@lists.lysator.liu.se To unsubscribe send an email to nettle-bugs-le...@lists.lysator.liu.se
Re: Naming for names in struct nettle_hash
Niels Möller writes: > Hi, I've got a bug report that sha512_224 and sha512_256 are missing in > the list returned by nettle_get_hashes, and I'm about to add them. > > But then there's a question of naming convention. Currently, the > > extern const struct nettle_hash nettle_sha512_256; > > includes a name field set to the string "sha512-256", which is somewhat > inconsistent with, e.g., the struct nettle_sha3_256 which includes the > name "sha3_256". > > Should I just change this (patch below)? I've decided to changing those names to use underscore. Regards, /Niels -- Niels Möller. PGP key CB4962D070D77D7FCB8BA36271D8F1FF368C6677. Internet email is subject to wholesale government surveillance. ___ nettle-bugs mailing list -- nettle-bugs@lists.lysator.liu.se To unsubscribe send an email to nettle-bugs-le...@lists.lysator.liu.se
Re: additional API for SHAKE streaming read
Daiki Ueno writes: > Yes, that looks good to me, except _nettle_sha3_shake has a > copy-and-paste error where SHA3_256_BLOCK_SIZE is hard-coded. Thanks, good catch. >> 1. Decide what should be renamed sha3_shake256_* > > I guess we can live with the existing interface. For SHAKE128, we could > only provide sha3_128_init, sha3_128_update, and > sha3_128_shake{,_output}, without sha3_128_digest. Sounds good to me. >> 2. Implement shake128. > > I've extracted it from the ML-KEM merge request and put it here: > https://git.lysator.liu.se/nettle/nettle/-/merge_requests/63 > > Not sending via email as it includes a huge test vector. Thanks, merged to the sha3-shake-updates branch. Sorry if you didn't intend me to do that right away (I noticed some minor problems after merge, which I've fixed). I'd like to merge to master after ci runs have completed. >> 3. Update docs. > > I can do that once we settle the interface. Excellent. To me, interface in sha3.h now looks good. Regards, /Niels -- Niels Möller. PGP key CB4962D070D77D7FCB8BA36271D8F1FF368C6677. Internet email is subject to wholesale government surveillance. ___ nettle-bugs mailing list -- nettle-bugs@lists.lysator.liu.se To unsubscribe send an email to nettle-bugs-le...@lists.lysator.liu.se
[PATCH 1/2] powerpc64: Add optimized assembly for sha256-compress-n
This patch introduces an optimized powerpc64 assembly implementation for sha256-compress-n. This takes advantage of the vshasigma instruction, as well as unrolling loops to best take advantage of running instructions in parallel. The following data was captured on a POWER 10 LPAR @ ~3.896GHz Current C implementation: Algorithm mode Mbyte/s sha256 update 280.97 hmac-sha256 64 bytes 80.81 hmac-sha256256 bytes 170.50 hmac-sha256 1024 bytes 241.92 hmac-sha256 4096 bytes 268.54 hmac-sha256 single msg 276.16 With optimized assembly: Algorithm mode Mbyte/s sha256 update 446.42 hmac-sha256 64 bytes 124.89 hmac-sha256256 bytes 268.90 hmac-sha256 1024 bytes 382.06 hmac-sha256 4096 bytes 425.38 hmac-sha256 single msg 439.75 Signed-off-by: Eric Richter --- fat-ppc.c | 12 + powerpc64/fat/sha256-compress-n-2.asm | 36 +++ powerpc64/p8/sha256-compress-n.asm| 339 ++ 3 files changed, 387 insertions(+) create mode 100644 powerpc64/fat/sha256-compress-n-2.asm create mode 100644 powerpc64/p8/sha256-compress-n.asm diff --git a/fat-ppc.c b/fat-ppc.c index cd76f7a1..efbeb2ec 100644 --- a/fat-ppc.c +++ b/fat-ppc.c @@ -203,6 +203,10 @@ DECLARE_FAT_FUNC(_nettle_poly1305_blocks, poly1305_blocks_func) DECLARE_FAT_FUNC_VAR(poly1305_blocks, poly1305_blocks_func, c) DECLARE_FAT_FUNC_VAR(poly1305_blocks, poly1305_blocks_func, ppc64) +DECLARE_FAT_FUNC(_nettle_sha256_compress_n, sha256_compress_n_func) +DECLARE_FAT_FUNC_VAR(sha256_compress_n, sha256_compress_n_func, c) +DECLARE_FAT_FUNC_VAR(sha256_compress_n, sha256_compress_n_func, ppc64) + static void CONSTRUCTOR fat_init (void) @@ -231,6 +235,8 @@ fat_init (void) _nettle_ghash_update_arm64() */ _nettle_ghash_set_key_vec = _nettle_ghash_set_key_ppc64; _nettle_ghash_update_vec = _nettle_ghash_update_ppc64; + + _nettle_sha256_compress_n_vec = _nettle_sha256_compress_n_ppc64; } else { @@ -239,6 +245,7 @@ fat_init (void) _nettle_aes_invert_vec = _nettle_aes_invert_c; _nettle_ghash_set_key_vec = _nettle_ghash_set_key_c; _nettle_ghash_update_vec = _nettle_ghash_update_c; + _nettle_sha256_compress_n_vec = _nettle_sha256_compress_n_c; } if (features.have_altivec) { @@ -338,3 +345,8 @@ DEFINE_FAT_FUNC(_nettle_poly1305_blocks, const uint8_t *, size_t blocks, const uint8_t *m), (ctx, blocks, m)) + +DEFINE_FAT_FUNC(_nettle_sha256_compress_n, const uint8_t *, + (uint32_t *state, const uint32_t *k, +size_t blocks, const uint8_t *input), + (state, k, blocks, input)) diff --git a/powerpc64/fat/sha256-compress-n-2.asm b/powerpc64/fat/sha256-compress-n-2.asm new file mode 100644 index ..4f4eee9d --- /dev/null +++ b/powerpc64/fat/sha256-compress-n-2.asm @@ -0,0 +1,36 @@ +C powerpc64/fat/sha256-compress-n-2.asm + +ifelse(` + Copyright (C) 2024 Eric Richter, IBM Corporation + + This file is part of GNU Nettle. + + GNU Nettle is free software: you can redistribute it and/or + modify it under the terms of either: + + * the GNU Lesser General Public License as published by the Free + Software Foundation; either version 3 of the License, or (at your + option) any later version. + + or + + * the GNU General Public License as published by the Free + Software Foundation; either version 2 of the License, or (at your + option) any later version. + + or both in parallel, as here. + + GNU Nettle is distributed in the hope that it will be useful, + but WITHOUT ANY WARRANTY; without even the implied warranty of + MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU + General Public License for more details. + + You should have received copies of the GNU General Public License and + the GNU Lesser General Public License along with this program. If + not, see http://www.gnu.org/licenses/. +') + +dnl PROLOGUE(_nettle_sha256_compress_n) picked up by configure + +define(`fat_transform', `$1_ppc64') +include_src(`powerpc64/p8/sha256-compress-n.asm') diff --git a/powerpc64/p8/sha256-compress-n.asm b/powerpc64/p8/sha256-compress-n.asm new file mode 100644 index ..52f548dc --- /dev/null +++ b/powerpc64/p8/sha256-compress-n.asm @@ -0,0 +1,339 @@ +C x86_64/sha256-compress-n.asm + +ifelse(` + Copyright (C) 2024 Eric Richter, IBM Corporation + + This file is part of GNU Nettle. + + GNU Nettle is free software: you can redistribute it and/or + modify it under the terms of either: + + * the GNU Lesser General Public License as published by the Free + Software Foundation; either version 3 of the License, or (at your + option) any later version. + + or + + * the GNU General Public License as published by
[PATCH 0/2] Add optimized powerpc64 assembly for SHA2
This set introduces an optimized powerpc64 assembly implementation for SHA256 and SHA512. This have been derived from BSD-2-Clause licensed code authored by IBM, originally released in the IBM POWER Cryptography Reference Implementation project[1], modified to work in Nettle, contributed under the GPL license. Development of this new implementation targetted POWER 10, however supports the POWER 8 ISA and above. The following commits provide the performance data I recorded on POWER 10, though similar improvements can be found on P8/P9. As an aside: I have tested this patch set on POWER 8 and POWER 10 hardware running little-endian linux distributions, however I have not yet been able to test on a big-endian distro. I can confirm however that the original source in IPCRI does compile and pass tests for both little and big endian via qemu-user, so spare human error in deriving the version for Nettle, it is expected to be functional. [1] https://github.com/ibm/ipcri Eric Richter (2): powerpc64: Add optimized assembly for sha256-compress-n powerpc64: Add optimized assembly for sha512-compress-n powerpc64/p8/sha256-compress-n.asm | 339 powerpc64/p8/sha512-compress.asm | 345 + 2 files changed, 684 insertions(+) create mode 100644 powerpc64/p8/sha256-compress-n.asm create mode 100644 powerpc64/p8/sha512-compress.asm -- 2.43.0 ___ nettle-bugs mailing list -- nettle-bugs@lists.lysator.liu.se To unsubscribe send an email to nettle-bugs-le...@lists.lysator.liu.se
[PATCH 2/2] powerpc64: Add optimized assembly for sha512-compress-n
This patch introduces an optimized powerpc64 assembly implementation for sha512-compress, derived from the implementation for sha256-compress-n. The following data was captured on a POWER 10 LPAR @ ~3.896GHz Current C implementation: Algorithm mode Mbyte/s sha512 update 447.02 sha512-224 update 444.30 sha512-256 update 445.02 hmac-sha512 64 bytes 97.27 hmac-sha512256 bytes 204.55 hmac-sha512 1024 bytes 342.86 hmac-sha512 4096 bytes 409.57 hmac-sha512 single msg 433.95 With optimized assembly: Algorithm mode Mbyte/s sha512 update 705.36 sha512-224 update 705.63 sha512-256 update 705.34 hmac-sha512 64 bytes 141.66 hmac-sha512256 bytes 310.26 hmac-sha512 1024 bytes 534.22 hmac-sha512 4096 bytes 641.74 hmac-sha512 single msg 677.14 Signed-off-by: Eric Richter --- fat-ppc.c | 10 + powerpc64/fat/sha512-compress-2.asm | 36 +++ powerpc64/p8/sha512-compress.asm| 345 3 files changed, 391 insertions(+) create mode 100644 powerpc64/fat/sha512-compress-2.asm create mode 100644 powerpc64/p8/sha512-compress.asm diff --git a/fat-ppc.c b/fat-ppc.c index efbeb2ec..a228386a 100644 --- a/fat-ppc.c +++ b/fat-ppc.c @@ -207,6 +207,10 @@ DECLARE_FAT_FUNC(_nettle_sha256_compress_n, sha256_compress_n_func) DECLARE_FAT_FUNC_VAR(sha256_compress_n, sha256_compress_n_func, c) DECLARE_FAT_FUNC_VAR(sha256_compress_n, sha256_compress_n_func, ppc64) +DECLARE_FAT_FUNC(_nettle_sha512_compress, sha512_compress_func) +DECLARE_FAT_FUNC_VAR(sha512_compress, sha512_compress_func, c) +DECLARE_FAT_FUNC_VAR(sha512_compress, sha512_compress_func, ppc64) + static void CONSTRUCTOR fat_init (void) @@ -237,6 +241,7 @@ fat_init (void) _nettle_ghash_update_vec = _nettle_ghash_update_ppc64; _nettle_sha256_compress_n_vec = _nettle_sha256_compress_n_ppc64; + _nettle_sha512_compress_vec = _nettle_sha512_compress_ppc64; } else { @@ -246,6 +251,7 @@ fat_init (void) _nettle_ghash_set_key_vec = _nettle_ghash_set_key_c; _nettle_ghash_update_vec = _nettle_ghash_update_c; _nettle_sha256_compress_n_vec = _nettle_sha256_compress_n_c; + _nettle_sha512_compress_vec = _nettle_sha512_compress_c; } if (features.have_altivec) { @@ -350,3 +356,7 @@ DEFINE_FAT_FUNC(_nettle_sha256_compress_n, const uint8_t *, (uint32_t *state, const uint32_t *k, size_t blocks, const uint8_t *input), (state, k, blocks, input)) + +DEFINE_FAT_FUNC(_nettle_sha512_compress, void, + (uint64_t *state, const uint8_t *input, const uint64_t *k), + (state, input, k)) diff --git a/powerpc64/fat/sha512-compress-2.asm b/powerpc64/fat/sha512-compress-2.asm new file mode 100644 index ..9445e5ba --- /dev/null +++ b/powerpc64/fat/sha512-compress-2.asm @@ -0,0 +1,36 @@ +C powerpc64/fat/sha512-compress-2.asm + +ifelse(` + Copyright (C) 2024 Eric Richter, IBM Corporation + + This file is part of GNU Nettle. + + GNU Nettle is free software: you can redistribute it and/or + modify it under the terms of either: + + * the GNU Lesser General Public License as published by the Free + Software Foundation; either version 3 of the License, or (at your + option) any later version. + + or + + * the GNU General Public License as published by the Free + Software Foundation; either version 2 of the License, or (at your + option) any later version. + + or both in parallel, as here. + + GNU Nettle is distributed in the hope that it will be useful, + but WITHOUT ANY WARRANTY; without even the implied warranty of + MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU + General Public License for more details. + + You should have received copies of the GNU General Public License and + the GNU Lesser General Public License along with this program. If + not, see http://www.gnu.org/licenses/. +') + +dnl PROLOGUE(_nettle_sha512_compress) picked up by configure + +define(`fat_transform', `$1_ppc64') +include_src(`powerpc64/p8/sha512-compress.asm') diff --git a/powerpc64/p8/sha512-compress.asm b/powerpc64/p8/sha512-compress.asm new file mode 100644 index ..36dd011c --- /dev/null +++ b/powerpc64/p8/sha512-compress.asm @@ -0,0 +1,345 @@ +C x86_64/sha512-compress.asm + +ifelse(` + Copyright (C) 2024 Eric Richter, IBM Corporation + + This file is part of GNU Nettle. + + GNU Nettle is free software: you can redistribute it and/or + modify it under the terms of either: + + * the GNU Lesser General Public License as published by the Free + Software Foundation; either version 3 of the License, or (at your + option) any later version. + + or + + * the GNU General
Re: additional API for SHAKE streaming read
Niels Möller writes: > Niels Möller writes: > >> I'll try to clean up and post or commit some of my changes, I'm sorry >> that will cause some conflicts. > > I've pushed my changes to a branch sha3-shake-updates, does that look > reasonable to you? If so, I think the next steps are Yes, that looks good to me, except _nettle_sha3_shake has a copy-and-paste error where SHA3_256_BLOCK_SIZE is hard-coded. > 1. Decide what should be renamed sha3_shake256_* I guess we can live with the existing interface. For SHAKE128, we could only provide sha3_128_init, sha3_128_update, and sha3_128_shake{,_output}, without sha3_128_digest. > 2. Implement shake128. I've extracted it from the ML-KEM merge request and put it here: https://git.lysator.liu.se/nettle/nettle/-/merge_requests/63 Not sending via email as it includes a huge test vector. > 3. Update docs. I can do that once we settle the interface. Regards, -- Daiki Ueno ___ nettle-bugs mailing list -- nettle-bugs@lists.lysator.liu.se To unsubscribe send an email to nettle-bugs-le...@lists.lysator.liu.se