Re: Release of Nettle-3.7?

2021-01-13 Thread Michael Weiser
Hello Niels, On Wed, Jan 13, 2021 at 01:43:38PM +0100, Niels Möller wrote: > > Attached is the new patch that unconditionally switches from vldm to > > vld1.32 but > > keeps vstm in favour of vst1.8 on little-endian for stores. > Thanks! Applied now. Perfect! Incidentally: The other day I was

Old ARM Neon code for salsa20 and chacha (was: Re: Release of Nettle-3.7?)

2021-01-13 Thread Niels Möller
ni...@lysator.liu.se (Niels Möller) writes: > I've done a benchmark run of nettle-3.6 on the GMP "nanot2" system, with > a Cortex-A9 processor. The installed compiler is gcc-5.4 (a few years > old). I chose Cortex-A9 for this test in an attempt to reproduce my old numbers. Even if it's probably

Re: Release of Nettle-3.7?

2021-01-13 Thread Niels Möller
Michael Weiser writes: > Attached is the new patch that unconditionally switches from vldm to vld1.32 > but > keeps vstm in favour of vst1.8 on little-endian for stores. Thanks! Applied now. > From that point of view, the slight performance hit for vld1.32 but > keeping of vstm on LE seems

Re: Release of Nettle-3.7?

2021-01-03 Thread Michael Weiser
Hello Niels, On Fri, Jan 01, 2021 at 06:07:14PM +0100, Niels Möller wrote: > > With the help of Jeff I've gone on a bit of a benchmark binge using a: > > > > - Raspberry Pi 1B (Broadcom BCM2835, arm11), > > - Cubieboard2 (Allwinner A20, Cortex-A7), > > - Wandboard (Freescale i.MX6 DualLite,

Re: Release of Nettle-3.7?

2021-01-02 Thread Niels Möller
I've made a release candidate tarball, see http://www.lysator.liu.se/~nisse/archive/nettle-3.7rc1.tar.gz Intend to release in a day or two. I mostly trust the ci system, so I will do only a few tests on the tarball to try to catch any packaging mistakes. As usual, any additional testing highly

Re: Release of Nettle-3.7?

2021-01-01 Thread Niels Möller
ni...@lysator.liu.se (Niels Möller) writes: > Thanks for investigating. So from these charts, it looks like the > single-block Neon code is of no benefit on any of the test systems. And > even significantly slower on the tinkerboard and rpi4. > > If that's right, the code should probably just be

Re: Release of Nettle-3.7?

2021-01-01 Thread Niels Möller
Michael Weiser writes: > Happy new year, Niels and all around, > > On Wed, Dec 30, 2020 at 09:12:24PM +0100, Niels Möller wrote: > >> > It comes out at around seven cycles per block slowdown for chacha-3core >> > and five for salsa20-2core. I trace this to vst1.8. It's just slower >> Thanks for

Re: Release of Nettle-3.7?

2021-01-01 Thread Michael Weiser
Happy new year, Niels and all around, On Wed, Dec 30, 2020 at 09:12:24PM +0100, Niels Möller wrote: > > It comes out at around seven cycles per block slowdown for chacha-3core > > and five for salsa20-2core. I trace this to vst1.8. It's just slower > Thanks for investigating. Maybe keep some

Re: Release of Nettle-3.7?

2020-12-30 Thread Jeffrey Walton
On Tue, Dec 29, 2020 at 5:15 PM Michael Weiser wrote: > > ... > Do you (or anybody else) have a hardware arm board for testing, possibly > with a Cortex A8 or A9 implementation to see how it behaves there? I've got a Wandboard/Cortex-A9 and Tinkerboard/Cortex-A17 hanging off the internet with

Re: Release of Nettle-3.7?

2020-12-30 Thread Niels Möller
Michael Weiser writes: > It comes out at around seven cycles per block slowdown for chacha-3core > and five for salsa20-2core. I trace this to vst1.8. It's just slower > than vstm (in contrast to vldm vs. vld1.32). I managed to save a > cumulative two cycles by rescheduling instructions so that

Re: Release of Nettle-3.7?

2020-12-29 Thread Michael Weiser
Hello Niels, On Fri, Dec 25, 2020 at 10:48:19PM +0100, Niels Möller wrote: > Since we have plenty of registers available, (including r3 which seems > unused and free to clobber), I'd suggest using >define(`SRCp32', `r3') > and an >add SRCp32, SRC, #32 > in function entry, and then

Re: Make --enable-fat the default? (was: Re: Release of Nettle-3.7?)

2020-12-26 Thread Maamoun TK
Since there are many variants of architectures, some are supported and others could be supported in the future, it becomes a little annoying for end-users to browse the configurable options and enable specific options to get maximum speed for corresponding algorithms so here --enable-fat comes in

Make --enable-fat the default? (was: Re: Release of Nettle-3.7?)

2020-12-26 Thread Niels Möller
ni...@lysator.liu.se (Niels Möller) writes: > Hi, I wonder if it would make sense to try to cut a release pretty soon > (and without any arm64 changes)? Previous release was made end of April, > and there's been quite a few improvements since then. I've pushed a couple of changes to increase

Re: Release of Nettle-3.7?

2020-12-25 Thread Niels Möller
Michael Weiser writes: > Longer story for completeness: It seems I ran afoul of gdb's way of > displaying registers in memory endianness again. I knew all this once > already.[1] I should likely do this more often than every couple of > years. ;) I'm always confused by the conventions for ordering

Re: Release of Nettle-3.7?

2020-12-25 Thread Michael Weiser
Hello Niels, On Mon, Dec 21, 2020 at 09:16:25PM +0100, Niels Möller wrote: > What's the layout before the transpose, immediately after load? I'd > guess you get X1: 1 0 3 2? TL;DR: Yes, it is. I abandoned this approach for now though, since I found some options to eliminate the word
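
A minimal sketch of one way the "1 0 3 2" layout can arise. It assumes (this is not taken from the patch) that the four 32-bit words are fetched as two 64-bit doublewords on a big-endian system and that lane 0 is the low half of each doubleword. The function name is hypothetical and the code only models the byte order in Python; it is not Neon code.

```python
import struct

def lanes_after_be_doubleword_load(words):
    """Model 32-bit words loaded as big-endian 64-bit doublewords.

    Returns the 32-bit lanes in lane order (lane 0 first), assuming
    lane 0 is the least significant half of each doubleword.
    """
    if len(words) % 2:
        raise ValueError("need an even number of 32-bit words")
    # Memory image: each word stored in big-endian byte order.
    memory = struct.pack(f">{len(words)}I", *words)
    lanes = []
    for offset in range(0, len(memory), 8):
        (dword,) = struct.unpack_from(">Q", memory, offset)
        lanes.append(dword & 0xFFFFFFFF)  # lane 0: low 32 bits
        lanes.append(dword >> 32)         # lane 1: high 32 bits
    return lanes

print(lanes_after_be_doubleword_load([0, 1, 2, 3]))  # [1, 0, 3, 2]
```

Under this model, swapping the two lanes within each doubleword restores memory order.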

Re: Release of Nettle-3.7?

2020-12-21 Thread Niels Möller
Michael Weiser writes: > See the attached patch for my current approach to fixing it, which is > explicit transposing, adding and then transposing again to be as > transposed as the other operands. I haven't yet read the code, but I have some comments based on your description only. > I

Re: Release of Nettle-3.7?

2020-12-21 Thread Michael Weiser
Hello Niels, On Sat, Dec 19, 2020 at 09:51:45AM +0100, Niels Möller wrote: > > Porting over the basic > > IF_[LB]E mechanism from chacha-core-internal was easy and fixed up the > > first of the three interleaved blocks right away. For the other two I am > > still in the process of wrapping my

Re: Release of Nettle-3.7?

2020-12-19 Thread Niels Möller
Jeffrey Walton writes: > Also see > https://www.gnu.org/software/libtool/manual/html_node/Updating-version-info.html. It's not entirely clear to me how libtool versions map to the soname, but from looking at GMP, I guess the number embedded in the soname is current - age. So for gmp-6.1.2, the
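
The "current - age" guess can be sketched as follows. This assumes libtool's Linux naming convention (soname major = current - age, file name suffix = major.age.revision); other platforms map the triple differently. The function names and the version numbers are hypothetical and are not the values of any real GMP or Nettle release.

```python
def linux_soname(name, current, revision, age):
    """Return (soname, file name) for a libtool current:revision:age triple,
    assuming the Linux convention."""
    if age > current:
        raise ValueError("age must not exceed current")
    major = current - age
    soname = f"lib{name}.so.{major}"
    filename = f"{soname}.{age}.{revision}"
    return soname, filename

def after_adding_interfaces(current, revision, age):
    """Libtool's update rule when interfaces are only added:
    current + 1, revision reset to 0, age + 1."""
    return current + 1, 0, age + 1

# Hypothetical library at 8:1:2
print(linux_soname("example", 8, 1, 2))

# Adding new functions leaves current - age, and so the soname, unchanged.
c, r, a = after_adding_interfaces(8, 1, 2)
print(linux_soname("example", c, r, a))
```

Because current and age both increase by one when functions are only added, the soname stays the same, which is consistent with not bumping it for a purely additive change.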

Re: Release of Nettle-3.7?

2020-12-19 Thread Andreas Metzler
On 2020-12-19 Niels Möller wrote: > Amos Jeffries writes: > > I would have thought this needs a soname bump. Otherwise software built > > to use bcrypt might try to link to the old version with same soname. > My understanding is that one usually doesn't bump the soname when adding > new

Re: Release of Nettle-3.7?

2020-12-19 Thread Jeffrey Walton
On Sat, Dec 19, 2020 at 4:44 AM Niels Möller wrote: > > Amos Jeffries writes: > > > I would have thought this needs a soname bump. Otherwise software built > > to use bcrypt might try to link to the old version with same soname. > > My understanding is that one usually doesn't bump the soname

Re: Release of Nettle-3.7?

2020-12-19 Thread Niels Möller
Amos Jeffries writes: > I would have thought this needs a soname bump. Otherwise software built > to use bcrypt might try to link to the old version with same soname. My understanding is that one usually doesn't bump the soname when adding new functions. I was trying to look at how it has been

Re: Release of Nettle-3.7?

2020-12-19 Thread Amos Jeffries
On 19/12/20 5:29 am, Niels Möller wrote: Andreas Metzler writes: it would not count as transition https://release.debian.org/bullseye/freeze_policy.html#transition ... * Support for bcrypt, contributed by Stephen R. van den Berg. I would have thought this needs a soname bump.

Re: Release of Nettle-3.7?

2020-12-19 Thread Niels Möller
Michael Weiser writes: > Porting over the basic > IF_[LB]E mechanism from chacha-core-internal was easy and fixed up the > first of the three interleaved blocks right away. For the other two I am > still in the process of wrapping my head around how the interleaving > works and how it would need

Re: Release of Nettle-3.7?

2020-12-19 Thread Maamoun TK
Hi Michael, On Fri, Dec 18, 2020 at 8:00 PM Michael Weiser wrote: > qemu-user works nicely for aarch64_be. I used it to semi-natively > compile a whole aarch64 userland. I could dust off pine64 board that is > running that userland now for real-world testing if you like. > Thank you, I will

Re: Release of Nettle-3.7?

2020-12-18 Thread Michael Weiser
Hi Niels and Maamoun, On Fri, Dec 18, 2020 at 07:18:24PM +0200, Maamoun TK wrote: > > One problem with the current state is that big-endian arm is most likely > > broken. I don't want to delay the release for that though, since I'm not > > able to fix it. If anyone is able to test and fix, soon

Re: Release of Nettle-3.7?

2020-12-18 Thread Maamoun TK
> > I think the powerpc64 code is in good shape now, and ready for release. Are > you aware of anything that needs fixing? > No, all I can think of are a couple of issues that make the powerpc64 code more stable (you can check their merge requests in the repo). > One problem with the

Re: Release of Nettle-3.7?

2020-12-18 Thread Niels Möller
. Regards, /Niels NEWS for the Nettle 3.7 release This release adds one new feature, the bcrypt password hashing function, and lots of optimizations. The release adds PowerPC64 assembly for a few algorithms, resulting in great speedups. Benchmarked on a Power9 machine, spee

Re: Release of Nettle-3.7?

2020-12-15 Thread Andreas Metzler
On 2020-12-15 Niels Möller wrote: > Hi, I wonder if it would make sense to try to cut a release pretty soon > (and without any arm64 changes)? Previous release was made end of April, > and there's been quite a few improvements since then. > I wonder if it is possible to make a release in time for

Release of Nettle-3.7?

2020-12-15 Thread Niels Möller
Hi, I wonder if it would make sense to try to cut a release pretty soon (and without any arm64 changes)? Previous release was made end of April, and there's been quite a few improvements since then. I wonder if it is possible to make a release in time for the upcoming debian release?