On Thu, Sep 24, 2020 at 3:46 PM Niels Möller <[email protected]> wrote:
>
> I'm trying to learn a bit of ppc assembly. Below is an implementation of
> _chacha_core. Seems to work, when tested on gcc112.fsffrance.org (just
> put the file in the powerpc64 directory and reconfigure). This machine
> is little-endian, I haven't yet tested on big-endian.
>
> Unfortunately I don't get any accurate benchmark numbers on that
> machine, but I think speedup may be on the order of 50%...

Yeah, getting accurate benchmark results is difficult on the compile
farm. First, you need to move the machines into performance mode, but
you can't because you're not an admin. (A script like
https://github.com/weidai11/cryptopp/blob/master/TestScripts/governor.sh
will do it if you are an admin.)
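
Even without admin rights you can usually at least read which governor
a machine is using. Below is a minimal sketch, assuming a Linux box that
exposes the standard cpufreq sysfs files; it is not what the governor.sh
script above does, and the function names are made up for illustration.

```python
import glob
import os


def read_governors(cpu_root="/sys/devices/system/cpu"):
    """Return {cpu_name: governor} for every CPU exposing cpufreq.

    Assumes the usual Linux sysfs layout; machines without cpufreq
    support simply yield an empty dict.
    """
    pattern = os.path.join(cpu_root, "cpu[0-9]*", "cpufreq", "scaling_governor")
    governors = {}
    for path in sorted(glob.glob(pattern)):
        # path looks like <root>/cpuN/cpufreq/scaling_governor
        cpu = path.split(os.sep)[-3]
        with open(path) as f:
            governors[cpu] = f.read().strip()
    return governors


def all_performance(governors):
    """True only if at least one CPU was found and all use 'performance'."""
    return bool(governors) and all(g == "performance" for g in governors.values())


if __name__ == "__main__":
    govs = read_governors()
    print(govs)
    print("performance mode everywhere:", all_performance(govs))
```

If this reports powersave or ondemand, the benchmark numbers are
probably affected by frequency scaling.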

Second, the ISA seems to produce random-looking benchmark results.
I've never been able to identify good access patterns that produce
consistent results. Part of the problem may be powersave mode. Part
of it may be mistakes on my part.

Third, to develop somewhat consistent benchmark statistics, repeat the
benchmark several times and discard the outliers. I discard both the
low and the high outliers. (The low outliers may be valid, but I
discard them anyway.)
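
A rough sketch of that kind of trimming, in Python. The 20% trim
fraction and the function name are arbitrary choices for illustration,
not something I have tuned, and the sample numbers are invented.

```python
import statistics


def trimmed_summary(samples, trim=0.2):
    """Drop the lowest and highest `trim` fraction of samples, then
    return (mean, median, kept_samples) for what remains."""
    if not samples:
        raise ValueError("need at least one sample")
    ordered = sorted(samples)
    k = int(len(ordered) * trim)  # number of samples dropped from each end
    # If trimming would remove everything, keep the full list instead.
    kept = ordered[k:len(ordered) - k] if len(ordered) > 2 * k else ordered
    return statistics.mean(kept), statistics.median(kept), kept


if __name__ == "__main__":
    # Invented cycle counts from ten runs of the same benchmark.
    runs = [100, 101, 99, 102, 100, 55, 300, 98, 101, 100]
    mean, median, kept = trimmed_summary(runs)
    print(kept, mean, median)
```

Sorting first and cutting a fixed fraction from each end discards low
and high outliers symmetrically, without having to decide what counts
as an outlier for a given machine.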

Also see "GCC135/Power9 performance?",
https://lists.tetaneutral.net/pipermail/cfarm-users/2020-April/000556.html.
Andy Polyakov joins the conversation and provides his insights.

Jeff
_______________________________________________
nettle-bugs mailing list
[email protected]
http://lists.lysator.liu.se/mailman/listinfo/nettle-bugs
