On Thu, Sep 24, 2020 at 3:46 PM Niels Möller <[email protected]> wrote:
>
> I'm trying to learn a bit of ppc assembly. Below is an implementation of
> _chacha_core. Seems to work, when tested on gcc112.fsffrance.org (just
> put the file in the powerpc64 directory and reconfigure). This machine
> is little-endian, I haven't yet tested on big-endian.
>
> Unfortunately I don't get any accurate benchmark numbers on that
> machine, but I think speedup may be on the order of 50%...

Yeah, getting accurate benchmark results is difficult on the compile farm.

First, you need to move the machines into performance mode, but you can't because you're not an admin. (A script like https://github.com/weidai11/cryptopp/blob/master/TestScripts/governor.sh will do if you are an admin.)

Second, the ISA seems to produce random-looking benchmark results. I've never been able to identify good access patterns to produce consistent results. Part of this problem may be powersave mode. Part of it may be mistakes on my part.

Third, to develop somewhat consistent benchmark statistics, repeat the benchmark several times and discard the outliers. I discard both low and high outliers. (The low outliers may be valid, but I discard them anyway.)

Also see "GCC135/Power9 performance?", https://lists.tetaneutral.net/pipermail/cfarm-users/2020-April/000556.html. Andy Polyakov joins the conversation and provides his insights.

Jeff
_______________________________________________
nettle-bugs mailing list
[email protected]
http://lists.lysator.liu.se/mailman/listinfo/nettle-bugs
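
P.S. A minimal sketch of the "repeat the benchmark several times and discard the outliers" step, in Python. The function name `trimmed_mean`, the `trim` parameter, and the sample figures are hypothetical and only for illustration; this is not the script actually used for any numbers discussed here. It assumes you already have one timing per run collected in a list.

```python
import statistics


def trimmed_mean(samples, trim=1):
    """Average the samples after dropping the `trim` lowest and `trim` highest values."""
    if trim < 0:
        raise ValueError("trim must be non-negative")
    if len(samples) <= 2 * trim:
        raise ValueError("need more than 2 * trim samples")
    ordered = sorted(samples)
    # Discard both the low and the high outliers, as described above.
    kept = ordered[trim:len(ordered) - trim]
    return statistics.mean(kept)


# Hypothetical cycles-per-byte figures from seven runs of the same benchmark.
runs = [2.9, 3.1, 3.0, 7.4, 3.2, 1.1, 3.0]
result = trimmed_mean(runs, trim=1)
```

With `trim=1` the single fastest and single slowest run are dropped before averaging. How many runs to take and how many to discard is a judgment call that depends on how noisy the machine is.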
