On Wed, 5 Oct 2022 17:37:25 GMT, Xue-Lei Andrew Fan <xue...@openjdk.org> wrote:
>> Hi, >> >> May I have this patch reviewed? >> >> This is one of a few steps to improve the EC performance. The multiplicative >> inverse implementation could be improved for better performance. >> >> For secp256r1 prime p, the current multiplicative inverse impl needs 256 >> square and 128 multiplication. With the path, the operation needs 256 >> square and 37 multiplication. >> >> For secp256r1 order n, the current multiplicative inverse impl needs 256 >> square and 169 multiplication. With the patch, the operation needs 256 >> square and 94 multiplication. >> >> Here is the benchmark numbers before the patch applied: >> >> Benchmark (messageLength) Mode Cnt Score Error Units >> Signatures.sign 64 thrpt 15 1412.644 ± 5.529 ops/s >> Signatures.sign 512 thrpt 15 1407.711 ± 14.118 ops/s >> Signatures.sign 2048 thrpt 15 1415.674 ± 6.965 ops/s >> Signatures.sign 16384 thrpt 15 1395.582 ± 12.689 ops/s >> >> >> And the following are the benchmarking after the patch applied. >> >> Benchmark (messageLength) Mode Cnt Score Error Units >> Signatures.sign 64 thrpt 15 1459.527 ± 10.160 ops/s >> Signatures.sign 512 thrpt 15 1450.707 ± 15.554 ops/s >> Signatures.sign 2048 thrpt 15 1453.490 ± 5.362 ops/s >> Signatures.sign 16384 thrpt 15 1437.364 ± 8.291 ops/s >> >> >> For comparing, here is the benchmarking numbers by using >> BigInteger.modInverse(); >> >> Benchmark (messageLength) Mode Cnt Score Error Units >> Signatures.sign 64 thrpt 15 1395.628 ± 180.649 ops/s >> Signatures.sign 512 thrpt 15 1510.590 ± 9.826 ops/s >> Signatures.sign 2048 thrpt 15 1514.282 ± 3.382 ops/s >> Signatures.sign 16384 thrpt 15 1497.325 ± 6.854 ops/s >> >> and numbers for using BigInteger.modPow(): >> >> Benchmark (messageLength) Mode Cnt Score Error Units >> Signatures.sign 64 thrpt 15 1486.764 ± 17.908 ops/s >> Signatures.sign 512 thrpt 15 1494.801 ± 14.072 ops/s >> Signatures.sign 2048 thrpt 15 1500.170 ± 6.998 ops/s >> Signatures.sign 16384 thrpt 15 1434.192 ± 49.269 ops/s >> >> >> >> It looks like the performance improvement is no significant enough for now. >> But it may be 2+ times more in numbers when the scalar multiplication >> implementation is improved in a follow-up enhancement in another pull >> request. >> >> Enhancement for other curves will be considered in separate pull requests. >> >> Thanks, >> Xuelei > > Xue-Lei Andrew Fan has updated the pull request incrementally with one > additional commit since the last revision: > > more performance improvement BigInteger exponentiation time also depends on also depends on the base; quick benchmark: `BigInteger.ONE.modPow(mod.subtract(BigInteger.TWO), mod)` vs `BigInteger.TWO.modPow(mod.subtract(BigInteger.TWO), mod)`: Benchmark (messageLength) Mode Cnt Score Error Units Signatures.pow1 64 thrpt 15 67352286,115 ± 1281517,907 ops/s Signatures.pow2 64 thrpt 15 62431,716 ± 1056,398 ops/s for IntegerModuloP the result should not depend on base, and if it does, we should fix that. ------------- PR: https://git.openjdk.org/jdk/pull/10544