On Thu, 22 Sep 2022 20:40:08 GMT, Xue-Lei Andrew Fan <xue...@openjdk.org> wrote:

> Hi,
>
> Please review this performance improvement for Secp256R1 implementation in
> OpenJDK. With this update, there is an about 20% performance improvement for
> Secp256R1 key generation and signature.
>
> Basically, 256 bits EC curves could use 9 integer limbs for the computation.
> The current implementation use 10 limbs instead. By reducing the number of
> limbs, the implementation could benefit from less integer computation
> (add/sub/multiply/square/inverse/mod/pow, etc), and thus improve the
> performance.
>
> Here are the benchmark numbers without the patch:
>
> Benchmark                 (messageLength)   Mode  Cnt  Score   Error   Units
> Signatures.sign                        64  thrpt   15  1.414 ± 0.022  ops/ms
> Signatures.sign                       512  thrpt   15  1.418 ± 0.004  ops/ms
> Signatures.sign                      2048  thrpt   15  1.419 ± 0.005  ops/ms
> Signatures.sign                     16384  thrpt   15  1.395 ± 0.003  ops/ms
>
> KeyGenerators.keyPairGen                   thrpt   15  1.475 ± 0.043  ops/ms
>
>
> And here are the numbers with the patch applied:
>
> Benchmark                 (messageLength)   Mode  Cnt  Score   Error   Units
> ECSignature.sign                       64  thrpt   15  1.719 ± 0.010  ops/ms
> ECSignature.sign                      512  thrpt   15  1.704 ± 0.012  ops/ms
> ECSignature.sign                     2048  thrpt   15  1.699 ± 0.018  ops/ms
> ECSignature.sign                    16384  thrpt   15  1.681 ± 0.006  ops/ms
>
> KeyGenerators.keyPairGen                   thrpt   15  1.881 ± 0.008  ops/ms
>
>
> Thanks,
> Xuelei

> > Limb values will always fit within a long, so inputs to multiplication must
> > be less than 32 bits. _All IntegerPolynomial implementations allow at most
> > one addition before multiplication_. Additions after that will result in an
> > ArithmeticException.
>
> The highlighted part of the comment is incorrect;

@djelinski I see your point now. The scalar multiplication is carefully coded so that at most 2 additions are allowed. If more are required, the code reduces the value first (see the use of setReduced() in ECOperations). The use of maxAdd may be fragile unless more checks are introduced to detect issues like integer overflow.
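
To make the headroom concern concrete, here is a rough sketch of the kind of bound involved. It is not code from the patch: the class name, the method, and the bit widths (26 and 29 bits per limb) are assumptions chosen for illustration, and the bound is simplified in that it ignores the extra terms introduced by carry and modular reduction. It only checks whether one column of a schoolbook limb multiplication could exceed a signed 64-bit long after a given number of unreduced additions.

```java
import java.math.BigInteger;

// Illustrative only: a simplified worst-case bound, not the OpenJDK implementation.
class LimbHeadroom {

    /**
     * Returns true if the sum of numLimbs limb products still fits in a
     * signed long, assuming each operand limb has magnitude below
     * (adds + 1) * 2^bitsPerLimb, i.e. "adds" additions without a reduction.
     */
    static boolean fitsInLong(int numLimbs, int bitsPerLimb, int adds) {
        // Largest limb magnitude after "adds" unreduced additions.
        BigInteger maxLimb = BigInteger.ONE.shiftLeft(bitsPerLimb)
                .subtract(BigInteger.ONE)
                .multiply(BigInteger.valueOf(adds + 1L));
        // One column of the product sums up to numLimbs limb-by-limb products.
        BigInteger column = maxLimb.multiply(maxLimb)
                .multiply(BigInteger.valueOf(numLimbs));
        // A signed long holds magnitudes of at most 63 bits.
        return column.bitLength() <= 63;
    }

    public static void main(String[] args) {
        System.out.println(fitsInLong(10, 26, 2)); // true
        System.out.println(fitsInLong(9, 29, 2));  // false
        System.out.println(fitsInLong(9, 29, 0));  // true
    }
}
```

Under these assumed widths, packing the same 256 bits into fewer, wider limbs leaves less room for additions before a multiplication, which is why a fixed maxAdd value needs to be rechecked whenever the limb layout changes.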
Alternatively, the addition implementation could be updated to take care of the overflow internally. I will see if there is a solution that is simple and effective.

Thanks!

-------------

PR: https://git.openjdk.org/jdk/pull/10398