On Fri, 28 Oct 2022 20:58:33 GMT, Volodymyr Paprotski <d...@openjdk.org> wrote:
>> No, going the WhiteBox route was not something I was thinking of. I sought
>> feedback from a couple of HotSpot-knowledgeable people about the use of WhiteBox
>> APIs and both felt that it was not the right way to go. One said that
>> WhiteBox is really for VM testing and not for these kinds of Java classes.
>
> One idea I was trying to measure was to make the intrinsic (i.e. the while
> loop remains exactly the same, just moved to a different =non-static= function):
>
>     private void processMultipleBlocks(byte[] input, int offset, int length) {
>     //, MutableIntegerModuloP A, IntegerModuloP R) {
>         while (length >= BLOCK_LENGTH) {
>             n.setValue(input, offset, BLOCK_LENGTH, (byte)0x01);
>             a.setSum(n);                    // A += (temp | 0x01)
>             a.setProduct(r);                // A = (A * R) % p
>             offset += BLOCK_LENGTH;
>             length -= BLOCK_LENGTH;
>         }
>     }
>
>
> In principle, the Java version would not get any slower (i.e. there is only
> one extra function jump), at the expense of the C++ glue getting more
> complex. In C++ I need to dig out, using IR,
> `(sun.security.util.math.intpoly.IntegerPolynomial.MutableElement)(this.a).limbs`
> and then convert 5*26-bit limbs into 3*44-bit limbs. The IR is very new to me so
> it will take some time. (I think I found some AES code that does something
> similar.)
>
> That said, I thought this idea would have been perhaps a separate PR, if
> needed at all. Digging the limbs out is one thing, but I would also need to add
> asserts and safety checks. Mostly I would be happy to just measure if it's worth it.

thread resumed below

-------------

PR: https://git.openjdk.org/jdk/pull/10582
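
For reference, the limb conversion mentioned in the quoted message (5*26-bit limbs into 3*44-bit limbs) can be sketched in plain Java. This is only an illustration of the bit arithmetic, not the C++ glue or the IR code discussed in the PR. The class name `LimbRepack` and both method names are hypothetical, and the sketch assumes every input limb is non-negative and already reduced below 2^26; if the real `limbs` array can hold unreduced or signed limbs, it would need a carry/reduce step first.

```java
import java.math.BigInteger;

// Hypothetical helper, not part of the JDK: repacks a 130-bit value stored as
// five 26-bit limbs (little-endian) into three 44-bit limbs (44 + 44 + 42 bits).
final class LimbRepack {
    private static final long MASK44 = (1L << 44) - 1;

    // Assumption: l26 has exactly 5 limbs, each in the range [0, 2^26).
    static long[] to44(long[] l26) {
        if (l26.length != 5) {
            throw new IllegalArgumentException("expected 5 limbs");
        }
        for (long limb : l26) {
            if (limb < 0 || limb >= (1L << 26)) {
                throw new IllegalArgumentException("limb not reduced to 26 bits");
            }
        }
        // Bits 0..43: all of limb 0 plus the low 18 bits of limb 1.
        long m0 = (l26[0] | (l26[1] << 26)) & MASK44;
        // Bits 44..87: high 8 bits of limb 1, all of limb 2, low 10 bits of limb 3.
        long m1 = ((l26[1] >>> 18) | (l26[2] << 8) | (l26[3] << 34)) & MASK44;
        // Bits 88..129: high 16 bits of limb 3 plus all of limb 4.
        long m2 = (l26[3] >>> 10) | (l26[4] << 16);
        return new long[] {m0, m1, m2};
    }

    // Reassembles the integer represented by little-endian limbs of the given width.
    static BigInteger toBigInteger(long[] limbs, int bitsPerLimb) {
        BigInteger value = BigInteger.ZERO;
        for (int i = 0; i < limbs.length; i++) {
            value = value.add(BigInteger.valueOf(limbs[i]).shiftLeft(i * bitsPerLimb));
        }
        return value;
    }

    public static void main(String[] args) {
        long[] limbs26 = {0x3ffffffL, 1L, 0x2abcdefL, 0L, 0x3ffffffL};
        long[] limbs44 = to44(limbs26);
        boolean same = toBigInteger(limbs26, 26).equals(toBigInteger(limbs44, 44));
        System.out.println("value preserved: " + same);
    }
}
```

The range checks stand in for the "asserts and safety" the message mentions; an intrinsic would presumably need an equivalent guarantee before reading the limbs directly.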