I am looking forward to intrinsic support for 128-bit math using "Long2" and XMM (or even YMM, ZMM) instructions. This is the best way forward, I hope.
Personally I would like to see a long long type, or even uint128, uint256, uint512 style notation. Another option might be something like long<128>, or an annotation like @uint128 long or even @decimal128 double, but who knows.

Regards,
Peter.

On 25 September 2017 at 18:48, Andrew Haley <a...@redhat.com> wrote:

> On 25/09/17 18:21, Adam Petcher wrote:
> > I agree that an unsigned multiplyHigh would be useful for crypto
> > purposes, and we should consider adding it. Of course, I would much
> > rather have multiply operations that return both 64-bit parts of the
> > result, but that is going to be hard to do well without value types. So
> > it would be nice to have something like this in the meantime.
>
> I take your point, but it won't be excruciatingly difficult for the C2
> compiler to turn the multiply operations into a single one, if the CPU
> can do that. From what I've seen recently, though, on non-x86 it's
> common for the two halves of the result to be calculated by separate
> instructions.
>
> > If we are going to add this operation, it should probably be added
> > along with an intrinsic. I think the Java code can simply factor out
> > the else branch from the existing multiplyHigh code. This way,
> > unsignedMultiplyHigh will be at least as fast as multiplyHigh,
> > whether the intrinsic implementation is available or not.
>
> Sure. I can do that.
>
> > If possible, the implementation of this operation should not branch on
> > either operand. This would make it more widely useful for constant-time
> > crypto implementations. Though this property would need to go into the
> > spec in order for constant-time crypto code to use this method, and I
> > don't know how reasonable it is to put something like this in the spec.
>
> OK. I can do it so that there are no branches in the Java. The Java
> code for signed multiplyHigh has some data-dependent branches in an
> attempt to speed it up, though. I don't know how effective they are,
> and I could have a look at taking them out.
>
> > Side note: at the moment, I am using signed arithmetic in prototypes for
> > Poly1305, X25519, and EdDSA, partially due to lack of support for
> > unsigned operations like this one. I don't think having
> > unsignedMultiplyHigh would, on its own, convince me to use an unsigned
> > representation, but the forces are different for each
> > algorithm/implementation.
>
> Sure. I don't think it really matters from a performance point of
> view which you use, given intrinsics for both.
>
> --
> Andrew Haley
> Java Platform Lead Engineer
> Red Hat UK Ltd. <https://www.redhat.com>
> EAC8 43EB D3EF DB98 CC77 2FAD A5CD 6035 332F A671
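
For readers following the thread, here is a rough sketch of what a branch-free unsigned high multiply could look like in plain Java. It is not the JDK implementation and not the patch discussed above; the class name UnsignedMulSketch and the method names are made up for illustration. The first method uses a schoolbook multiply on 32-bit halves; the second derives the unsigned result from the existing signed Math.multiplyHigh (Java 9+) with a mask-based correction. Having no branches in the Java source does not by itself guarantee constant-time machine code after JIT compilation, so treat this only as a starting point.

```java
import java.math.BigInteger;

// Hypothetical sketch: names are illustrative, this is not JDK code.
final class UnsignedMulSketch {

    // High 64 bits of the unsigned 128-bit product x * y.
    // Schoolbook multiply on 32-bit halves; no branch on either operand.
    static long unsignedMultiplyHigh(long x, long y) {
        long x0 = x & 0xFFFFFFFFL;   // low 32 bits of x
        long x1 = x >>> 32;          // high 32 bits of x
        long y0 = y & 0xFFFFFFFFL;   // low 32 bits of y
        long y1 = y >>> 32;          // high 32 bits of y

        long z0 = x0 * y0;                  // low partial product
        long t  = x1 * y0 + (z0 >>> 32);    // cannot overflow 64 bits
        long z1 = (t & 0xFFFFFFFFL) + x0 * y1; // cannot overflow 64 bits
        long z2 = t >>> 32;

        return x1 * y1 + z2 + (z1 >>> 32);
    }

    // Alternative: correct the signed result. If x is negative as a signed
    // value, its unsigned value is x + 2^64, which adds y to the high word
    // (and symmetrically for y). The arithmetic shift builds an all-ones or
    // all-zeros mask, so no branch is needed.
    static long unsignedFromSigned(long x, long y) {
        return Math.multiplyHigh(x, y) + ((x >> 63) & y) + ((y >> 63) & x);
    }

    // Slow reference using BigInteger, used only for checking.
    static long referenceHigh(long x, long y) {
        BigInteger mask = BigInteger.ONE.shiftLeft(64).subtract(BigInteger.ONE);
        BigInteger ux = BigInteger.valueOf(x).and(mask); // reinterpret as unsigned
        BigInteger uy = BigInteger.valueOf(y).and(mask);
        return ux.multiply(uy).shiftRight(64).longValue();
    }

    public static void main(String[] args) {
        long[] samples = {0L, 1L, -1L, Long.MIN_VALUE, Long.MAX_VALUE,
                          0x0123456789ABCDEFL, 1L << 32};
        for (long a : samples) {
            for (long b : samples) {
                long expected = referenceHigh(a, b);
                if (unsignedMultiplyHigh(a, b) != expected
                        || unsignedFromSigned(a, b) != expected) {
                    throw new AssertionError("mismatch for " + a + " * " + b);
                }
            }
        }
        System.out.println("all samples agree");
    }
}
```

The low 64 bits of the product are just x * y in ordinary long arithmetic, so the pair (unsignedMultiplyHigh(x, y), x * y) gives both halves of the 128-bit result until something like value types makes returning them together practical.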