I verified with the authors of the nistz256 code that all the arithmetic is done mod P256 (P), but the results might be unreduced if they are less than 2**256. Thus, all the arithmetic must handle the cases where its inputs are in the range [0, 2**256 - 1], not just [0, P). And, it must deal with the cases where some values have multiple representations. For example, the value 0 can be represented as either 0 or P + 0, 1 can be represented either as 1 or as P + 1, etc.
The nistz256 code contains code that looks like this: if (is_equal(U1, U2) && !in1infty && !in2infty) { if (is_equal(S1, S2)) { ... } } This code only works for values that have one representation; it doesn't work for all values. For example, consider U1 == 1, U2 = P + 1. Then, is_equal(U1, U2) returns false, but in fact U1 == U2 (mod P). Thus, we should run the code for the exceptional case (doubling or setting to the point at infinity), but actually we just keep going with a normal addition. You can see that the code in ecp_nistp256 and ecp_nistp224 such as `is_zero` handles this correctly by considering both representations of 0: 0 and P + 0. To fix this, a constant-time conditional subtraction of P should be done on U1 and U2, and a conditional subtraction should also be done on S1 and S2. I discussed this with some other people familiar with the code and they agree that this seems to be a good way to do it. Note again that it is possible for the value 0 to be represented as either 0 or as P + 0. This brings into question whether is_zero is correct, because it doesn't consider P to be zero. Here there was some disagreement about whether it is necessary to check for P. I personally think that it is safer to check for both 0 and P like the nistp256 code does, because I it is hard to make a proof that the values cannot be P. Not also again that the value 1 can be represented as 1 or as P + 1. Thus, the statement Z_is_one = is_one(...); appears to also be wrong, or at least not obviously correct. Finally, as I mentioned on the mailing list, it seems the function is_zero is missing a comparison of the last limb in the 32-bit case. Unfortunately, I don't have any test vectors for triggering any bugs that would serve as a basis of regression tests; these were found from reading the code. Cheers, Brian -- https://briansmith.org/ -- Ticket here: http://rt.openssl.org/Ticket/Display.html?id=4621 Please log in as guest with password guest if prompted -- openssl-dev mailing list To unsubscribe: https://mta.openssl.org/mailman/listinfo/openssl-dev