> ... I re-read the code for the conditional subtraction at the
> end of ecp_nistz256_mul_mont (__ecp_nistz256_mul_montq, actually) and
> I couldn't convince myself that the result was always fully reduced.
>
> My concern is that what you say and what Vlad said is contradictory.
> You both understand the code way better than me, so I feel like I'm
> not so useful in resolving the contradiction. But, I will try anyway:
>
>	sbb	$poly3, $acc1		# .Lpoly[3]
>	sbb	\$0, $acc2
>
>	cmovc	$t0, $acc4
>	cmovc	$t1, $acc5
>
> My understand after talking with Vlad that the "sbb \$0, $acc2" makes
> this equivalent to (r >= 2**256) ? (r - q) : r. If the "sbb \$0,
> $acc2" line were removed then it would be equivalent to (r >= q) ? (r
> - q) : r. My understanding is that the difference in semantics is
> exactly the difference between partially reduced results and fully
> reduced results.
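
To make the two readings concrete, here is a small Python model of a
5-limb conditional subtraction next to a "check the top limb only"
variant. It is only a sketch, not the OpenSSL code: the names
final_reduce, top_limb_reduce and sub_with_borrow are made up for
illustration, and it assumes the pre-reduction value r is below 2*P, so
that the 5th limb is at most a single bit.

```python
# Illustrative model only; function names are hypothetical, not OpenSSL's.
MASK64 = (1 << 64) - 1
P = 2**256 - 2**224 + 2**192 + 2**96 - 1   # NIST P-256 field prime


def to_limbs(x, n):
    """Split x into n little-endian 64-bit limbs."""
    return [(x >> (64 * i)) & MASK64 for i in range(n)]


def from_limbs(limbs):
    """Join little-endian 64-bit limbs back into an integer."""
    return sum(limb << (64 * i) for i, limb in enumerate(limbs))


def sub_with_borrow(a, b, borrow):
    """One sub/sbb step: returns (a - b - borrow) mod 2^64 and the new borrow."""
    d = a - b - borrow
    return d & MASK64, 1 if d < 0 else 0


def final_reduce(r):
    """5-limb subtraction of P, then restore the saved limbs on borrow.

    Assumes 0 <= r < 2*P. The modulus gets a zero 5th limb, which
    corresponds to the 'sbb $0' step on the top limb.
    """
    acc = to_limbs(r, 5)
    mod = to_limbs(P, 4) + [0]
    saved = acc[:4]                 # copies kept for the conditional move
    diff, borrow = [], 0
    for a, m in zip(acc, mod):
        d, borrow = sub_with_borrow(a, m, borrow)
        diff.append(d)
    return from_limbs(saved if borrow else diff[:4])


def top_limb_reduce(r):
    """Variant that only compares against 2^256 by testing the 5th limb."""
    acc = to_limbs(r, 5)
    if acc[4]:
        return (r - P) & (2**256 - 1)
    return from_limbs(acc[:4])
```

With these definitions final_reduce(P) is 0, while top_limb_reduce(P)
hands P back unchanged, which is the partially reduced case.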

Let's recall that the result of the multiplication prior to the final
reduction is actually an n+1-limb value, with the +1 limb being a single
bit. That's $acc2, the 5th limb in this context. So the $0 in the last
'sbb \$0,$acc2' represents the 5th ("imaginary") limb of the modulus[!].
And since we're looking at the borrow from this 5-limb subtraction, the
outcome is actually

	if (ret >= P) ret -= P

Effectively, that is. In reality it is rather

	temp = ret; ret -= P; if (borrow, i.e. temp < P) ret = temp

For reference, if one wanted to compare the result of the multiplication
to 2^256, it would be sufficient to check for $acc2 being non-zero, but
that is not what the code does.

> Another way to see this is that there are 5 sbb instructions

I assume that "5 sbb" actually means "1 sub + 4 sbb".

> issued
> for the conditional subtraction, which means 5 limbs are involved.
> But, a full reduction mod q should only have 4 sbb instructions,
> right?

If you checked for $acc2 being non-zero, i.e. compared to 2^256, a chain
of four subtraction instructions would suffice, yes. But that's not what
we aim for.

--
openssl-dev mailing list
To unsubscribe: https://mta.openssl.org/mailman/listinfo/openssl-dev