>>> My understand after talking with Vlad that the "sbb \$0, $acc2" makes >>> this equivalent to (r >= 2**256) ? (r - q) : r. If the "sbb \$0, >>> $acc2" line were removed then it would be equivalent to (r >= q) ? (r >>> - q) : r. My understanding is that the difference in semantics is >>> exactly the difference between partially reduced results and fully >>> reduced results. >> >> Let's recall that result of multiplication prior final reduction is >> actually n+1-limb value, with +1 limb being single bit, that's $acc2, >> 5th limb in the context. So that the $0 in last 'sbb \$0,$acc2' >> represents 5th ("imaginary") limb of modulus[!]. And since we're looking >> at borrow from this 5-limb subtraction, outcome is actually >> >> if (ret > P) ret -= P' > > OK, let's agree on that.
Actually correct expression is 'if (ret >= P) ret -= P'. > I think you are assuming that ret is in the range [0, 2P), so that if > you subtract P, the result would be in the range [0, P). That is the > case in normal Montgomery multiplication, where the inputs are in the > range [0, P). But, my understanding is that if the inputs are in the > range [P, 2**256), e.g. they are the result of ecp_nistz256_add, then > that assumption doesn't necessarily hold. Looks like you are right. I mean it indeed appears to be possible for multiplication (and squaring) subroutine to return partially reduced result. But *only* if input was partially reduced. I.e. if input is fully reduced, the output *shall* be too. And if input is not fully reduced, then output *can* be. And condition for "can" appears somewhat tricky. If not fully reduced input was force-reduced and final condition in 'if (ret >= P) ret -= P' was true, then output would be not fully reduced. > I understand Almost Montgomery Multiplication (AMM) as described in > [1], where one accepts inputs in the range [0, 2**n] and returns a > result in the range [0, 2**n), not [0, P). And that one checks only the top bit. This means that partially reduced output is possible even for fully reduced input. > I also read the original paper on the ecp_nistz256 implementation [2], > which has "0 ≤ a, b < p" as a precondition for the Montgomery > multiplication. > > To put my concern another way: Earlier in the thread, you said the > function can take inputs that aren't fully reduced--i.e. are in in the > range [0, 2**n)--and returns outputs that are fully reduced--i.e. in > the range [0, P) I was wrong, sorry about confusion. -- openssl-dev mailing list To unsubscribe: https://mta.openssl.org/mailman/listinfo/openssl-dev