Maamoun TK <[email protected]> writes:
> Thanks for the clarification, I just misunderstanded the division with the
> partial reduction in a previous reply.
>
> Ok, so you mean a polynomial division of b_0(x) by P(x) where P(x) = X^128
> + X^127 + X^126 + X^121 + 1
> b_0(x)/P(x) = (b_0(x)*(p^-1 mod P(x))) mod P(x)
> b_0(x)/P(x) = (b_0(x)*(p')) mod P(x)
> P(x) = X^64 + X^63 + X^62 + X^57
> P' = p^-1 mod P(x) = X^63 + X^62 + X^57
> so the constant 0xC2
Hmm, I'm not following you (and it seems you are using P(x) to refer to
two different polynomials). The operation is a bit similar to Montgomery
redc, and it has taken me some time to get used to the initially strange
mix of operations mod P(x) and mod x^k (or in the integer case, it's a
mix of mod m, with m odd, and mod 2^k).
b_0(x) / x^64 (mod P(x))
can be computed in two different ways, giving the same result:
(i) Compute the inverse u(x) = (x^64){-1} mod P(x), then multiply
b_0(x) u(x) (mod P(x))
We get u(x) = x^64 + x^63 + x^62 + x^57, with degree only 64.
(This is thanks to the special structure of P(x); an inverse of
some arbitrary element mod a 128-degree polynomial would be
expected to be larger).
(ii) Add a multiple of P(x) to b_0(x) to make the the lowest 64
coefficients all cancel, then divide by x^64 by simply shifting /
subtracting 64 from exponents of remaining terms. And because of
the special structure of P(x), P(x) = 1 mod x^64, the right
multiple is b_0(x) P(x). So
b_0(x) + P(x) b_0(x)
is a degree 191 polynomial, with the least 64 coefficients all
zero. When simplified, it boils down to the same b_0(x) u(x), no
additional reduction needed.
I tend to think about reductions (from either end) in terms of
cancelling bits, so to me, (ii) is a familiar way to think about it.
> let me show you part of the new implementation of _nettle_gcm_init_key in
> PPC
>
> C --- calculate [H = H << 1 modulo polynomial] ---
>
> vupkhsb EMSB,H C extend most significant
> bit to first byte
> vspltisb B1,1 C
> 0x01010101010101010101010101010101
> vspltb EMSB,EMSB,0 C first byte
> quadword-extend
> vsl H,H,B1 C H = H << 1
> vand EMSB,EMSB,POLY C EMSB &=
> 0xC2000000000000000000000000000001
> vxor ZERO,ZERO,ZERO C
> 0x00000000000000000000000000000000
> vxor H,H,EMSB C H ^= EMSB
>
> xxmrghd VSR(POLY_L),VSR(ZERO),VSR(POLY) C
> 0x0000000000000000C200000000000000
> xxmrghd VSR(POLY_H),VSR(POLY),VSR(ZERO) C
> 0xC2000000000000000000000000000000
>
> C --- calculate [H^2 = H*H] ---
>
> xxswapd VSR(Hm),VSR(H)
> vpmsumd HP,H,POLY_L
> vxor L,HP,Hm
> xxmrghd VSR(M),VSR(H),VSR(HP)
> xxmrgld VSR(L),VSR(H),VSR(L)
>
> vpmsumd R,M,H
> vpmsumd F,L,H
>
> vpmsumd T,F,POLY_H
> xxswapd VSR(F),VSR(F)
> vxor R,R,T
> vxor R,R,F
>
> R still yields the wrong value of H^2, did I miss something in the
> implementation or did it wrong?
I'm afraid I don't follow all details (and I would suggest trying to get
the most basic variant working first, doing only a single block at a
time). But one potential issue is the powers of x. If H(x) is the
original key, then I think you need to precompute values corresponding
to
x H(x) (mod P(x))
x H(x)^2 (mod P(x))
But it looks like you may be computing
(x H(x))^2 / x^128 = x^2 H(x)^2 / x^128
which is different. It's easy to get confused, but I think the
precomputation needs a standard mod P(x), i.e., adding a multiple of
P(x) cancelling the most significant coefficients, rather than the more
efficient reduction cancelling the least significant coefficients.
If you're familiar with montgomery multiplication of integers, there
it's possible to consistently use the operation a,b -> ab / 2^k
everywhere, but that requires that all inputs are transformed to
montgomery form, a --> 2^k a mod m. And in this case, it's desirable to
scale the precomputed values appropriately, so no such transformation is
needed for the inputs corresponding to message blocks.
To compute more powers of H, one could do the standard reduction only
once, essentially transforming H to montgomery form,
H' = H(x) x^128 (mod P(x)),
then further powers of H can use the reduction of least significant
coefficients,
(x H(x)) H' / x^128 = x H(x)^2
(x H(x)^2) H' / x^128 = x H(x)^3
etc.
Regards,
/Niels
--
Niels Möller. PGP-encrypted email is preferred. Keyid 368C6677.
Internet email is subject to wholesale government surveillance.
_______________________________________________
nettle-bugs mailing list
[email protected]
http://lists.lysator.liu.se/mailman/listinfo/nettle-bugs