Thanks for the clarification, I just misunderstanded the division with the
partial reduction in a previous reply.

Ok, so you mean a polynomial division of  b_0(x) by P(x) where P(x) = X^128
+  X^127 +  X^126 +  X^121 + 1
b_0(x)/P(x) = (b_0(x)*(p^-1 mod P(x))) mod P(x)
b_0(x)/P(x) = (b_0(x)*(p')) mod P(x)
P(x) = X^64 +  X^63 +  X^62 +  X^57
P' = p^-1 mod P(x) = X^63 +  X^62 +  X^57
so the constant 0xC2

let me show you part of the new implementation of _nettle_gcm_init_key in
PPC

    C --- calculate [H = H << 1 modulo polynomial] ---

    vupkhsb        EMSB,H                        C extend most significant
bit to first byte
    vspltisb       B1,1                          C
0x01010101010101010101010101010101
    vspltb         EMSB,EMSB,0                   C first byte
quadword-extend
    vsl            H,H,B1                        C H = H << 1
    vand           EMSB,EMSB,POLY                C EMSB &=
0xC2000000000000000000000000000001
    vxor           ZERO,ZERO,ZERO                C
0x00000000000000000000000000000000
    vxor           H,H,EMSB                      C H ^= EMSB

    xxmrghd        VSR(POLY_L),VSR(ZERO),VSR(POLY) C
0x0000000000000000C200000000000000
    xxmrghd        VSR(POLY_H),VSR(POLY),VSR(ZERO) C
0xC2000000000000000000000000000000

    C --- calculate [H^2 = H*H] ---

    xxswapd        VSR(Hm),VSR(H)
    vpmsumd        HP,H,POLY_L
    vxor           L,HP,Hm
    xxmrghd        VSR(M),VSR(H),VSR(HP)
    xxmrgld        VSR(L),VSR(H),VSR(L)

    vpmsumd        R,M,H
    vpmsumd        F,L,H

    vpmsumd        T,F,POLY_H
    xxswapd        VSR(F),VSR(F)
    vxor           R,R,T
    vxor           R,R,F

R still yields the wrong value of H^2, did I miss something in the
implementation or did it wrong?

On Sun, Oct 11, 2020 at 8:14 PM Niels Möller <[email protected]> wrote:

> Maamoun TK <[email protected]> writes:
>
> > Hi Niels,
> >
> > I tried to apply your method but can't get it work,
>
> Hmm, do you think I've missed something in the math, or are there other
> difficulties?
>
> > while applying it one
> > question came to my mind.
> >
> >
> >> First, compute b_0(x) / x^64 (mod P(x)), which expands it from 64 bits
> to
> >> 128,
> >>
> >>   c_1(x) x^64 + c_0(x) = b_0(x) / x^64 (mod P(x))
> >>
> >
> > Here you are trying to get partially reduced product by computing b_0(x)
> /
> > x^64 (mod P(x)) but since the degree of input is 127,  we can use the
> > polynomial defining the finite field with x^64 elements, in this case
> P(x)
> > = X^64+X^4+X^3+X+1 and P' = P^-1 (mod X^64) = X^63+X^61+X^60+1 which is
> the
> > same constant 0xB0 and the function now: c_1(x) x^64 + c_0(x) = ((b_0 mod
> > X^64) * p') mod X^64
>
> For correctness, I think it is important that the computation b_0(x) /
> x^64 is done modulo the gcm polynomial (originally, x^128 + x^7 + x^2 +
> x + 1, but after bit reflection, P(x) = x^128 + x^127 + x^126 + x^125 +
> 1).
>
> I don't see how one can do part of the computation in GF(2^64), or how
> your degree-64 polynomial relates the the original degree-128
> polynomial. If there's some useful embedding of GF(2^64) as a subfield
> of GF(2^128), please explain?
>
> That said, division by x^64 is fairly cheap, since P(x) = 1 (mod x^64).
> I think we get
>
>   b_0(x) / x^64 (mod P(x)) = b_0(x) (1 + P(x)) / x^64 (mod P(x)
>
> where we can simplify (P(x) + 1) / x^64 to x^64 + x^63 + x^62 + x^58, or
>
>   b_0(x) / x^64 (mod P(x)) = b_0(x) (x^64 to x^64 + x^63 + x^62 + x^58)
>
> So no reduction needed, just split the product in high and low part to
> get c_1 and c_0.
>
> Regards,
> /Niels
>
> --
> Niels Möller. PGP-encrypted email is preferred. Keyid 368C6677.
> Internet email is subject to wholesale government surveillance.
>
_______________________________________________
nettle-bugs mailing list
[email protected]
http://lists.lysator.liu.se/mailman/listinfo/nettle-bugs

Reply via email to