#4302: [with patch, needs review] improve modular composition in GF(2)[x]
------------------------------+---------------------------------------------
 Reporter:  zimmerma          |        Owner:  somebody
     Type:  task              |       Status:  new     
 Priority:  major             |    Milestone:  sage-3.2
Component:  basic arithmetic  |   Resolution:          
 Keywords:                    |  
------------------------------+---------------------------------------------
Old description:

> Here is a toy implementation of polynomial modular composition over
> GF(2). It implements
> Brent-Kung's Algorithm 2.1 (Fast Algorithms for Manipulation Formal Power
> Series, JACM 1978).
> {{{
> # compute f(g) mod h
> def ModularComposition(f,g,h,k=None):
>     print "enter ModularComposition", f.degree(), g.degree(), h.degree()
>     t = cputime()
>     R = h.parent()
>     n = h.degree()
>     if k is None:
>         k = ceil(Integer(n+1).sqrt_approx())
>     l = ceil((f.degree() + 1) / k)
>     # first compute g^j mod h, 2 <= j < k
>     G = Matrix(GF(2),n,k)
>     gpow = R(1)
>     for j in range(0, k):
>         # gpow = g^j mod h
>         row = gpow.coeffs()
>         if len(row)<n:
>             row.extend([R(0) for _ in range(n - len(row))])
>         G.set_column(j, row)
>         gpow = (gpow * g) % h # we'll need g^k below
>     print "Creating G took", cputime(t)
>     t = cputime()
>     # split f in chunks of degree < k
>     F = Matrix(GF(2),k,l)
>     row = f.coeffs()
>     if len(row)<k*l:
>             row.extend([R(0) for _ in range(k*l - len(row))])
>     for j in range(0, l):
>         F.set_column(j, row[j*k:j*k+k])
>     print "Creating F took", cputime(t)
>     t = cputime()
>     H = G * F # this is the most time-computing step, but M4RI is fast!
>     print "Computing H took", cputime(t)
>     t = cputime()
>     # H is a n x l matrix
>     # now H[i,j] = sum(G[i,m]*F[m,j], m=0..k-1)
>     #            = sum(g^m[i] * f[j*k+m], m=0..k-1)
>     # where g^m[i] is the coefficient of degree i in g^m
>     # and f[j*k+m] is the coefficient of degree j*k+m in f
>     # thus f[j*k+m]*g^m[i] should be multiplied by g^(j*k)
>     # gpow = (g^k) % h
>     x = h.variables()[0]
>     res = R(0)
>     j = l - 1
>     H = H.transpose()
>     while j >= 0:
>         res = (res * gpow) % h
>         # res = res + R([H[j,i] for i in range(0,n)])
>         res = res + R(H.submatrix(j,0,1,n).list())
>         j = j - 1
>     print "Forming result took", cputime(t)
>     sys.stdout.flush()
>     return res
>
> # computes x^(2^r) mod f
> def ModCompPower (f, r):
>     l = r.digits()
>     l.reverse()
>     n = len(l)
>     g = f.variables()[0]
>     for i in range(n):
>         g = ModularComposition(g,g,f)
>         if l[i] == 1:
>            g = (g * g) % f
>     return g
> }}}
> The following benchmark gives on a 2.4Ghz Core 2:
> {{{
> sage: r=1279
> sage: time a = ModCompPower(R(x^r+x+1), r)
> enter ModularComposition 1 1 1279
>    Creating G took 3.948399
>    Creating F took 0.00299900000005
>    Computing H took 0.000999999999976
>    Forming result took 0.0169980000001
> enter ModularComposition 2 2 1279
>    Creating G took 3.896408
>    Creating F took 0.004999
>    Computing H took 0.0
>    Forming result took 0.018997
> enter ModularComposition 4 4 1279
>    Creating G took 3.802422
>    Creating F took 0.00300000000004
>    Computing H took 0.000999999999976
>    Forming result took 0.018997
> enter ModularComposition 16 16 1279
>    Creating G took 3.208512
>    Creating F took 0.00299900000005
>    Computing H took 0.0
>    Forming result took 0.0169979999999
> enter ModularComposition 512 512 1279
>    Creating G took 2.413633
>    Creating F took 0.004999
>    Computing H took 0.00100000000009
>    Forming result took 0.307953
> enter ModularComposition 1202 1202 1279
>    Creating G took 0.895864
>    Creating F took 0.00999899999999
>    Computing H took 0.0
>    Forming result took 0.787879
> enter ModularComposition 1272 1272 1279
>    Creating G took 0.528921
>    Creating F took 0.009997
>    Computing H took 0.000999999999976
>    Forming result took 0.633905
> enter ModularComposition 1275 1275 1279
>    Creating G took 0.474927
>    Creating F took 0.00899800000002
>    Computing H took 0.000999999999976
>    Forming result took 0.630905
> enter ModularComposition 1278 1278 1279
>    Creating G took 0.533918
>    Creating F took 0.00799899999993
>    Computing H took 0.0
>    Forming result took 0.631904
> enter ModularComposition 1277 1277 1279
>    Creating G took 0.482927
>    Creating F took 0.00599899999997
>    Computing H took 0.000999999999976
>    Forming result took 0.609907
> enter ModularComposition 1277 1277 1279
>    Creating G took 0.563913
>    Creating F took 0.00899900000002
>    Computing H took 0.0
>    Forming result took 0.602908
> CPU times: user 24.37 s, sys: 0.79 s, total: 25.16 s
> Wall time: 27.67 s
> }}}
> Several remarks: (a) the time spent in M4RI (Computing H) is negligible;
>                  (b) the time spent in "Creating G" and "Forming result"
> is large,
>                      and is even larger when the inputs have small
> degree!
>                      Something strange happens here.
>
> As a comparison, on the same machine, Magma V2.14-8 takes only 0.02s with
> the following code:
> {{{
> R<x> := PolynomialRing(GF(2));
>
> /* computes x^(2^r) mod f */
> ModCompPower := function(f, r)
>    l := [];
>    t := r;
>    while t ne 0 do
>       l := Append(l, t mod 2);
>       t := t div 2;
>    end while;
>    g := x;
>    for i := #l to 1 by -1 do
>       g := ModularComposition(g,g,f);
>       if l[i] eq 1 then
>          g := Modexp(g,2,f);
>       end if;
>    end for;
>    return g;
> end function;
>
> > r:=1279; time a:=ModCompPower(x^r+x+1, r);
> Time: 0.020
> }}}
> The challenge is to do better than Magma within Sage.

New description:

 Here is a toy implementation of polynomial modular composition over GF(2).
 It implements
 Brent-Kung's Algorithm 2.1 (Fast Algorithms for Manipulation Formal Power
 Series, JACM 1978).
 {{{
 # compute f(g) mod h
 def ModularComposition(f,g,h,k=None):
     print "enter ModularComposition", f.degree(), g.degree(), h.degree()
     t = cputime()
     R = h.parent()
     n = h.degree()
     if k is None:
         k = ceil(Integer(n+1).sqrt_approx())
     l = ceil((f.degree() + 1) / k)
     # first compute g^j mod h, 2 <= j < k
     G = Matrix(GF(2),n,k)
     gpow = R(1)
     for j in range(0, k):
         # gpow = g^j mod h
         row = gpow.coeffs()
         if len(row)<n:
             row.extend([R(0) for _ in range(n - len(row))])
         G.set_column(j, row)
         gpow = (gpow * g) % h # we'll need g^k below
     print "Creating G took", cputime(t)
     t = cputime()
     # split f in chunks of degree < k
     F = Matrix(GF(2),k,l)
     row = f.coeffs()
     if len(row)<k*l:
             row.extend([R(0) for _ in range(k*l - len(row))])
     for j in range(0, l):
         F.set_column(j, row[j*k:j*k+k])
     print "Creating F took", cputime(t)
     t = cputime()
     H = G * F # this is the most time-computing step, but M4RI is fast!
     print "Computing H took", cputime(t)
     t = cputime()
     # H is a n x l matrix
     # now H[i,j] = sum(G[i,m]*F[m,j], m=0..k-1)
     #            = sum(g^m[i] * f[j*k+m], m=0..k-1)
     # where g^m[i] is the coefficient of degree i in g^m
     # and f[j*k+m] is the coefficient of degree j*k+m in f
     # thus f[j*k+m]*g^m[i] should be multiplied by g^(j*k)
     # gpow = (g^k) % h
     x = h.variables()[0]
     res = R(0)
     j = l - 1
     H = H.transpose()
     while j >= 0:
         res = (res * gpow) % h
         # res = res + R([H[j,i] for i in range(0,n)])
         res = res + R(H.submatrix(j,0,1,n).list())
         j = j - 1
     print "Forming result took", cputime(t)
     sys.stdout.flush()
     return res

 # computes x^(2^r) mod f
 def ModCompPower (f, r):
     l = r.digits()
     l.reverse()
     n = len(l)
     g = f.variables()[0]
     for i in range(n):
         g = ModularComposition(g,g,f)
         if l[i] == 1:
            g = (g * g) % f
     return g
 }}}
 The following benchmark gives on a 2.4Ghz Core 2:
 {{{
 sage: r=1279
 sage: time a = ModCompPower(R(x^r+x+1), r)
 enter ModularComposition 1 1 1279
    Creating G took 3.948399
    Creating F took 0.00299900000005
    Computing H took 0.000999999999976
    Forming result took 0.0169980000001
 enter ModularComposition 2 2 1279
    Creating G took 3.896408
    Creating F took 0.004999
    Computing H took 0.0
    Forming result took 0.018997
 enter ModularComposition 4 4 1279
    Creating G took 3.802422
    Creating F took 0.00300000000004
    Computing H took 0.000999999999976
    Forming result took 0.018997
 enter ModularComposition 16 16 1279
    Creating G took 3.208512
    Creating F took 0.00299900000005
    Computing H took 0.0
    Forming result took 0.0169979999999
 enter ModularComposition 512 512 1279
    Creating G took 2.413633
    Creating F took 0.004999
    Computing H took 0.00100000000009
    Forming result took 0.307953
 enter ModularComposition 1202 1202 1279
    Creating G took 0.895864
    Creating F took 0.00999899999999
    Computing H took 0.0
    Forming result took 0.787879
 enter ModularComposition 1272 1272 1279
    Creating G took 0.528921
    Creating F took 0.009997
    Computing H took 0.000999999999976
    Forming result took 0.633905
 enter ModularComposition 1275 1275 1279
    Creating G took 0.474927
    Creating F took 0.00899800000002
    Computing H took 0.000999999999976
    Forming result took 0.630905
 enter ModularComposition 1278 1278 1279
    Creating G took 0.533918
    Creating F took 0.00799899999993
    Computing H took 0.0
    Forming result took 0.631904
 enter ModularComposition 1277 1277 1279
    Creating G took 0.482927
    Creating F took 0.00599899999997
    Computing H took 0.000999999999976
    Forming result took 0.609907
 enter ModularComposition 1277 1277 1279
    Creating G took 0.563913
    Creating F took 0.00899900000002
    Computing H took 0.0
    Forming result took 0.602908
 CPU times: user 24.37 s, sys: 0.79 s, total: 25.16 s
 Wall time: 27.67 s
 }}}
 Several remarks:
  * the time spent in M4RI (Computing H) is negligible;
  * the time spent in "Creating G" and "Forming result" is large,
  and is even larger when the inputs have small degree! Something strange
 happens here.

 As a comparison, on the same machine, Magma V2.14-8 takes only 0.02s with
 the following code:
 {{{
 R<x> := PolynomialRing(GF(2));

 /* computes x^(2^r) mod f */
 ModCompPower := function(f, r)
    l := [];
    t := r;
    while t ne 0 do
       l := Append(l, t mod 2);
       t := t div 2;
    end while;
    g := x;
    for i := #l to 1 by -1 do
       g := ModularComposition(g,g,f);
       if l[i] eq 1 then
          g := Modexp(g,2,f);
       end if;
    end for;
    return g;
 end function;

 > r:=1279; time a:=ModCompPower(x^r+x+1, r);
 Time: 0.020
 }}}
 The challenge is to do better than Magma within Sage.

-- 
Ticket URL: <http://trac.sagemath.org/sage_trac/ticket/4302#comment:13>
Sage <http://sagemath.org/>
Sage - Open Source Mathematical Software: Building the Car Instead of 
Reinventing the Wheel
--~--~---------~--~----~------------~-------~--~----~
You received this message because you are subscribed to the Google Groups 
"sage-trac" group.
To post to this group, send email to [email protected]
To unsubscribe from this group, send email to [EMAIL PROTECTED]
For more options, visit this group at 
http://groups.google.com/group/sage-trac?hl=en
-~----------~----~----~----~------~----~------~--~---

Reply via email to