#16116: Multiplication of dense cyclotomic matrices should be faster
-------------------------------------------------+-------------------------
Reporter: jipilab | Owner:
Type: enhancement | Status: new
Priority: major | Milestone: sage-6.4
Component: number fields | Resolution:
Keywords: cyclotomic field, matrix, | Merged in:
multiplication, benchmark, days57 | Reviewers:
Authors: | Work issues:
Report Upstream: N/A | Commit:
Branch: | Stopgaps:
Dependencies: |
-------------------------------------------------+-------------------------
Old description:
> The multiplication of matrices with a (universal) cyclotomic fields as
> base ring could be optimized as the following profiling shows:
>
> {{{
> sage: x,y=var('x,y')
> sage: PR=PolynomialRing(QQ,[x,y])
> sage: I=Ideal(x^2 - 1/2*x - 1/4, y^3 - 1/2*y^2 - 1/2*y + 1/8)
> sage: Q=PR.quotient(I)
> sage: a,b=Q.gens()
> sage: elmt=matrix(Q,[[-1, 1, 2*x],[-4*x*y - 1, 4*x*y + 4*y^2, 4*x*y +
> 2*x],[-2*x, 2*x + 2*y, 2*x]])
> sage: %timeit elmt^2
> 1000 loops, best of 3: 1.17 ms per loop
>
> sage: UCF.<E>=UniversalCyclotomicField()
> sage: ae=(E(10)+~E(10))/2 #same value as a
> sage: be=(E(14)+~E(14))/2 #same value as b
> sage: m=matrix(UCF,[[-1, 1, 2*ae],[-4*ae*be - 1, 4*ae*be + 4*be^2,
> 4*ae*be + 2*ae],[-2*ae, 2*ae + 2*be, 2*ae]])
> sage: %timeit m^2
> 100 loops, best of 3: 8.13 ms per loop
>
> sage: CF.<F>=CyclotomicField(2*5*7)
> sage: af=(F^7+~F^7)/2 #same value as a
> sage: bf=(F^5+~F^5)/2 #same value as b
> sage: m2=matrix(CF,[[-1, 1, 2*af],[-4*af*bf - 1, 4*af*bf + 4*bf^2,
> 4*af*bf + 2*af],[-2*af, 2*af + 2*bf, 2*af]])
> sage: %timeit m2^2
> 100 loops, best of 3: 4.99 ms per loop
> }}}
>
> The three matrices elmt, m and m2 are the same encoded into 3 different
> base rings. It would be natural to think that the cyclotomic field be the
> optimal field to do computations, but it does not seem to be the case in
> practice.
>
> Here is a univariate example.
>
> {{{
> sage: PR=PolynomialRing(QQ,[x])
> sage: I=Ideal(x^2 - 1/2*x - 1/4)
> sage: Q=PR.quotient(I)
> sage: elmt_uni=matrix(Q,[[-2*x, 1, 6*x + 2],[-2*x, 2*x, 4*x +
> 1],[0,0,1]])
> sage: %timeit elmt_uni*elmt_uni
> 1000 loops, best of 3: 1.46 ms per loop
>
> sage: CF.<F>=CyclotomicField(2*5)
> sage: f5=(F+~F)/2
> sage: m=matrix(CF,[[-2*f5, 1, 6*f5 + 2],[-2*f5, 2*f5, 4*f5 +
> 1],[0,0,1]])
> sage: type(m)
> <type 'sage.matrix.matrix_cyclo_dense.Matrix_cyclo_dense'>
> sage: m.parent()
> Full MatrixSpace of 3 by 3 dense matrices over Cyclotomic Field of
> order 10 and degree 4
> sage: %timeit m*m
> 100 loops, best of 3: 1.98 ms per loop
> }}}
>
> Then, I disactivated the verification on cyclotomic fields on line 962 of
> the file /src/sage/matrix/matrix_space.py to get a matrix_generic_dense
> instead of matrix_cyclo_dense.
>
> {{{
> sage: CF.<F>=CyclotomicField(2*5)
> sage: f5=(F+~F)/2
> sage: m=matrix(CF,[[-2*f5, 1, 6*f5 + 2],[-2*f5, 2*f5, 4*f5 +
> 1],[0,0,1]])
> sage: m.parent()
> Full MatrixSpace of 3 by 3 dense matrices over Cyclotomic Field of
> order 10 and degree 4
> sage: type(m)
> <type 'sage.matrix.matrix_generic_dense.Matrix_generic_dense'>
> sage: %timeit m*m
> 1000 loops, best of 3: 251 µs per loop
> }}}
>
> The gain is significant. Is there a known use cases where the specialized
> implementation is faster than the generic one? If yes, should we make
> some threshold test to choose between the two implementations?
New description:
The multiplication of matrices with a (universal) cyclotomic fields as
base ring could be optimized as the following profiling shows:
{{{
sage: def make_matrix1(R,a,b):
....: return matrix(R, 3, [[-1, 1, 2*a],
....: [-4*a*b - 1, 4*a*b + 4*b^2, 4*a*b + 2*a],
....: [-2*a, 2*a + 2*b, 2*a]])
sage: PR.<x,y> = PolynomialRing(QQ)
sage: I = Ideal(x^2 - 1/2*x - 1/4, y^3 - 1/2*y^2 - 1/2*y + 1/8)
sage: Q = PR.quotient(I)
sage: elmt = make_matrix1(Q, x, y)
sage: %timeit elmt^2
1000 loops, best of 3: 1.17 ms per loop
sage: UCF.<E> = UniversalCyclotomicField()
sage: ae = (E(10)+~E(10))/2 #same value as a
sage: be = (E(14)+~E(14))/2 #same value as b
sage: m = make_matrix1(UCF, ae, be)
sage: %timeit m^2
100 loops, best of 3: 8.13 ms per loop
sage: CF.<F> = CyclotomicField(2*5*7)
sage: af = (F^7+~F^7)/2 #same value as a
sage: bf = (F^5+~F^5)/2 #same value as b
sage: m2 = make_matrix1(CF, af, bf)
sage: %timeit m2^2
100 loops, best of 3: 4.99 ms per loop
}}}
The three matrices elmt, m and m2 are the same encoded into 3 different
base rings. It would be natural to think that the cyclotomic field be the
optimal field to do computations, but it does not seem to be the case in
practice.
Here is a univariate example.
{{{
sage: def make_matrix2(R, a):
....: return matrix(R, 3, [[-2*a, 1, 6*a+2],
....: [-2*a, 2*a, 4*a+1],
....: [0, 0, 1]])
sage: PR.<x> = PolynomialRing(QQ)
sage: I = Ideal(x^2 - 1/2*x - 1/4)
sage: Q = PR.quotient(I)
sage: elmt_uni = make_matrix2(Q, x)
sage: %timeit elmt_uni*elmt_uni
1000 loops, best of 3: 1.46 ms per loop
sage: CF.<F> = CyclotomicField(2*5)
sage: f5 = (F+~F)/2
sage: m = make_matrix2(CF, f5)
sage: type(m)
<type 'sage.matrix.matrix_cyclo_dense.Matrix_cyclo_dense'>
sage: m.parent()
Full MatrixSpace of 3 by 3 dense matrices over
Cyclotomic Field of order 10 and degree 4
sage: %timeit m*m
100 loops, best of 3: 1.98 ms per loop
}}}
Then, I disactivated the verification on cyclotomic fields on line 962 of
the file /src/sage/matrix/matrix_space.py to get a matrix_generic_dense
instead of matrix_cyclo_dense.
{{{
sage: CF.<F> = CyclotomicField(2*5)
sage: f5 = (F+~F)/2
sage: m = make_matrix2(CF, f5)
sage: m.parent()
Full MatrixSpace of 3 by 3 dense matrices over
Cyclotomic Field of order 10 and degree 4
sage: type(m)
<type 'sage.matrix.matrix_generic_dense.Matrix_generic_dense'>
sage: %timeit m*m
1000 loops, best of 3: 251 µs per loop
}}}
The gain is significant. Is there a known use cases where the specialized
implementation is faster than the generic one? If yes, should we make some
threshold test to choose between the two implementations?
--
Comment (by vdelecroix):
Hello,
I reformatted your example such that they fit in less lines (it can easily
switched back to your original version if you do not like it).
I had a quick look at the code for dense cyclotomic matrices. The
implementation is quite old and uses a lot of reduction mod p (even for
multiplication). The code calls a lot of Python code like creating a
finite field, creating a matrix space, etc which are relatively slow
compared to a small matrix multiplication. Did you try multiplying larger
matrices (i.e. 10x10 or 15x15)? On the other hand, I am pretty sure that
some cleaning could be of great speed up. By cleaning I mean:
- declare `cdef` variables wherever possible
- let as minimum as possible `import` inside the methods
- ...
You can also do some profiling on the code (using "%prun" and "%crun"),
see #17689 which is not yet in the current development release.
Vincent
--
Ticket URL: <http://trac.sagemath.org/ticket/16116#comment:7>
Sage <http://www.sagemath.org>
Sage: Creating a Viable Open Source Alternative to Magma, Maple, Mathematica,
and MATLAB
--
You received this message because you are subscribed to the Google Groups
"sage-trac" group.
To unsubscribe from this group and stop receiving emails from it, send an email
to [email protected].
To post to this group, send email to [email protected].
Visit this group at http://groups.google.com/group/sage-trac.
For more options, visit https://groups.google.com/d/optout.