When doing numerical computations you should use RDF floats, except for very
special purposes. RDF is your processor's native double-precision float. The
default RealField, based on MPFR (which by default has the same precision as
RDF), is slow: MPFR is a very good library, but it implements floating point
entirely in software.
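A quick way to see that RDF and the default RealField describe the same 53-bit format, just implemented in hardware versus software: outside Sage, the Python standard library exposes the parameters of the processor's double type (which is what RDF wraps). A minimal stdlib-only check:

```python
import sys

# The processor's double type, which Sage's RDF wraps: 53 mantissa bits,
# i.e. the same default precision as Sage's RealField(53).
print(sys.float_info.mant_dig)   # 53
print(sys.float_info.max)        # about 1.8e308, the largest double
```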
When you do elementary linear algebra (factorizations, eigenvalues, ...) with
RDF floats, Sage uses the LAPACK library on top of OpenBLAS (Basic Linear
Algebra Subprograms), which is optimized for your machine. These libraries
use all your processor's capabilities (such as vectorization with the
AVX/AVX2 instructions).
It is extremely difficult to compute faster than LAPACK + OpenBLAS (it can be
a bit faster with Intel's own BLAS on Intel processors... just a bit).
To see this, just time an LU factorization:

sage: A = random_matrix(RDF, 1000, 1000)
sage: %time Q = A.LU()
The cost of an LU factorization is on the order of n^3 operations for an
n x n matrix. On my old processor (3 GHz, AVX), it takes 53.2 ms, so the
factorization runs at about 18 Gflop/s, which is impressive.
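For the record, the ~18 Gflop/s figure is simple arithmetic (a sketch using the rough n^3 operation count from above; the textbook flop count for LU is closer to (2/3)n^3):

```python
# Rough flop-rate estimate for the LU timing above: counting n^3
# operations for an n x n factorization, 53.2 ms at n = 1000 gives
# about 18.8 Gflop/s.
n = 1000
seconds = 53.2e-3
gflops = n**3 / seconds / 1e9
print(round(gflops, 1))  # 18.8
```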
You must use RealField with a large precision (say RealField(prec=500)) if
you want to solve very ill-conditioned problems (example: the Hilbert matrix
A_ij = 1/(1+i+j) with size n > 20), but then you have to "pay" with a slow
computation.
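To see what "ill-conditioned" does to double precision, one can compare an exact rational computation against the same elimination in hardware floats. This is a stdlib-only sketch (plain Gaussian elimination, unrelated to Sage's internals); with the Hilbert matrix already at n = 12, whose condition number is roughly 10^16, a 53-bit result has essentially no correct digits left:

```python
from fractions import Fraction

def det(rows):
    """Determinant by Gaussian elimination; works for Fraction or float entries."""
    M = [list(r) for r in rows]
    n = len(M)
    d = 1
    for k in range(n):
        # find a nonzero pivot in column k (row swap flips the sign)
        p = next((i for i in range(k, n) if M[i][k] != 0), None)
        if p is None:
            return 0 * d          # zero column: singular matrix
        if p != k:
            M[k], M[p] = M[p], M[k]
            d = -d
        d *= M[k][k]
        for i in range(k + 1, n):
            f = M[i][k] / M[k][k]
            for j in range(k, n):
                M[i][j] -= f * M[k][j]
    return d

n = 12
exact = det([[Fraction(1, 1 + i + j) for j in range(n)] for i in range(n)])
approx = det([[1.0 / (1 + i + j) for j in range(n)] for i in range(n)])
rel_err = abs(approx - float(exact)) / float(exact)
print(float(exact), approx, rel_err)  # the double-precision result is far off
```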
TINA (there is no alternative :-( ), sadly.
yours,
t.d.
On 09/01/2023 at 06:19, John H Palmieri wrote:
I am not at all an expert in numerical linear algebra, but (a) certainly
parts of Sage can be improved and (b) contributions are welcome. Others
can speak more knowledgeably about this particular issue.
On Sunday, January 8, 2023 at 11:45:00 AM UTC-8 [email protected] wrote:
Is there any space to improve the computation of MPFR which is used
by RealField?
On Friday, January 6, 2023 at 3:01:37 AM [UTC+8], John H Palmieri wrote:
One way to speed it up might be to work with RDF or QQ or RLF.
The documentation for the generic determinant method says, "Note
that for matrices over most rings, more sophisticated algorithms
can be used."
sage: %time ones_matrix(RDF, 600, 600).determinant()
CPU times: user 78.6 ms, sys: 7.6 ms, total: 86.2 ms
Wall time: 77.9 ms
0.0
sage: %time ones_matrix(QQ, 600, 600).determinant()
CPU times: user 280 ms, sys: 36.3 ms, total: 317 ms
Wall time: 325 ms
0
sage: %time ones_matrix(RLF, 600, 600).determinant()
CPU times: user 32.7 s, sys: 183 ms, total: 32.9 s
Wall time: 40.3 s
0
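Those 0 / 0.0 answers are no accident: the all-ones matrix has rank 1, so its determinant is exactly zero for every n >= 2, and over QQ that zero is certified exactly. A pure-Python toy stand-in for the exact rational computation (Sage's QQ matrices presumably use much faster specialized code, but the idea is the same):

```python
from fractions import Fraction

def det(rows):
    """Exact determinant over the rationals by Gaussian elimination with pivoting."""
    M = [[Fraction(x) for x in r] for r in rows]
    n = len(M)
    d = Fraction(1)
    for k in range(n):
        p = next((i for i in range(k, n) if M[i][k] != 0), None)
        if p is None:
            return Fraction(0)    # no pivot in this column: singular matrix
        if p != k:
            M[k], M[p] = M[p], M[k]
            d = -d                 # a row swap flips the sign
        d *= M[k][k]
        for i in range(k + 1, n):
            f = M[i][k] / M[k][k]
            for j in range(k, n):
                M[i][j] -= f * M[k][j]
    return d

print(det([[1] * 6 for _ in range(6)]))  # 0: the all-ones matrix has rank 1
```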
On Thursday, January 5, 2023 at 1:26:12 AM UTC-8
[email protected] wrote:
I am doing a numerical analysis which needs to calculate
many determinants of "huge" dense matrices. The matrices
are all smaller than 100*100.
I found that computing the determinant is quite slow
with Matrix_generic_dense. I followed the method from "What is
the time complexity of numpy.linalg.det?"
<https://stackoverflow.com/questions/72206172/what-is-the-time-complexity-of-numpy-linalg-det>
and made minor modifications to the script to benchmark SageMath.
1. np.arange(1, 10001, 100) => np.arange(1, 292, 10): the
computation is so much slower that I had to reduce the testing
range to make the script finish in a reasonable time.
2. np.ones((size, size)) => ones_matrix(RR, size, size)
3. timeit('np.linalg.det(A)', globals={'np': np, 'A': A}, number=1)
=> timeit('A.det(algorithm="df")', globals={'A': A}, number=1)
Here is the full script I used.
from timeit import timeit

import matplotlib.pyplot as plt
import numpy as np
from sage.rings.real_mpfr import RR
from sklearn.linear_model import LinearRegression
from sklearn.preprocessing import PolynomialFeatures

sizes = np.arange(1, 292, 10)
times = []
for size in sizes:
    A = ones_matrix(RR, size, size)
    time = timeit('A.det(algorithm="df")', globals={'A': A}, number=1)
    times.append(time)
    print(size, time)

sizes = sizes.reshape(-1, 1)
times = np.array(times).reshape(-1, 1)
quad_sizes = PolynomialFeatures(degree=2).fit_transform(sizes)
quad_times = LinearRegression().fit(quad_sizes, times).predict(quad_sizes)
cubic_sizes = PolynomialFeatures(degree=3).fit_transform(sizes)
cubic_times = LinearRegression().fit(cubic_sizes, times).predict(cubic_sizes)
quartic_sizes = PolynomialFeatures(degree=4).fit_transform(sizes)
quartic_times = LinearRegression().fit(quartic_sizes, times).predict(quartic_sizes)

plt.scatter(sizes, times, label='Data', color='k', alpha=0.5)
plt.plot(sizes, quad_times, label='Quadratic', color='r')
plt.plot(sizes, cubic_times, label='Cubic', color='g')
plt.plot(sizes, quartic_times, label='Quartic', color='b')
plt.xlabel('Matrix Dimension n')
plt.ylabel('Time (seconds)')
plt.legend()
plt.show()
This is the time complexity diagram from the linked post, with
the data from the NumPy test. NumPy only took about 20 seconds
to calculate a 10000*10000 matrix.
[image: numpy.png]
This is the test with matrix(RR) and the "df" algorithm.
[image: df.png]
I also tested with the "hessenberg" algorithm, which did even worse.
[image: df.png]
Compared to NumPy, SageMath spent 800 seconds calculating a
300*300 matrix. It seems that SageMath still runs an O(n^3)
algorithm (the curves of the cubic and quartic models
overlap), but with a large constant. Why is SageMath so slow?
Is it possible to speed it up?
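Part of that large constant comes from representation rather than from the algorithm: Matrix_generic_dense stores every entry as a separate heap-allocated Python/Sage object (here each RR entry is an MPFR number), and every add and multiply goes through generic coercion and dispatch, while RDF matrices and NumPy hand a contiguous block of C doubles straight to LAPACK. A stdlib-only illustration of the boxing overhead alone, using Python floats as a rough stand-in for the even heavier MPFR elements:

```python
import sys
from array import array

n = 1000
boxed = [float(i) for i in range(n)]   # each entry is a separate Python object
packed = array('d', boxed)             # one contiguous buffer of C doubles

# list header + per-object sizes vs. one flat 8-bytes-per-entry buffer
boxed_bytes = sys.getsizeof(boxed) + sum(sys.getsizeof(x) for x in boxed)
packed_bytes = sys.getsizeof(packed)
print(boxed_bytes, packed_bytes)
```

And this only counts memory; the pointer chasing and dynamic dispatch on every scalar operation cost even more time than space.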
--
You received this message because you are subscribed to the Google
Groups "sage-support" group.
To unsubscribe from this group and stop receiving emails from it, send
an email to [email protected]
<mailto:[email protected]>.
To view this discussion on the web visit
https://groups.google.com/d/msgid/sage-support/8b81830e-d76f-4554-b4cd-7f62a47ba32fn%40googlegroups.com <https://groups.google.com/d/msgid/sage-support/8b81830e-d76f-4554-b4cd-7f62a47ba32fn%40googlegroups.com?utm_medium=email&utm_source=footer>.