If you're going to provide routines for structure determination, it might be worth looking at algorithms that can identify more general or less obvious structure as well. SymPy's matrices module still needs a lot of work, but it is improving in ways that will become noticeable over the next few releases, and one of the important optimisations being used is Tarjan's algorithm for finding the strongly connected components of a graph. This is a generalisation of checking for triangular or diagonal matrices: with this approach you can identify a permutation of the rows and columns of a square matrix that brings it into block triangular or block diagonal form, which can reduce many O(n**3) algorithms substantially. The big-O cost of Tarjan's algorithm itself is essentially the same as that of checking whether a matrix is triangular or diagonal.
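For anyone who wants to try this with released libraries rather than sympy master, the strongly connected components can also be computed with scipy.sparse.csgraph. A rough sketch, using the same example matrix as the sympy session below (note that the order in which the components come back is implementation-defined, so the blocks may still need a topological sort of the condensation to guarantee a triangular rather than merely block form):

import numpy as np
from scipy.sparse.csgraph import connected_components

M = np.array([[1, 0, 2, 0],
              [9, 3, 1, 2],
              [3, 0, 4, 0],
              [5, 8, 6, 7]])

# Treat the nonzero pattern as a directed graph: edge i -> j iff M[i, j] != 0.
n_comp, labels = connected_components(M != 0, directed=True, connection='strong')

# Group row/column indices by component label to get the diagonal blocks.
blocks = [np.flatnonzero(labels == k) for k in range(n_comp)]
perm = np.concatenate(blocks)
print(blocks)                 # e.g. [array([0, 2]), array([1, 3])]
print(M[np.ix_(perm, perm)])  # symmetric permutation into block form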
For example, the matrix determinant is invariant under simultaneous permutations of the rows and columns. If you can permute a matrix into block triangular form, then the determinant is just the product of the determinants of the diagonal blocks. If the base-case algorithm takes n**3 operations, then reducing it to two operations of size n/2 costs 2*(n/2)**3 = n**3/4, a speed-up of ~4x. In the extreme, this discovers that a matrix is triangular and reduces the whole operation to O(n) (plus the cost of Tarjan's algorithm). However, the graph-based approach also benefits wider classes of matrices, e.g. you get almost all of the same benefit for a matrix that is nearly diagonal but has a few off-diagonal elements.

Using the sympy master branch (as .strongly_connected_components() is not released yet):

In [19]: from sympy import Matrix

In [20]: M = Matrix([[1, 0, 2, 0], [9, 3, 1, 2], [3, 0, 4, 0], [5, 8, 6, 7]])

In [21]: M
Out[21]:
⎡1 0 2 0⎤
⎢9 3 1 2⎥
⎢3 0 4 0⎥
⎣5 8 6 7⎦

In [22]: M.strongly_connected_components()  # Tarjan's algorithm
Out[22]: [[0, 2], [1, 3]]

In [23]: M[[0, 2, 1, 3], [0, 2, 1, 3]]  # outer indexing applies the permutation
Out[23]:
⎡1 2 0 0⎤
⎢3 4 0 0⎥
⎢9 1 3 2⎥
⎣5 6 8 7⎦

In [24]: M.det()
Out[24]: -10

In [25]: M[[0, 2], [0, 2]].det() * M[[1, 3], [1, 3]].det()
Out[25]: -10

--
Oscar

On Fri, 2 Jul 2021 at 12:20, Ilhan Polat <ilhanpo...@gmail.com> wrote:
>
> Ah right. So, two things: the original reason for this question is that I
> can't decide in https://github.com/scipy/scipy/pull/12824 whether others
> would also benefit from quick structure determination.
>
> I can keep it as a private function, or we can put it in some misc or lib
> folder so everyone can use it. Say there is a special method for triangular
> matrices, but you can't guarantee the structure, so you quickly check for
> it. At worst O(n**2) complexity for diagonal arrays, and almost O(2n) for
> full arrays, makes it quite appealing.
>
> But then again, maybe NumPy is a better place, since it will probably be
> faster to have this in pure C with the right headers and without the extra
> Cython overhead.
>
> Funny you mention the container idea. This is precisely what I'm doing in
> the PR mentioned above (I'll push when I'm done). I stole the idea from Tim
> Davis himself in a Julia discussion: keep the factorization as an attribute
> to be used later if need be. So yes, it makes a lot of sense, sparse or not.
>
> On Wed, 30 Jun 2021, 19:14 Evgeni Burovski, <evgeny.burovs...@gmail.com> wrote:
>>
>> Hi Ilhan,
>>
>> Overall I think something like this would be great. However, I wonder
>> if you considered having a specialized container with a structure tag
>> instead of trying to discover the structure. If it's a container, it
>> can neatly wrap the various LAPACK storage schemes and dispatch to the
>> appropriate LAPACK functionality. Possibly even sparse storage
>> schemes. And it seems a bit more robust than trying to discover the
>> structure (e.g. what about off-band elements of ~1e-16?).
>>
>> The next question is of course whether this should live in scipy/numpy
>> .linalg or in a separate repo, at least for some time (maybe in the
>> scipy organization?), so that it can iterate faster, among other
>> things. (I'd be interested in contributing, FWIW.)
>>
>> Cheers,
>>
>> Evgeni
>>
>>
>> On Wed, Jun 30, 2021 at 1:22 AM Ilhan Polat <ilhanpo...@gmail.com> wrote:
>> >
>> > Dear all,
>> >
>> > I'm writing some helper Cython functions for scipy.linalg which are
>> > kind of performant and usable, and there is still quite some wiggle
>> > room for more.
>> >
>> > In many linalg routines there is a lot of performance benefit if the
>> > structure can be discovered cheaply and reliably at the outset. For
>> > example, if the matrix is symmetric then eig can delegate to eigh, or
>> > if it is triangular then triangular solvers can be used in
>> > linalg.solve and lstsq, and so forth.
>> >
>> > Here is the Cythonized version (to paste into a Jupyter notebook) that
>> > discovers the lower/upper bandwidth of a square array A. It competes
>> > well with A != 0 (note that the latter returns an array, hence more
>> > cost is involved). There is a higher-level supervisor function that
>> > checks C-contiguousness and otherwise dispatches to different
>> > specializations of it.
>> >
>> > Initial cell:
>> >
>> > %load_ext Cython
>> > %load_ext line_profiler
>> > import cython
>> > import line_profiler
>> >
>> > Then another cell:
>> >
>> > %%cython
>> > # cython: language_level=3
>> > # cython: linetrace=True
>> > # cython: binding=True
>> > # distutils: define_macros=CYTHON_TRACE=1
>> > # distutils: define_macros=CYTHON_TRACE_NOGIL=1
>> >
>> > cimport cython
>> > cimport numpy as cnp
>> > import numpy as np
>> > import line_profiler
>> >
>> > ctypedef fused np_numeric_t:
>> >     cnp.int8_t
>> >     cnp.int16_t
>> >     cnp.int32_t
>> >     cnp.int64_t
>> >     cnp.uint8_t
>> >     cnp.uint16_t
>> >     cnp.uint32_t
>> >     cnp.uint64_t
>> >     cnp.float32_t
>> >     cnp.float64_t
>> >     cnp.complex64_t
>> >     cnp.complex128_t
>> >     cnp.int_t
>> >     cnp.long_t
>> >     cnp.longlong_t
>> >     cnp.uint_t
>> >     cnp.ulong_t
>> >     cnp.ulonglong_t
>> >     cnp.intp_t
>> >     cnp.uintp_t
>> >     cnp.float_t
>> >     cnp.double_t
>> >     cnp.longdouble_t
>> >
>> >
>> > @cython.linetrace(True)
>> > @cython.initializedcheck(False)
>> > @cython.boundscheck(False)
>> > @cython.wraparound(False)
>> > cpdef inline (int, int) band_check_internal(np_numeric_t[:, ::1] A):
>> >     cdef Py_ssize_t n = A.shape[0], lower_band = 0, upper_band = 0, r, c
>> >     cdef np_numeric_t zero = 0
>> >
>> >     for r in range(n):
>> >         # Only scan the part of the row outside the band found so far:
>> >         for c in range(r - lower_band):
>> >             if A[r, c] != zero:
>> >                 lower_band = r - c
>> >                 break
>> >
>> >         for c in range(n - 1, r + upper_band, -1):
>> >             if A[r, c] != zero:
>> >                 upper_band = c - r
>> >                 break
>> >
>> >     return lower_band, upper_band
>> >
>> > Final cell for the use case:
>> >
>> > # Make an arbitrary banded array (unit diagonal, first sub- and
>> > # superdiagonals, and a k'th subdiagonal)
>> > n = 50  # array size
>> > k = 3   # k'th subdiagonal
>> > R = np.zeros([n, n], dtype=np.float32)
>> > R[[x for x in range(n)], [x for x in range(n)]] = 1
>> > R[[x for x in range(n-1)], [x for x in range(1, n)]] = 1
>> > R[[x for x in range(1, n)], [x for x in range(n-1)]] = 1
>> > R[[x for x in range(k, n)], [x for x in range(n-k)]] = 2
>> >
>> > Some very haphazardly put together metrics:
>> >
>> > %timeit band_check_internal(R)
>> > 2.59 µs ± 84.7 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)
>> >
>> > %timeit np.linalg.solve(R, zzz)  # zzz is a right-hand side defined elsewhere
>> > 824 µs ± 6.24 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)
>> >
>> > %timeit R != 0.
>> > 1.65 µs ± 43.1 ns per loop (mean ± std. dev. of 7 runs, 1000000 loops each)
>> >
>> > So the worst-case cost is negligible in general (note that the given
>> > code is slower because it uses the fused type; a standalone tempita
>> > version is faster).
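>> >
>> > For comparison, a rough pure-NumPy reference (an untested sketch;
>> > unlike the Cython loops above it cannot exit early, and np.nonzero
>> > allocates index arrays, so it always pays the full O(n**2) scan):
>> >
>> > def band_check_numpy(A):
>> >     # For each nonzero entry (r, c), r - c is the subdiagonal it lies
>> >     # on and c - r the superdiagonal; the bandwidths are the maxima.
>> >     r, c = np.nonzero(A)
>> >     if r.size == 0:
>> >         return 0, 0  # all-zero matrix
>> >     return max(int((r - c).max()), 0), max(int((c - r).max()), 0)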
>> >
>> > Two questions:
>> >
>> > 1) This is missing np.half/float16 functionality, since any arithmetic
>> > with float16 might not be reliable, including the nonzero check. Is it
>> > safe to view it as np.uint16 and use that specialization? I'm not sure
>> > about the sign bit, hence the question. I can also leave this out,
>> > since almost all of the linalg suite rejects this datatype due to the
>> > well-known lack of support.
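>> >
>> > Concretely, the idea would be something like this (a rough, untested
>> > sketch; the wrinkle is that -0.0 has the bit pattern 0x8000, so the
>> > sign bit has to be masked off for the zero test):
>> >
>> > def nonzero_f16_as_u16(A):
>> >     # View the float16 buffer as uint16 and mask out the sign bit: an
>> >     # IEEE 754 half is zero iff all of the remaining 15 bits are zero.
>> >     return (A.view(np.uint16) & 0x7FFF) != 0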
>> >
>> > 2) Should this be in NumPy or SciPy linalg? It is quite relevant for
>> > SciPy, but then again this stuff is purely about array structure. If
>> > the opinion is for NumPy, then I would need a volunteer, because the
>> > NumPy codebase flies way above my head.
>> >
>> > All feedback welcome.
>> >
>> > Best,
>> > ilhan

_______________________________________________
NumPy-Discussion mailing list
NumPy-Discussion@python.org
https://mail.python.org/mailman/listinfo/numpy-discussion