Hi!

I'm trying to use dgemm, zgemm and friends from scipy.linalg.blas to multiply matrices efficiently. As an example, I'd like to do:

    c += a.dot(b)

using whatever BLAS scipy is linked to and I want to avoid copies of large matrices. This works the way I want it:

>>> import numpy as np
>>> from scipy.linalg.blas import dgemm
>>> a = np.ones((2, 3), order='F')
>>> b = np.ones((3, 4), order='F')
>>> c = np.zeros((2, 4), order='F')
>>> dgemm(1.0, a, b, 1.0, c, 0, 0, 1)
array([[3., 3., 3., 3.],
       [3., 3., 3., 3.]])
>>> print(c)
[[3. 3. 3. 3.]
 [3. 3. 3. 3.]]

but if c is not contiguous, then c is not overwritten:

>>> c = np.zeros((7, 4), order='F')[:2, :]
>>> dgemm(1.0, a, b, 1.0, c, 0, 0, 1)
array([[3., 3., 3., 3.],
       [3., 3., 3., 3.]])
>>> print(c)
[[0. 0. 0. 0.]
 [0. 0. 0. 0.]]

Which is also what the docs say, but I think the raw BLAS function dgemm could do the update of c in-place by setting LDC=7. See here:

    http://www.netlib.org/lapack/explore-html/d7/d2b/dgemm_8f.html

Is there a way to call the raw BLAS function from Python?

I found this capsule thing, but I don't know if there is a way to call that (maybe using ctypes):

>>> from scipy.linalg import cython_blas
>>> cython_blas.__pyx_capi__['dgemm']
<capsule object "void (char *, char *, int *, int *, int *, __pyx_t_5scipy_6linalg_11cython_blas_d *, __pyx_t_5scipy_6linalg_11cython_blas_d *, int *, __pyx_t_5scipy_6linalg_11cython_blas_d *, int *, __pyx_t_5scipy_6linalg_11cython_blas_d *, __pyx_t_5scipy_6linalg_11cython_blas_d *, int *)" at 0x7f06fe1d2ba0>

Best,
Jens Jørgen
_______________________________________________
NumPy-Discussion mailing list
NumPy-Discussion@python.org
https://mail.python.org/mailman/listinfo/numpy-discussion

Reply via email to