[Numpy-discussion] NumPy Development Meeting Wednesday - Triage Focus

2021-02-23 Thread Sebastian Berg
Hi all, Our bi-weekly triage-focused NumPy development meeting is Wednesday, Feb 24th at 11 am Pacific Time (19:00 UTC). Everyone is invited to join in and edit the work-in-progress meeting topics and notes: https://hackmd.io/68i_JvOYQfy9ERiHgXMPvg I encourage everyone to notify us of issues or

Re: [Numpy-discussion] C-coded dot 1000x faster than numpy?

2021-02-23 Thread Charles R Harris
On Tue, Feb 23, 2021 at 5:47 PM Charles R Harris wrote: > > > On Tue, Feb 23, 2021 at 11:10 AM Neal Becker wrote: > >> I have code that performs dot product of a 2D matrix of size (on the >> order of) [1000,16] with a vector of size [1000]. The matrix is >> float64 and the vector is

Re: [Numpy-discussion] C-coded dot 1000x faster than numpy?

2021-02-23 Thread Charles R Harris
On Tue, Feb 23, 2021 at 11:10 AM Neal Becker wrote: > I have code that performs dot product of a 2D matrix of size (on the > order of) [1000,16] with a vector of size [1000]. The matrix is > float64 and the vector is complex128. I was using numpy.dot but it > turned out to be a bottleneck. > >

Re: [Numpy-discussion] C-coded dot 1000x faster than numpy?

2021-02-23 Thread Carl Kleffner
The stackoverflow link above contains a simple testcase:

>>> from scipy.linalg import get_blas_funcs
>>> gemm = get_blas_funcs("gemm", [X, Y])
>>> np.all(gemm(1, X, Y) == np.dot(X, Y))
True

It would be of interest to benchmark gemm against np.dot. Maybe np.dot doesn't use BLAS at all for
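A sketch of the benchmark suggested here; `X` and `Y` are undefined in the quoted snippet, so the shapes below (taken loosely from the thread's problem size) are assumptions:

```python
import timeit

import numpy as np
from scipy.linalg import get_blas_funcs

rng = np.random.default_rng(0)
X = rng.standard_normal((1000, 16))
Y = rng.standard_normal((16, 1000))

# get_blas_funcs picks the BLAS routine matching the input dtypes (dgemm here).
gemm = get_blas_funcs("gemm", (X, Y))

t_gemm = timeit.timeit(lambda: gemm(1.0, X, Y), number=200)
t_dot = timeit.timeit(lambda: np.dot(X, Y), number=200)
print(f"gemm: {t_gemm:.4f}s  np.dot: {t_dot:.4f}s")
```

If both calls end up in the same BLAS, the timings should be close; a large gap would point at dispatch overhead rather than the kernel itself.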

Re: [Numpy-discussion] C-coded dot 1000x faster than numpy?

2021-02-23 Thread David Menéndez Hurtado
On Tue, 23 Feb 2021, 7:41 pm Roman Yurchak, wrote: > For the first benchmark apparently A.dot(B) with A real and B complex is > a known issue performance-wise https://github.com/numpy/numpy/issues/10468 I split B into a vector of size (N, 2) for the real and imaginary part, and that makes
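The split described here can be sketched as follows (the truncated message doesn't show the code, so this is an assumed reconstruction): since A is real, viewing the complex128 vector as an (N, 2) float64 array turns the mixed-type product into a single all-real matrix product:

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.standard_normal((1000, 16))                         # real matrix
b = rng.standard_normal(16) + 1j * rng.standard_normal(16)  # complex vector

# Reinterpret the complex128 vector as an (N, 2) float64 array:
# column 0 holds the real parts, column 1 the imaginary parts.
b_f64 = b.view(np.float64).reshape(-1, 2)

# One all-real matrix product, then reinterpret the result as complex again.
out = (A @ b_f64).view(np.complex128).ravel()

# Matches the direct mixed-type product while avoiding the slow mixed path.
assert np.allclose(out, A @ b)
```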

Re: [Numpy-discussion] C-coded dot 1000x faster than numpy?

2021-02-23 Thread Neal Becker
I'm using Fedora 33's standard numpy. ldd says:

/usr/lib64/python3.9/site-packages/numpy/core/_multiarray_umath.cpython-39-x86_64-linux-gnu.so:
	linux-vdso.so.1 (0x7ffdd1487000)
	libflexiblas.so.3 => /lib64/libflexiblas.so.3 (0x7f0512787000)

So whatever flexiblas is doing controls BLAS. On
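Complementary to the ldd check, NumPy can report its build-time BLAS configuration from Python:

```python
import numpy as np

# Prints the BLAS/LAPACK libraries NumPy was built against; on Fedora this
# typically shows flexiblas, which picks the actual backend at run time.
np.show_config()
```

If I understand FlexiBLAS correctly, the active backend can then be switched per process via the FLEXIBLAS environment variable (e.g. `FLEXIBLAS=OPENBLAS python script.py`); that selection happens outside NumPy, so treat the variable name as an assumption to verify against the FlexiBLAS docs.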

Re: [Numpy-discussion] C-coded dot 1000x faster than numpy?

2021-02-23 Thread Carl Kleffner
https://stackoverflow.com/questions/19839539/how-to-get-faster-code-than-numpy-dot-for-matrix-multiplication maybe C_CONTIGUOUS vs F_CONTIGUOUS? Carl Am Di., 23. Feb. 2021 um 19:52 Uhr schrieb Neal Becker : > One suspect is that it seems the numpy version was multi-threading. > This isn't
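The layout question can be checked directly on small arrays; whether C vs. Fortran order explains the reported gap is speculative:

```python
import numpy as np

rng = np.random.default_rng(0)
A_c = rng.standard_normal((1000, 16))  # C-contiguous (row-major), NumPy's default
A_f = np.asfortranarray(A_c)           # F-contiguous (column-major) copy
v = rng.standard_normal(16)

print(A_c.flags['C_CONTIGUOUS'], A_f.flags['F_CONTIGUOUS'])  # True True

# Reference BLAS interfaces are column-major; NumPy handles either layout,
# but the layout can change which gemm code path (and any copies) you get.
assert np.allclose(A_c @ v, A_f @ v)
```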

Re: [Numpy-discussion] C-coded dot 1000x faster than numpy?

2021-02-23 Thread Neal Becker
One suspect is that the numpy version seems to be multi-threading. This isn't useful here, because I'm running parallel Monte-Carlo simulations using all cores. Perhaps this is perversely slowing things down? I don't know how to account for a 1000x slowdown, though. On Tue, Feb 23, 2021 at 1:40
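If over-subscription is the suspect, forcing the BLAS thread pools down to one thread is a cheap test. The variable names below cover common backends and are assumptions to adapt to your setup; they must be set before NumPy (and its BLAS) is first imported:

```python
import os

# Pin common BLAS/OpenMP backends to a single thread; flexiblas generally
# forwards threading control to whichever backend it loads.
os.environ["OPENBLAS_NUM_THREADS"] = "1"
os.environ["MKL_NUM_THREADS"] = "1"
os.environ["OMP_NUM_THREADS"] = "1"

import numpy as np  # imported only after the environment is set

rng = np.random.default_rng(0)
A = rng.standard_normal((1000, 16))
v = rng.standard_normal(1000)
print((v @ A).shape)
```

The third-party `threadpoolctl` package offers the same control at run time, which is handier than environment variables when NumPy is already imported.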

Re: [Numpy-discussion] C-coded dot 1000x faster than numpy?

2021-02-23 Thread Roman Yurchak
For the first benchmark, apparently A.dot(B) with A real and B complex is a known issue performance-wise: https://github.com/numpy/numpy/issues/10468 In general, it might be worth trying different BLAS backends. For instance, if you install numpy from conda-forge you should be able to switch

Re: [Numpy-discussion] C-coded dot 1000x faster than numpy?

2021-02-23 Thread Andrea Gavana
Hi, On Tue, 23 Feb 2021 at 19.11, Neal Becker wrote: > I have code that performs dot product of a 2D matrix of size (on the > order of) [1000,16] with a vector of size [1000]. The matrix is > float64 and the vector is complex128. I was using numpy.dot but it > turned out to be a bottleneck. >

[Numpy-discussion] C-coded dot 1000x faster than numpy?

2021-02-23 Thread Neal Becker
I have code that performs dot product of a 2D matrix of size (on the order of) [1000,16] with a vector of size [1000]. The matrix is float64 and the vector is complex128. I was using numpy.dot but it turned out to be a bottleneck. So I coded dot2x1 in c++ (using xtensor-python just for the
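The shapes described can be sketched as below; the truncated message doesn't show the exact call, so the orientation `v @ A` and the random test data are assumptions for illustration only:

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.standard_normal((1000, 16))  # float64 matrix, shape (1000, 16)
v = rng.standard_normal(1000) + 1j * rng.standard_normal(1000)  # complex128 vector

# Mixed real/complex product as described in the message (assumed orientation).
out = v @ A
print(out.shape, out.dtype)
```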

Re: [Numpy-discussion] NEP: array API standard adoption (NEP 47)

2021-02-23 Thread Ralf Gommers
On Mon, Feb 22, 2021 at 7:49 PM Sebastian Berg wrote: > On Sun, 2021-02-21 at 17:30 +0100, Ralf Gommers wrote: > > Hi all, > > > > Here is a NEP, written together with Stephan Hoyer and Aaron Meurer, > > for > > discussion on adoption of the array API standard ( > >

Re: [Numpy-discussion] ENH: Proposal to add KML_BLAS support

2021-02-23 Thread Ralf Gommers
On Tue, Feb 23, 2021 at 1:42 PM ChunLin Fang wrote: > Thanks for asking; here are brief answers to your questions: > 1. The download link of KML_BLAS: > The Chinese page is > https://www.huaweicloud.com/kunpeng/software/KML_BLAS.html > The English page is >

Re: [Numpy-discussion] ENH: Proposal to add KML_BLAS support

2021-02-23 Thread ChunLin Fang
Thanks for asking; here are brief answers to your questions: 1. The download link of KML_BLAS: The Chinese page is https://www.huaweicloud.com/kunpeng/software/KML_BLAS.html The English page is https://kunpeng.huawei.com/en/#/developer/devkit/library, you can find a "Math Library"

Re: [Numpy-discussion] NEP 48: Spending NumPy Project funds

2021-02-23 Thread Ralf Gommers
On Mon, Feb 22, 2021 at 9:34 PM Stephan Hoyer wrote: > On Mon, Feb 22, 2021 at 4:08 AM Pearu Peterson > wrote: > >> Hi, >> >> See GH discussion starting at >> https://github.com/numpy/numpy/pull/18454#discussion_r579967791 for the >> raised issue that is now moved here. >> >> Re "Compensating