[Numpy-discussion] Segfault with QR Decomposition
I get a segmentation fault upon running the following:

    import numpy
    A = numpy.ones((700, 8))
    Q, R = numpy.linalg.qr(A)

on Python 2.7.3, Linux 64-bit, using numpy 1.9.0.dev-ec3603f linked against OpenBLAS. If A is a smaller matrix then the QR decomposition works (for example, when A has shape (400, 8)). I haven't quite narrowed down the exact threshold where the crash occurs; however, I know that the above A is 448 MB (Q and R are no bigger), and the machine in question has 32GB of RAM. I also tested scipy.linalg.qr (version 0.14.0.dev-ced994c) with the same results.

I don't get the same problem on my laptop, which also runs Python 2.7.3 on 64-bit Linux but with numpy 1.8.0rc1 linked to OpenBLAS. Both machines have OpenBLAS 0.2.6. Does anyone have some insight into why this problem is occurring?

Thanks very much for any help,
Charanpal
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion
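One way to narrow down the shape at which the crash appears, without losing the interpreter on each segfault, is to run the decomposition in a child process and inspect its exit code. The following is only a sketch under that assumption: the helper name qr_exit_code and the shapes tried are illustrative and not part of the original report.

```python
import subprocess
import sys

# Child program: build an all-ones matrix of the requested shape and run QR.
CHILD_CODE = (
    "import sys, numpy\n"
    "rows, cols = int(sys.argv[1]), int(sys.argv[2])\n"
    "A = numpy.ones((rows, cols))\n"
    "Q, R = numpy.linalg.qr(A)\n"
)


def qr_exit_code(rows, cols):
    """Run the QR test in a fresh interpreter and return its exit code.

    Hypothetical helper: 0 means the child finished normally. On POSIX a
    child killed by a signal (such as a segfault) yields a negative code.
    """
    cmd = [sys.executable, "-c", CHILD_CODE, str(rows), str(cols)]
    return subprocess.call(cmd)


if __name__ == "__main__":
    # Shapes taken from the report; substitute whichever sizes you need.
    for rows in (400, 700):
        code = qr_exit_code(rows, 8)
        status = "ok" if code == 0 else "failed (exit code %d)" % code
        print("shape (%d, 8): %s" % (rows, status))
```

Because each attempt runs in its own interpreter, the loop can be turned into a bisection over the number of rows to locate the threshold.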
[Numpy-discussion] Segfault with QR Decomposition
>> I get a segmentation fault upon running the following:
>>
>>     import numpy
>>     A = numpy.ones((700, 8))
>>     Q, R = numpy.linalg.qr(A)
>>
>> on Python 2.7.3, Linux 64-bit using numpy 1.9.0.dev-ec3603f linked against OpenBLAS. If A is a smaller matrix then the QR decomposition works (for example A has shape (400, 8)). I haven't quite narrowed down the exact threshold where the crash occurs, however I know that the above A is 448 MB (Q and R are no bigger), and the machine in question has 32GB of RAM. I also tested scipy.linalg.qr (version 0.14.0.dev-ced994c) with the same results.
>>
>> I don't get the same problem on my laptop which is Python 2.7.3, Linux 64-bit but with numpy 1.8.0rc1 linked to OpenBLAS. Both machines have OpenBLAS 0.2.6. Does anyone have some insight into why this problem is occurring?
>
> Works fine here with '1.9.0.dev-7457f15' linked against ATLAS. I suspect the problem is in OpenBLAS. What architecture/OS do the two machines have? Is OpenBLAS using more than one core?

Thanks for testing the code. On the desktop machine I am using Debian GNU/Linux 7.2 (wheezy) compiled for x86-64 (Intel(R) Xeon(R) CPU E5-1620), and on my laptop I use Ubuntu 13.04 for x86-64 (Intel(R) Core(TM) i7-3740QM). I tried:

    export OPENBLAS_NUM_THREADS=1

and

    export OPENBLAS_NUM_THREADS=1

with the same results (a segfault).

Charanpal
[Numpy-discussion] Segfault with QR Decomposition
Oops, the second line should have been:

    export OPENBLAS_NUM_THREADS=8
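The same thread limit can also be applied from inside Python rather than from the shell. This is a minimal sketch, assuming (as far as I know) that OpenBLAS reads OPENBLAS_NUM_THREADS when the library is loaded, so the variable has to be set before numpy is first imported. The shape used is just the smaller one mentioned in the thread.

```python
import os

# Assumption: the variable must be in the environment before numpy
# (and hence OpenBLAS) is imported for the first time in this process.
os.environ["OPENBLAS_NUM_THREADS"] = "1"

import numpy

A = numpy.ones((400, 8))
Q, R = numpy.linalg.qr(A)

# Sanity check: Q.dot(R) should reproduce A up to rounding error.
print(numpy.allclose(Q.dot(R), A))
```

Changing "1" to "8" gives the second configuration that was tried.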
Re: [Numpy-discussion] Eigenvalues did not converge
As Paul suggested, I'd try compiling numpy with something other than the BLAS/LAPACK libraries currently in use. Here is a good place to start: http://www.scipy.org/Installing_SciPy/Linux.

Charanpal

On Thu, 1 Sep 2011 12:20:46 -0600, Rick Muller wrote:
> Yes, as I pointed out, the problem does run on the Macintosh systems. But I'd like to be able to run these on our linux supercomputers. Surely this is possible, right?
>
> On Mon, Aug 29, 2011 at 9:31 AM, Paul Anton Letnes wrote:
>> I recently got into trouble with these calculations (although I used scipy). I actually got segfaults and bus errors. The solution for me was to not link against ATLAS, but rather link against Apple's blas/lapack libraries. That got everything working again. I would suggest trying to install against something other than ATLAS and see if that helps (or, more generally, determining which blas/lapack you are linking against, and try something else).
>>
>> Paul
>>
>> On 29. aug. 2011, at 16.21, Charanpal Dhanjal wrote:
>>> I posted a similar question about the non-convergence of numpy.linalg.svd a few weeks ago. I'm not sure I can help but I wonder if you compiled numpy with ATLAS/MKL support (try numpy.show_config()) and whether it made a difference? Also what is the condition number and Frobenius norm of the matrix in question?
>>>
>>> Charanpal
>>>
>>> On Mon, 29 Aug 2011 08:56:31 -0600, Rick Muller wrote:
>>>> I'm bumping into the old "Eigenvalues did not converge" error using numpy.linalg.eigh() on several different linux builds of numpy (1.4.1). The matrix is 166x166. I can compute the eigenvalues on a Macintosh build of numpy, and I can confirm that there aren't degenerate eigenvalues, and that the matrix appears to be negative definite.
>>>>
>>>> I've seen this before (though not for several years), and what I normally do is to build lapack with -O0. This trick did not work in the current instance. Does anyone have any tricks to getting eigh to work?
>>>>
>>>> Other weird things that I've noticed about this case: I can compute the eigenvalues using eigvals and eigvalsh, and can compute the eigenvals/vecs using eig(). The matrix is real symmetric, and I've tested that it's symmetric enough by forcibly symmetrizing it.
>>>>
>>>> Thanks in advance for any help you can offer.
>
> --
> Rick Muller
> rpmul...@gmail.com
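Since eig() reportedly succeeds on this matrix where eigh() does not, one stop-gap is to symmetrize the matrix, try eigh(), and fall back to eig() only when LAPACK reports non-convergence. The sketch below is one way to do that. The function name symmetric_eig is hypothetical, and the fallback path assumes the eigenvalues are distinct (as stated for this matrix), because eig() does not promise orthonormal eigenvectors for repeated eigenvalues.

```python
import numpy


def symmetric_eig(M):
    """Eigen-decompose a real matrix that is symmetric up to rounding.

    Hypothetical helper: tries eigh first; if it raises LinAlgError, falls
    back to the general solver eig and sorts the result so that the
    eigenvalues are in ascending order, as eigh would return them.
    """
    S = 0.5 * (M + M.T)  # forcibly symmetrize
    try:
        return numpy.linalg.eigh(S)
    except numpy.linalg.LinAlgError:
        w, v = numpy.linalg.eig(S)
        # A real symmetric matrix has real eigenvalues; drop any
        # zero imaginary parts and sort into ascending order.
        order = numpy.argsort(w.real)
        return w.real[order], v.real[:, order]


if __name__ == "__main__":
    M = numpy.array([[2.0, 1.0], [1.0, 2.0]])
    w, v = symmetric_eig(M)
    print(w)
```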
Re: [Numpy-discussion] Eigenvalues did not converge
I posted a similar question about the non-convergence of numpy.linalg.svd a few weeks ago. I'm not sure I can help but I wonder if you compiled numpy with ATLAS/MKL support (try numpy.show_config()) and whether it made a difference? Also what is the condition number and Frobenius norm of the matrix in question?

Charanpal

On Mon, 29 Aug 2011 08:56:31 -0600, Rick Muller wrote:
> I'm bumping into the old "Eigenvalues did not converge" error using numpy.linalg.eigh() on several different linux builds of numpy (1.4.1). The matrix is 166x166. I can compute the eigenvalues on a Macintosh build of numpy, and I can confirm that there aren't degenerate eigenvalues, and that the matrix appears to be negative definite.
>
> I've seen this before (though not for several years), and what I normally do is to build lapack with -O0. This trick did not work in the current instance. Does anyone have any tricks to getting eigh to work?
>
> Other weird things that I've noticed about this case: I can compute the eigenvalues using eigvals and eigvalsh, and can compute the eigenvals/vecs using eig(). The matrix is real symmetric, and I've tested that it's symmetric enough by forcibly symmetrizing it.
>
> Thanks in advance for any help you can offer.
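The quantities asked about above can be collected in a few lines. This is a rough sketch with a hypothetical helper name, matrix_summary, and it assumes a square input so that the asymmetry check makes sense. Note that numpy.linalg.cond uses an SVD for the default 2-norm, so it may itself raise LinAlgError on a problematic build.

```python
import numpy


def matrix_summary(M):
    """Return basic diagnostics for a square matrix M (hypothetical helper)."""
    M = numpy.asarray(M, dtype=float)
    return {
        "shape": M.shape,
        "frobenius_norm": float(numpy.linalg.norm(M, "fro")),
        # 2-norm condition number: ratio of largest to smallest singular value.
        "condition_number": float(numpy.linalg.cond(M)),
        # Largest entry of |M - M^T|; zero for an exactly symmetric matrix.
        "max_asymmetry": float(numpy.abs(M - M.T).max()),
    }


if __name__ == "__main__":
    numpy.show_config()  # prints which BLAS/LAPACK numpy was built against
    print(matrix_summary(numpy.diag([1.0, 4.0])))
```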
Re: [Numpy-discussion] SVD does not converge on clean matrix
I had a quick look at the code (https://github.com/numpy/numpy/blob/master/numpy/linalg/linalg.py) and the numpy.linalg.svd function calls lapack_lite.dgesdd (for real matrices), so I guess the non-convergence occurs in this function. As I understand it, lapack_lite is used by default unless numpy is installed with ATLAS/MKL etc. I wonder why svd works for Nadav and not for anyone else? Any ideas anyone?

Charanpal

On Sat, 13 Aug 2011 13:13:25 -0600, Charles R Harris wrote:
> On Thu, Aug 11, 2011 at 7:23 AM, dhan...@telecom-paristech.fr wrote:
>> Hi all,
>>
>> I get an error message "numpy.linalg.linalg.LinAlgError: SVD did not converge" when calling numpy.linalg.svd on a clean matrix of size (1952, 895). The matrix is clean in the sense that it contains no NaN or Inf values. The corresponding npz file is available here: https://docs.google.com/leaf?id=0Bw0NXKxxc40jMWEyNTljMWUtMzBmNS00NGZmLThhZWUtY2I2MWU2MGZiNDgx&hl=fr
>>
>> Here is some information about my setup: I use Python 2.7.1 on Ubuntu 11.04 with numpy 1.6.1. Furthermore, I thought the problem might be solved by recompiling numpy with my local ATLAS library (version 3.8.3), and this didn't seem to help. On another machine with Python 2.7.1 and numpy 1.5.1 the SVD does converge, however it contains 1 NaN singular value and 3 negative singular values of the order -10^-1 (singular values should always be non-negative).
>>
>> I also tried computing the SVD of the matrix using Octave 3.2.4 and Matlab 7.10.0.499 (R2010a) 64-bit (glnxa64) and there were no problems. Any help is greatly appreciated.
>>
>> Thanks in advance,
>> Charanpal
>
> Fails here also, fedora 15 64 bits AMD 940. There should be a maximum iterations argument somewhere...
>
> Chuck
Re: [Numpy-discussion] SVD does not converge on clean matrix
Thanks very much Lou for the information. I tried delving into the C code and found a line in the dlasd4_ routine which reads:

    for (niter = iter; niter <= MAXITERLOOPS; ++niter) {

This is apparently the main loop for this subroutine, and the value of MAXITERLOOPS is 100. All I did was increase the maximum number of iterations to 200, and this seemed to solve the problem for the matrix in question. Let this matrix be called A, then:

    >>> P0, o0, Q0 = numpy.linalg.svd(A, full_matrices=False)
    >>> numpy.linalg.norm((P0*o0).dot(Q0) - A)
    1.8558089412794851
    >>> numpy.linalg.norm(A)
    4.558649005154054
    >>> A.shape
    (1952, 895)

It seems A has quite a small norm given its dimension, and perhaps this explains the error in the SVD (the numpy.linalg.norm((P0*o0).dot(Q0) - A) bit). To investigate a little further I tried finding the SVD of A*1000:

    >>> P0, o0, Q0 = numpy.linalg.svd(A*1000, full_matrices=False)
    >>> numpy.isfinite(Q0).all()
    False
    >>> numpy.isfinite(P0).all()
    False
    >>> numpy.isfinite(o0).all()
    False

and hence increasing the number of iterations does not solve the problem in this case. That was about as far as I felt I could go with investigating the C code. In the meantime I will try the squaring-the-matrix solution.

Incidentally, I am confused as to why numpy calls the lapack lite routines; when I call numpy.show_config() it seems to have detected my ATLAS libraries and I would have expected it to use those.

Charanpal

On Sun, 14 Aug 2011 07:27:06 -0700 (PDT), Lou Pecora wrote:
> Chuck wrote:
>
> -----
> Fails here also, fedora 15 64 bits AMD 940. There should be a maximum iterations argument somewhere...
>
> Chuck
> -----
>
> *** Here's the FIX:
>
> Chuck is right. There is a max iterations. Here is a reply from a thread of mine in this group several years ago about this problem and some comments that might help you.
>
> From Mr. Damian Menscher who was kind enough to find the iteration location and provide some insight:
>
>> Ok, so after several hours of trying to read that code, I found the parameter that needs to be tuned. In case anyone has this problem and finds this thread a year from now, here's your hint:
>>
>> File: Src/dlapack_lite.c
>> Subroutine: dlasd4_
>> Line: 22562
>>
>> There's a for loop there that limits the number of iterations to 20. Increasing this value to 50 allows my matrix to converge. I have not bothered to test what the best value for this number is, though. In any case, it appears the number just exists to prevent infinite loops, and 50 isn't really that much closer to infinity than 20. (Actually, I'm just going to set it to 100 so I don't have to think about it ever again.)
>>
>> Damian Menscher
>> Physics Grad Student SysAdmin @ U Illinois Urbana-Champaign
>
> My reply and a fix of sorts without changing the hard coded iteration max:
>
> I have looked in Src/dlapack_lite.c and line 22562 is no longer a line that sets a max. iterations parameter. There are several set in the file, but that code is hard to figure out (sort of a Fortran-in-C hybrid). Here's one, for example:
>
>     maxit = *n * 6 * *n; // Line 887
>
> I have no idea which parameter to tweak. Apparently this error is still in numpy (at least in my version). Does anyone have a fix? Should I start a ticket (I think this is what people do)? Any help appreciated.
>
> I'm using a Mac Book Pro (Intel chip), system 10.4.11, Python 2.4.4.
>
> Possible try/except
> ===
>
>     from numpy import conj, dot, linalg
>
>     # A is the original matrix
>     try:
>         U, W, VT = linalg.svd(A)
>     except linalg.LinAlgError:
>         # Square the matrix and do SVD
>         print("Got svd except, trying square of A.")
>         A2 = dot(conj(A.T), A)
>         # W2 holds the squared singular values of A
>         U, W2, VT = linalg.svd(A2)
>
> This works so far.
>
> -----
>
> I've been using that simple fix of squaring the original matrix for several years and it's worked every time. I'm not sure why. It was just a test and it worked.
>
> You could also change the underlying C or Fortran code, but you then have to recompile everything in numpy. I wasn't that brave.
>
> -- Lou Pecora, my views are my own.
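The squaring workaround returns the factors of A^H A rather than those of A, so a little extra work is needed to get back an SVD of A itself. The sketch below shows one way to do this. The function names svd_with_gram_fallback and gram_svd are hypothetical, and it assumes A has at least as many rows as columns and full column rank. Keep in mind that forming A^H A squares the condition number, so small singular values lose accuracy.

```python
import numpy


def gram_svd(A):
    """Thin SVD of a full-column-rank A, built from the SVD of A^H A.

    If A = U S V^H then A^H A = V S^2 V^H, so the singular values of A are
    the square roots of those of A^H A and the right factor is shared.
    """
    A2 = numpy.dot(numpy.conj(A.T), A)
    _, W2, VT = numpy.linalg.svd(A2)
    s = numpy.sqrt(W2)        # singular values of A
    V = numpy.conj(VT.T)
    U = numpy.dot(A, V) / s   # column j divided by s[j]; requires s[j] > 0
    return U, s, VT


def svd_with_gram_fallback(A):
    """Try the ordinary thin SVD; use gram_svd if it does not converge."""
    try:
        return numpy.linalg.svd(A, full_matrices=False)
    except numpy.linalg.LinAlgError:
        return gram_svd(A)


if __name__ == "__main__":
    A = numpy.random.RandomState(0).rand(6, 3)
    U, s, VT = gram_svd(A)
    print(numpy.allclose(numpy.dot(U * s, VT), A))
```

Whether this helps on a build where the direct SVD fails depends on the SVD of the smaller, better-scaled matrix A^H A converging, which is what was reported above but is not guaranteed.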
Re: [Numpy-discussion] SVD does not converge on clean matrix
Thanks Nadav for testing out the matrix. I wonder if you had a chance to check whether the resulting decomposition contained NaN or Inf values? As far as I understood, numpy.linalg.svd uses routines in LAPACK and ATLAS (if available) to compute the corresponding SVD.

I did some complementary tests on Debian Squeeze on an Intel Xeon W3550 CPU, and the call to numpy.linalg.svd results in the LinAlgError "SVD did not converge"; however, the test leading to results containing NaN values ran on Debian Lenny on an Intel Core 2 Quad. In both of these situations we use Python 2.7.1 and numpy 1.5.1 (without ATLAS), and so the reasons for the differences seem to be OS or processor dependent. Any ideas?

Charanpal

Date: Thu, 11 Aug 2011 07:21:09 -0700
From: Nadav Horesh nad...@visionsense.com
Subject: Re: [Numpy-discussion] SVD does not converge on clean matrix

> Had no problem on a gentoo 64 bit machine using atlas 3.8.0 (Core I7, python 2.7.2, numpy versions 1.60 and 1.6.1)
>
> Nadav

On Thu, 11 Aug 2011 15:23:22 +0200, dhan...@telecom-paristech.fr wrote:
> Hi all,
>
> I get an error message "numpy.linalg.linalg.LinAlgError: SVD did not converge" when calling numpy.linalg.svd on a clean matrix of size (1952, 895). The matrix is clean in the sense that it contains no NaN or Inf values. The corresponding npz file is available here: https://docs.google.com/leaf?id=0Bw0NXKxxc40jMWEyNTljMWUtMzBmNS00NGZmLThhZWUtY2I2MWU2MGZiNDgx&hl=fr
>
> Here is some information about my setup: I use Python 2.7.1 on Ubuntu 11.04 with numpy 1.6.1. Furthermore, I thought the problem might be solved by recompiling numpy with my local ATLAS library (version 3.8.3), and this didn't seem to help. On another machine with Python 2.7.1 and numpy 1.5.1 the SVD does converge, however it contains 1 NaN singular value and 3 negative singular values of the order -10^-1 (singular values should always be non-negative).
>
> I also tried computing the SVD of the matrix using Octave 3.2.4 and Matlab 7.10.0.499 (R2010a) 64-bit (glnxa64) and there were no problems. Any help is greatly appreciated.
>
> Thanks in advance,
> Charanpal
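Because some builds return a decomposition that "converges" yet contains NaN or negative singular values, it can help to validate the result explicitly rather than rely on the absence of an exception. The following is a small sketch of such a check. The helper name check_svd and the keys of the returned dictionary are illustrative only.

```python
import numpy


def check_svd(A, U, s, VT):
    """Report basic sanity checks for a thin SVD A ~ (U * s).dot(VT)."""
    finite = (numpy.isfinite(U).all()
              and numpy.isfinite(s).all()
              and numpy.isfinite(VT).all())
    recon = numpy.dot(U * s, VT)
    return {
        "all_finite": bool(finite),
        # Singular values must be non-negative (NaN compares as False).
        "nonnegative": bool((s >= 0).all()),
        # LAPACK returns singular values in descending order.
        "sorted_descending": bool((numpy.diff(s) <= 0).all()),
        # Frobenius-norm residual relative to the norm of A.
        "relative_residual": float(numpy.linalg.norm(recon - A)
                                   / numpy.linalg.norm(A)),
    }


if __name__ == "__main__":
    A = numpy.arange(6, dtype=float).reshape(3, 2)
    U, s, VT = numpy.linalg.svd(A, full_matrices=False)
    print(check_svd(A, U, s, VT))
```

For a healthy decomposition the relative residual should be close to machine precision. By comparison, the figures reported elsewhere in this thread (a residual of about 1.86 against a norm of about 4.56) correspond to a relative residual of roughly 0.41.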