Re: [Numpy-discussion] Behavior of np.random.multivariate_normal with bad covariance matrices
I like your idea Josef, I'll add it to the PR. Just to be clear, we should have something like:

Have a single check_valid keyword arg, which will default to warn, since that is the current behavior. It will check approximate symmetry, PSDness, and for NaNs and infs. The other options for check_valid will be ignore and raise.

What should happen when fix is passed for check_valid? Set negative eigenvalues to 0 and symmetrize the matrix?

On Mon, Mar 30, 2015 at 8:34 AM, josef.p...@gmail.com wrote:
> On Sun, Mar 29, 2015 at 7:39 PM, Blake Griffith blake.a.griff...@gmail.com wrote:
>> I have an open PR which lets users control the checks on the input covariance matrix. The matrix is required to be symmetric and positive semi-definite (PSD). The current behavior is that NumPy raises a warning if the matrix is not PSD, and does not even check for symmetry.
>>
>> I added a symmetry check, which raises a warning when the input is not symmetric. And added two keyword args which users can use to turn off the checks/warnings when the matrix is ill-formed. So this would only cause another new warning to be raised in existing code.
>>
>> This is needed because sometimes the covariance matrix is only *almost* symmetric or PSD due to roundoff error.
>>
>> Thoughts?
>
> My only question is why is **exact** symmetry relevant?
>
> AFAIU an empirical covariance matrix might not be exactly symmetric unless we specifically force it to be. But I don't see why some roundoff errors that violate symmetry should be relevant.
>
> Use allclose with floating point rtol or equivalent?
>
> Some user code might suddenly get irrelevant warnings.
>
> BTW: neg = (np.sum(u.T * v, axis=1) < 0) & (s > 0) doesn't need to be calculated if cov_psd is false.
>
> - some more: svd can hang if the values are not finite, i.e. NaNs or infs. A counter proposal would be to add a `check_valid` keyword with options ignore, warn, raise, and fix, and raise an error if there are NaNs and check_valid is not ignore.
>
> - aside: np.random.multivariate_normal is only relevant if you have a new cov each call (or don't mind repeated possibly expensive calculations), so, I guess, adding checks by default won't upset many users.
>
> Josef
>
>> PR: https://github.com/numpy/numpy/pull/5726

___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion
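For the open question about fix, a minimal sketch of "set negative eigenvalues to 0 and symmetrize the matrix" might look like the following. The name fix_covariance is hypothetical and is not part of NumPy or the PR; it also assumes the input is square and finite, and it uses eigh on the symmetric part rather than the SVD that the existing code uses.

```python
import numpy as np


def fix_covariance(cov):
    """Hypothetical helper (not NumPy API): symmetrize, then clip eigenvalues.

    Assumes `cov` is a square, finite 2-D array.
    """
    cov = np.asarray(cov, dtype=float)
    # Symmetrize first so that eigh sees a truly symmetric matrix.
    sym = (cov + cov.T) / 2.0
    eigvals, eigvecs = np.linalg.eigh(sym)
    # Set negative eigenvalues to 0, leaving the non-negative ones untouched.
    eigvals = np.clip(eigvals, 0.0, None)
    # Rebuild V * diag(w) * V^T from the clipped spectrum.
    fixed = (eigvecs * eigvals) @ eigvecs.T
    # Average with the transpose again to remove roundoff from the product.
    return (fixed + fixed.T) / 2.0


fixed = fix_covariance([[1.0, 2.0], [2.0, 1.0]])  # eigenvalues 3 and -1
print(fixed)
```

One thing to keep in mind with this approach is that it silently changes the matrix the user passed in, so the samples come from a different distribution than the one requested whenever the negative eigenvalues are not just roundoff.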
Re: [Numpy-discussion] Behavior of np.random.multivariate_normal with bad covariance matrices
On Sun, Mar 29, 2015 at 7:39 PM, Blake Griffith blake.a.griff...@gmail.com wrote:
> I have an open PR which lets users control the checks on the input covariance matrix. The matrix is required to be symmetric and positive semi-definite (PSD). The current behavior is that NumPy raises a warning if the matrix is not PSD, and does not even check for symmetry.
>
> I added a symmetry check, which raises a warning when the input is not symmetric. And added two keyword args which users can use to turn off the checks/warnings when the matrix is ill-formed. So this would only cause another new warning to be raised in existing code.
>
> This is needed because sometimes the covariance matrix is only *almost* symmetric or PSD due to roundoff error.
>
> Thoughts?

My only question is why is **exact** symmetry relevant?

AFAIU an empirical covariance matrix might not be exactly symmetric unless we specifically force it to be. But I don't see why some roundoff errors that violate symmetry should be relevant.

Use allclose with floating point rtol or equivalent?

Some user code might suddenly get irrelevant warnings.

BTW: neg = (np.sum(u.T * v, axis=1) < 0) & (s > 0) doesn't need to be calculated if cov_psd is false.

- some more: svd can hang if the values are not finite, i.e. NaNs or infs. A counter proposal would be to add a `check_valid` keyword with options ignore, warn, raise, and fix, and raise an error if there are NaNs and check_valid is not ignore.

- aside: np.random.multivariate_normal is only relevant if you have a new cov each call (or don't mind repeated possibly expensive calculations), so, I guess, adding checks by default won't upset many users.

Josef

> PR: https://github.com/numpy/numpy/pull/5726
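As a rough illustration of the counter proposal, the checks could be arranged as in the sketch below. Everything here is an assumption for discussion: check_cov is a hypothetical standalone helper rather than the PR's code, the rtol/atol defaults are simply those of np.allclose, the PSD test uses eigvalsh on the symmetric part instead of the existing SVD expression, and the fix option is left out.

```python
import warnings

import numpy as np


def check_cov(cov, check_valid="warn", rtol=1e-5, atol=1e-8):
    """Hypothetical helper (not NumPy API): validate a covariance matrix.

    Returns the list of problems found (empty if there are none).
    """
    if check_valid not in ("ignore", "warn", "raise"):
        raise ValueError("check_valid must be 'ignore', 'warn' or 'raise'")
    if check_valid == "ignore":
        return []  # skip every check, including the eigenvalue computation

    cov = np.asarray(cov, dtype=float)
    # Reject non-finite input before any decomposition is attempted.
    if not np.all(np.isfinite(cov)):
        raise ValueError("covariance contains NaN or inf")

    problems = []
    # Approximate symmetry, so roundoff-level asymmetry is tolerated.
    if not np.allclose(cov, cov.T, rtol=rtol, atol=atol):
        problems.append("covariance is not symmetric")

    # Eigenvalues of the symmetric part; tiny negative values are tolerated.
    eigvals = np.linalg.eigvalsh((cov + cov.T) / 2.0)
    tol = atol + rtol * np.abs(eigvals).max()
    if eigvals.min() < -tol:
        problems.append("covariance is not positive semi-definite")

    if problems:
        message = "; ".join(problems)
        if check_valid == "raise":
            raise ValueError(message)
        warnings.warn(message, RuntimeWarning)
    return problems
```

With this layout, check_cov(cov, "ignore") returns immediately, which covers the point that the negative-eigenvalue test need not be computed when the check is switched off.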
[Numpy-discussion] Behavior of np.random.multivariate_normal with bad covariance matrices
I have an open PR which lets users control the checks on the input covariance matrix. The matrix is required to be symmetric and positive semi-definite (PSD). The current behavior is that NumPy raises a warning if the matrix is not PSD, and does not even check for symmetry.

I added a symmetry check, which raises a warning when the input is not symmetric. And added two keyword args which users can use to turn off the checks/warnings when the matrix is ill-formed. So this would only cause another new warning to be raised in existing code.

This is needed because sometimes the covariance matrix is only *almost* symmetric or PSD due to roundoff error.

Thoughts?

PR: https://github.com/numpy/numpy/pull/5726
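To make the roundoff point concrete, here is a small sketch with hand-built matrices. The values are illustrative assumptions chosen to mimic roundoff-sized errors, not data from the PR.

```python
import numpy as np

# Off-diagonal entries differ by about 1e-16, i.e. a few units in the
# last place of 0.1.
almost_symmetric = np.array([[1.0, 0.1],
                             [0.1 + 1e-16, 1.0]])
exactly_symmetric = np.array_equal(almost_symmetric, almost_symmetric.T)
approx_symmetric = np.allclose(almost_symmetric, almost_symmetric.T)

# Exactly symmetric, but the determinant is just below zero, so one
# eigenvalue is slightly negative.
almost_psd = np.array([[1.0, 1.0],
                       [1.0, 1.0 - 1e-12]])
smallest_eigenvalue = np.linalg.eigvalsh(almost_psd).min()

print(exactly_symmetric, approx_symmetric, smallest_eigenvalue)
```

Both matrices are valid covariance matrices for practical purposes, yet a strict equality test rejects the first one and a strict sign test on the eigenvalues rejects the second.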