I like your idea Josef, I'll add it to the PR. Just to be clear, we should have something like:
A single "check_valid" keyword arg, which will default to "warn", since that is the current behavior. It will check approximate symmetry, PSD-ness, and for NaNs and infs. The other options for check_valid will be "ignore" and "raise".

What should happen when "fix" is passed for check_valid? Set negative eigenvalues to 0 and symmetrize the matrix?

On Mon, Mar 30, 2015 at 8:34 AM, <josef.p...@gmail.com> wrote:
> On Sun, Mar 29, 2015 at 7:39 PM, Blake Griffith
> <blake.a.griff...@gmail.com> wrote:
> > I have an open PR which lets users control the checks on the input
> > covariance matrix. The matrix is required to be symmetric and positive
> > semi-definite (PSD). The current behavior is that NumPy raises a warning
> > if the matrix is not PSD, and does not even check for symmetry.
> >
> > I added a symmetry check, which raises a warning when the input is not
> > symmetric. And added two keyword args which users can use to turn off the
> > checks/warnings when the matrix is ill formed. So this would only cause
> > another new warning to be raised in existing code.
> >
> > This is needed because sometimes the covariance matrix is only *almost*
> > symmetric or PSD due to roundoff error.
> >
> > Thoughts?
>
> My only question is why is **exact** symmetry relevant?
>
> AFAIU
> An empirical covariance matrix might not be exactly symmetric unless we
> specifically force it to be. But I don't see why some roundoff errors
> that violate symmetry should be relevant.
>
> use allclose with floating point rtol or equivalent?
>
> Some user code might suddenly get irrelevant warnings.
>
> BTW:
>     neg = (np.sum(u.T * v, axis=1) < 0) & (s > 0)
> doesn't need to be calculated if cov_psd is false.
>
> -----
>
> some more:
>
> svd can hang if the values are not finite, i.e. nan or infs
>
> counter proposal would be to add a `check_valid` keyword with options
> ignore, warn, raise, and "fix"
>
> and raise an error if there are nans and check_valid is not ignore.
>
> ---------
>
> aside:
> np.random.multivariate_normal is only relevant if you have a new cov
> each call (or don't mind repeated possibly expensive calculations),
> so, I guess, adding checks by default won't upset many users.
>
> Josef
>
> > PR: https://github.com/numpy/numpy/pull/5726
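
To make the proposal at the top of this message concrete, here is a rough sketch of how the "check_valid" options could behave. It is only an illustration and not the code in the PR: the helper name validate_cov, the rtol/atol defaults, and the choice to read "fix" as "symmetrize, then clip negative eigenvalues to 0" are all assumptions on my part.

```python
import warnings

import numpy as np


def validate_cov(cov, check_valid="warn", rtol=1e-5, atol=1e-8):
    """Hypothetical helper (not NumPy API): check or repair a covariance matrix."""
    if check_valid not in ("ignore", "warn", "raise", "fix"):
        raise ValueError("check_valid must be 'ignore', 'warn', 'raise' or 'fix'")
    cov = np.asarray(cov, dtype=float)
    if check_valid == "ignore":
        return cov

    # NaN/inf cannot be repaired and can hang the SVD, so raise unless ignoring.
    if not np.isfinite(cov).all():
        raise ValueError("covariance contains NaN or inf")

    # Approximate symmetry check, so roundoff alone does not trigger it.
    symmetric = np.allclose(cov, cov.T, rtol=rtol, atol=atol)

    # Eigenvalues of the symmetric part; PSD up to a small tolerance.
    eigvals, eigvecs = np.linalg.eigh((cov + cov.T) / 2)
    psd = eigvals.min() >= -(atol + rtol * np.abs(eigvals).max())

    if check_valid == "fix":
        # Assumed meaning of "fix": symmetrize, then set negative eigenvalues to 0.
        clipped = np.clip(eigvals, 0, None)
        return (eigvecs * clipped) @ eigvecs.T

    if not (symmetric and psd):
        msg = "covariance is not symmetric positive-semidefinite"
        if check_valid == "raise":
            raise ValueError(msg)
        warnings.warn(msg, RuntimeWarning)
    return cov
```

With this reading, validate_cov(cov, check_valid="fix") always returns a symmetric matrix with non-negative eigenvalues (up to roundoff), while "warn" and "raise" leave the input untouched.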
_______________________________________________
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion