Re: [Numpy-discussion] Behavior of np.random.multivariate_normal with bad covariance matrices

2015-04-08 Thread Blake Griffith
I like your idea, Josef; I'll add it to the PR. Just to be clear, we should
have something like:

Have a single check_valid keyword arg, which will default to "warn", since
that is the current behavior. It will check approximate symmetry, PSDness,
and for NaNs and infs. The other options for check_valid will be
"ignore" and "raise".

What should happen when "fix" is passed for check_valid? Set negative
eigenvalues to 0 and symmetrize the matrix?
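
To make the options concrete, here is a minimal sketch of what such a check could look like. The helper name validate_cov, the tol parameter, and the particular "fix" strategy (symmetrize, then set negative eigenvalues to 0) are assumptions for illustration only; this is not the code in the PR.

```python
import warnings

import numpy as np


def validate_cov(cov, check_valid="warn", tol=1e-8):
    """Hypothetical helper sketching the proposed check_valid options.

    Not NumPy's implementation: the name, the tolerance, and the "fix"
    strategy are assumptions made for illustration.
    """
    if check_valid not in ("ignore", "warn", "raise", "fix"):
        raise ValueError("check_valid must be 'ignore', 'warn', 'raise' or 'fix'")
    cov = np.asarray(cov, dtype=float)
    if check_valid == "ignore":
        return cov
    # Non-finite input is always an error unless checks are ignored.
    if not np.isfinite(cov).all():
        raise ValueError("covariance contains NaN or inf")

    symmetric = np.allclose(cov, cov.T, rtol=tol, atol=tol)
    sym = (cov + cov.T) / 2.0                # symmetrized copy
    eigvals, eigvecs = np.linalg.eigh(sym)   # eigh expects a symmetric matrix
    psd = eigvals.min() >= -tol
    if symmetric and psd:
        return cov

    msg = "covariance is not symmetric positive semi-definite"
    if check_valid == "raise":
        raise ValueError(msg)
    if check_valid == "warn":
        warnings.warn(msg, RuntimeWarning)
        return cov
    # "fix": rebuild from the symmetrized matrix with negative eigenvalues set to 0.
    clipped = np.clip(eigvals, 0.0, None)
    return (eigvecs * clipped) @ eigvecs.T


bad = [[1.0, 2.0], [2.0, 1.0]]   # eigenvalues are 3 and -1, so not PSD
fixed = validate_cov(bad, check_valid="fix")
print(fixed)
```

With "fix", the result is symmetric and PSD by construction, but it is no longer the matrix the caller passed in, which is the trade-off behind the question above.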

On Mon, Mar 30, 2015 at 8:34 AM, josef.p...@gmail.com wrote:

 On Sun, Mar 29, 2015 at 7:39 PM, Blake Griffith
 blake.a.griff...@gmail.com wrote:
  I have an open PR which lets users control the checks on the input
  covariance matrix. The matrix is required to be symmetric and positive
  semi-definite (PSD). The current behavior is that NumPy raises a warning
  if the matrix is not PSD, and does not even check for symmetry.
 
  I added a symmetry check, which raises a warning when the input is not
  symmetric. And added two keyword args which users can use to turn off the
  checks/warnings when the matrix is ill formed. So this would only cause
  another new warning to be raised in existing code.
 
  This is needed because sometimes the covariance matrix is only *almost*
  symmetric or PSD due to roundoff error.
 
  Thoughts?

 My only question is why is **exact** symmetry relevant?

 AFAIU
 An empirical covariance matrix might not be exactly symmetric unless we
 specifically force it to be. But I don't see why some roundoff errors
 that violate symmetry should be relevant.

 use allclose with floating point rtol or equivalent?

 Some user code might suddenly get irrelevant warnings.
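
A small illustration of the point about exact versus approximate symmetry, written with only the standard library. The is_symmetric helper and its tolerances are hypothetical; with NumPy arrays, np.allclose(cov, cov.T) plays the same role.

```python
import math


def is_symmetric(matrix, rel_tol=1e-9, abs_tol=1e-12, exact=False):
    """Return True if a square list-of-lists matrix is (approximately) symmetric."""
    n = len(matrix)
    for i in range(n):
        for j in range(i + 1, n):
            a, b = matrix[i][j], matrix[j][i]
            if exact:
                if a != b:
                    return False
            elif not math.isclose(a, b, rel_tol=rel_tol, abs_tol=abs_tol):
                return False
    return True


# 0.1 + 0.2 and 0.3 differ in the last bit, as roundoff in a computed
# covariance might produce.
cov = [[1.0, 0.1 + 0.2],
       [0.3, 1.0]]
print(is_symmetric(cov, exact=True))   # False
print(is_symmetric(cov))               # True
```

An exact comparison would warn on this matrix even though the asymmetry is pure roundoff.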

 BTW:
 neg = (np.sum(u.T * v, axis=1) < 0) & (s > 0)
 doesn't need to be calculated if cov_psd is false.

 -

 some more:

 svd can hang if the values are not finite, i.e. nan or infs

 counter proposal would be to add a `check_valid` keyword with options
 ignore, warn, raise, and fix

 and raise an error if there are nans and check_valid is not ignore.

 -

 aside:
 np.random.multivariate_normal   is only relevant if you have a new cov
 each call (or don't mind repeated possibly expensive calculations),
 so, I guess, adding checks by default won't upset many users.


 Josef


 
 
  PR: https://github.com/numpy/numpy/pull/5726
 
  ___
  NumPy-Discussion mailing list
  NumPy-Discussion@scipy.org
  http://mail.scipy.org/mailman/listinfo/numpy-discussion
 


[Numpy-discussion] Behavior of np.random.multivariate_normal with bad covariance matrices

2015-03-29 Thread Blake Griffith
I have an open PR which lets users control the checks on the input
covariance matrix. The matrix is required to be symmetric and positive
semi-definite (PSD). The current behavior is that NumPy raises a warning if
the matrix is not PSD, and does not even check for symmetry.

I added a symmetry check, which raises a warning when the input is not
symmetric. And added two keyword args which users can use to turn off the
checks/warnings when the matrix is ill formed. So this would only cause
another new warning to be raised in existing code.

This is needed because sometimes the covariance matrix is only *almost*
symmetric or PSD due to roundoff error.

Thoughts?


PR: https://github.com/numpy/numpy/pull/5726


Re: [Numpy-discussion] PEP8

2013-09-09 Thread Blake Griffith
I think a good solution would be to add a git_hooks directory with a
pre-commit git hook, along with a git hook installation script. A note
should also be added to DEV_README.txt suggesting installing the git hooks
for pep8 compatibility. I personally use this as a pre-commit hook:

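#!/bin/sh

# Staged Python files, skipping deleted ones.
FILES=$(git diff --cached --name-status | grep -v ^D | awk '$1 $2 { print $2}' | grep -e '\.py$')
# Quote the variable so the test is false when no files match.
if [ -n "$FILES" ]; then
    pep8 -r $FILES
fi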

which is from here: https://gist.github.com/lentil/810399#comment-303703


On Mon, Sep 9, 2013 at 10:54 AM, Nathaniel Smith n...@pobox.com wrote:

 On Mon, Sep 9, 2013 at 3:29 PM, Charles R Harris
 charlesr.har...@gmail.com wrote:
 
 
 
  On Mon, Sep 9, 2013 at 8:12 AM, Richard Hattersley rhatters...@gmail.com
  wrote:
 
   Something we have done in matplotlib is that we have made PEP8 a part of
   the tests.
 
  In Iris and Cartopy we've also done this and it works well. While we
  transition we have an exclusion list (which is gradually getting shorter).
  We've had mixed experiences with automatic reformatting, so prefer to keep
  the human in the loop.
 
 
  I agree with keeping a human in the loop, the script would be intended to
  get things into the right neighborhood, the submitter would have to review
  the changes after. If the script isn't too strict there will be more than
  one way to do some things and those bits would rely on the good taste of
  the coder.

 So if I understand right, the goal is to have some script that
 developers can run before (or after) submitting a PR, like
   tools/autopep8-my-changes numpy/
 that will fix up their changes, but leave the rest of numpy alone?

 And the proposed mechanism is to come up with a combination of changes
 to the numpy source and an autopep8 configuration such that
   autopep8 --our-config numpy/
 becomes a no-op, and then we can use this as an implementation of
 tools/autopep8-my-changes?

 If that's right then my feeling is that the goal seems worthwhile but
 the approach seems difficult and unlikely to survive for long. As soon
 as someone overrides autopep8 once, we either have to disable the rule
 for the whole project or keep overriding it manually forever. You're
 already suggesting taking out the spaces-around-arithmetic rule, which
 strikes me as one of the most useful -- sure, it gets things wrong
 sometimes, but I feel like we're constantly reviewing PRs where
 all*the*(arithmetic+is)-written**like*this.

 Maybe a better approach would be to spend that time hacking up some
 script that uses git and autopep8 together to run autopep8 over all
 and only those lines which the current branch has actually touched?
 It's pretty easy to parse 'git diff' output to get a list of all line
 numbers which have been modified, and then we could run autopep8 over
 the modified files and pull out only those changes which touch those
 lines.
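
The 'git diff' parsing step described above can be sketched as follows. This is only an illustration: the function name changed_lines is hypothetical, it handles plain unified diffs only, and feeding the resulting line numbers to autopep8 (for example through its line-range option, if your version has one) is left out.

```python
import re

# Unified-diff hunk header, e.g. "@@ -10,2 +12,3 @@"; a missing count means 1.
HUNK_RE = re.compile(r"^@@ -\d+(?:,(\d+))? \+(\d+)(?:,(\d+))? @@")


def changed_lines(diff_text):
    """Map each file in a unified diff to the new-side line numbers it adds."""
    result = {}
    current = None            # path of the file currently being read
    lineno = 0                # next line number on the new side
    old_left = new_left = 0   # lines still expected in the current hunk
    for line in diff_text.splitlines():
        if old_left > 0 or new_left > 0:      # inside a hunk body
            if line.startswith("+"):
                if current is not None:
                    result[current].add(lineno)
                lineno += 1
                new_left -= 1
            elif line.startswith("-"):
                old_left -= 1
            elif line.startswith("\\"):       # "\ No newline at end of file"
                pass
            else:                             # context line
                lineno += 1
                old_left -= 1
                new_left -= 1
            continue
        if line.startswith("+++ "):
            path = line[4:]
            if path == "/dev/null":           # file was deleted
                current = None
            else:
                current = path[2:] if path.startswith("b/") else path
                result.setdefault(current, set())
            continue
        match = HUNK_RE.match(line)
        if match:
            old_left = int(match.group(1)) if match.group(1) is not None else 1
            lineno = int(match.group(2))
            new_left = int(match.group(3)) if match.group(3) is not None else 1
    return result


sample = "\n".join([
    "--- a/foo.py",
    "+++ b/foo.py",
    "@@ -1,3 +1,4 @@",
    " import os",
    "-x=1",
    "+x = 1",
    "+y = 2",
    " print(x)",
])
print({path: sorted(lines) for path, lines in changed_lines(sample).items()})
# {'foo.py': [2, 3]}
```

In practice the diff text would come from something like 'git diff' against the branch point, and only fixes touching the reported lines would be kept.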

 -n

 P.S.: definitely [:, :, 2]


Re: [Numpy-discussion] Upcoming 1.8 release.

2013-08-15 Thread Blake Griffith
I would like to have the ufunc overrides in 1.8 if it is possible.


On Thu, Aug 15, 2013 at 9:21 AM, Charles R Harris charlesr.har...@gmail.com
 wrote:

 I don't see any that *have* to go in, but there are a few that could be
 included. The most significant is probably the inplace fancy indexing if it
 is ready. The nanmean etc. functions are not committed yet, but I think
 they are ready. If the Polynomial import fixes show up, they can go in.
 There are the usual janitorial things,  the release notes need some clean
 up, the docs need merging, and the HOWTO_RELEASE document needs updating.

 For datetime64, I think a comment should be added to the release notes
 that it is still experimental and that changes are expected in 1.9.
 Hopefully the next release will come out next spring.

 I think we are also about ready for a 1.7.2 release.




Re: [Numpy-discussion] Upcoming 1.8 release.

2013-08-15 Thread Blake Griffith
I think it is nearly complete. Although there are some recent changes that
need review.

I still need to go back and make changes to the original NEP noting the
differences in final implementation.


On Thu, Aug 15, 2013 at 11:52 AM, Charles R Harris 
charlesr.har...@gmail.com wrote:


 On Thu, Aug 15, 2013 at 10:48 AM, Blake Griffith 
 blake.a.griff...@gmail.com wrote:

 I would like to have the ufunc overrides in 1.8 if it is possible.


 On Thu, Aug 15, 2013 at 9:21 AM, Charles R Harris 
 charlesr.har...@gmail.com wrote:

 I don't see any that *have* to go in, but there are a few that could be
 included. The most significant is probably the inplace fancy indexing if it
 is ready. The nanmean etc. functions are not committed yet, but I think
 they are ready. If the Polynomial import fixes show up, they can go in.
 There are the usual janitorial things,  the release notes need some clean
 up, the docs need merging, and the HOWTO_RELEASE document needs updating.

 For datetime64, I think a comment should be added to the release notes
 that it is still experimental and that changes are expected in 1.9.
 Hopefully the next release will come out next spring.

 I think we are also about ready for a 1.7.2 release.



 What is the status of that? I've been leaving that commit up to Pauli.

 Chuck



[Numpy-discussion] ufunc overrides

2013-07-10 Thread Blake Griffith
Hello NumPy,

Part of my GSoC is compatibility between SciPy's sparse matrices and NumPy's
ufuncs. Currently there is no feasible way to do this without changing
ufuncs a bit.

I've been considering a mechanism to override ufuncs based on checking the
ufunc's arguments for a __ufunc_override__ attribute, then handing off the
operation to a function specified by that attribute. I prototyped this in
Python and did a demo in a blog post here:
http://cwl.cx/posts/week-6-ufunc-overrides.html
This is similar to a previously discussed, but never implemented change:
http://mail.scipy.org/pipermail/numpy-discussion/2011-June/056945.html
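
As a rough sketch of the mechanism described above (not the prototype from the blog post): the apply_ufunc wrapper, the toy Sparse class, and the choice to make __ufunc_override__ a dict keyed by ufunc name are all assumptions made for illustration.

```python
import math


def apply_ufunc(name, func, *args):
    """Call func(*args) unless one of the arguments claims the operation.

    Hypothetical stand-in for the ufunc call path: the first argument whose
    __ufunc_override__ mapping has an entry for `name` handles the call.
    """
    for arg in args:
        overrides = getattr(arg, "__ufunc_override__", None)
        if overrides and name in overrides:
            return overrides[name](*args)
    return func(*args)


class Sparse:
    """Toy sparse vector storing only nonzero entries as {index: value}."""

    def __init__(self, size, data):
        self.size = size
        self.data = dict(data)

    def _sin(self):
        # sin(0) == 0, so only the stored entries need to be touched.
        return Sparse(self.size, {i: math.sin(v) for i, v in self.data.items()})

    # Assumed layout: ufunc name -> function that receives the ufunc's arguments.
    __ufunc_override__ = {"sin": _sin}


v = Sparse(5, {1: math.pi / 2})
out = apply_ufunc("sin", math.sin, v)        # handed off to Sparse._sin
print(type(out).__name__, out.data)
print(apply_ufunc("sin", math.sin, 0.0))     # plain float: normal path
```

Arguments without the attribute fall through to the ordinary function, which is why existing code would be unaffected.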

However it seems like the ufunc machinery might be ripped out and replaced
with a true multi-method implementation soon. See Travis' blog post:
http://technicaldiscovery.blogspot.com/2013/07/thoughts-after-scipy-2013-and-specific.html
So I'd like to make my changes as forward compatible as possible. However
I'm not sure what I should even consider here, or how forward compatible my
current implementation is. Thoughts?

Until then, I'm writing up a NEP. It is still pretty incomplete; it can be
found here:

https://github.com/cowlicks/numpy/blob/ufunc-override/doc/neps/ufunc-overrides.rst


Re: [Numpy-discussion] GSoC proposal -- Numpy SciPy

2013-05-01 Thread Blake Griffith
Oh wow, I just assumed that `dot` was a ufunc... However, it would still be
useful to have ufuncs working well with the sparse package. I don't
understand everything that is going on in
https://github.com/numpy/numpy/blob/master/numpy/core/src/umath/ufunc_object.c

But I assumed that I would be able to add the ability to check for
something like _ufunc_override_. I'm not sure where this piece of logic
should be inserted, or what the performance implications to NumPy would
be... I'm trying to figure this out. But major optimizations to ufuncs are
out of the scope of this GSoC.

I will look into what can be done about the `dot` function.


On Tue, Apr 30, 2013 at 6:53 PM, Nathaniel Smith n...@pobox.com wrote:

 On Tue, Apr 30, 2013 at 4:02 PM, Pauli Virtanen p...@iki.fi wrote:
  30.04.2013 22:37, Nathaniel Smith kirjoitti:
  [clip]
  How do you plan to go about this? The obvious option of just calling
  scipy.sparse.issparse() on ufunc entry raises some problems, since
  numpy can't depend on or even import scipy, and we might be reluctant
  to add such a special case for what's a rather more general problem.
  OTOH it might be possible to solve the problem in general, e.g., see
  the prototyped _ufunc_override_ special method in:
 
https://github.com/njsmith/numpyNEP/blob/master/numpyNEP.py
 
  but I don't know if you want to get into such a debate within the
  scope of your GSoC. What were you thinking?
 
  To me it seems that the right thing to do here is the general solution.
 
  Do you see immediate problems in e.g. just enabling something like your
  _ufunc_override_?

 Just that we might want to think a bit about the design space before
 implementing something. E.g., apparently doing Python attribute lookup
 is very expensive -- we recently had a patch to skip
 __array_interface__ checks whenever possible -- is adding another such
 per-operation overhead ok? I guess we could use similar checks (skip
 checking for known types like int/float/ndarray), or only check for
 _ufunc_override_ on the class (not the instance) and cache the result
 per-class?

  The easy thing is that there are no backward compatibility problems
  here, since if the magic is missing, the old logic is used. Currently,
  the numpy dot() and ufuncs also most of the time do nothing sensible
  with sparse matrix inputs even though they in some cases return values.
  Which then makes writing generic sparse/dense code more painful than
  just __mul__ being matrix multiplication.

 I agree, but, if the main target is 'dot' then the current
 _ufunc_override_ design alone won't do it, since 'dot' is not a
 ufunc...

 -n


Re: [Numpy-discussion] GSoC proposal -- Numpy SciPy

2013-05-01 Thread Blake Griffith
There are several situations where that comes up (like comparing two sparse
matrices, A == B). There is a SparseEfficiencyWarning that can be thrown, but
the way this should be implemented still needs to be discussed. I will be
writing a specification on how ufuncs and ndarrays are handled by the
sparse package; the spec can be found here:
https://github.com/cowlicks/scipy-sparse-ndarray-and-ufunc-spec/blob/master/Spec.markdown
In general, a unary ufunc operating on a sparse matrix should return a
sparse matrix.

If you really want to do cos(sparse) you will be able to. But if you are
just interested in the initially nonzero elements, you should probably do
something like: sparse.data = np.cos(sparse.data)
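
To show the difference between the two options, here is a toy example that stands in for a sparse matrix with a plain dict of stored entries (an assumption made to keep it self-contained; it does not use scipy.sparse). The function names are hypothetical.

```python
import math


def cos_stored_only(data):
    """Apply cos to the stored entries only; implicit zeros stay zero."""
    return {i: math.cos(v) for i, v in data.items()}


def cos_dense(size, data):
    """Apply cos to every element; each implicit zero becomes cos(0) == 1."""
    return [math.cos(data.get(i, 0.0)) for i in range(size)]


data = {2: math.pi}              # length-4 vector with one stored entry
print(cos_stored_only(data))     # {2: -1.0}
print(cos_dense(4, data))        # [1.0, 1.0, -1.0, 1.0]
```

The first form keeps the sparsity pattern, which is what updating only the data attribute does; the second is the mathematically complete result and is fully dense.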



On Wed, May 1, 2013 at 1:32 PM, Daπid davidmen...@gmail.com wrote:

 On 1 May 2013 20:12, Blake Griffith blake.a.griff...@gmail.com wrote:
  However, it would still be useful to have ufuncs working well with the
  sparse package.

 How are you planning to deal with ufunc(0) != 0? cos(sparse) is actually
 dense.


[Numpy-discussion] GSoC proposal -- Numpy SciPy

2013-04-30 Thread Blake Griffith
Hello, I'm writing a GSoC proposal, mostly concerning SciPy, but it
involves a few changes to NumPy.
The proposal is titled: Improvements to the sparse package of Scipy:
support for bool dtype and better interaction with NumPy
and can be found on my GitHub:
https://github.com/cowlicks/GSoC-proposal/blob/master/proposal.markdown#numpy-interactionsjuly-8th-to-august-26th-7-weeks

Basically, I want to change the ufunc class to be aware of SciPy's sparse
matrices, so that when a ufunc is passed a sparse matrix as an argument, it
will dispatch to a function in the sparse matrix package, which will then
decide what to do. I just wanted to ping NumPy to make sure this is
reasonable, and I'm not totally off track. Suggestions, feedback,
and criticism welcome.

Thanks!