date:20140210

Re: [Numpy-discussion] deprecate numpy.matrix

2014-02-10 Thread Nathaniel Smith

On Sun, Feb 9, 2014 at 4:59 PM, alex argri...@ncsu.edu wrote:
 Hello list,

 I wrote this mini-nep for numpy but I've been advised it is more
 appropriate for discussion on the list.

 
 The ``numpy.matrix`` API provides a low barrier to using Python
 for linear algebra, just as the pre-3 Python ``input`` function
 and ``print`` statement provided low barriers to using Python for
 automatically evaluating input and for printing output.

 On the other hand, it really needs to be deprecated.
 Let's deprecate ``numpy.matrix``.
 

 I understand that numpy.matrix will not be deprecated any time soon,
 but I hope this will register as a vote to help nudge its deprecation
 closer to the realm of acceptable discussion.

To make this more productive, maybe it would be useful to elaborate on
what exactly we should do here.

I can't imagine we'll actually remove 'matrix' from the numpy
namespace at any point in the near future.

I do have the sense that when people choose to use it, they eventually
come to regret this choice. It's a bit buggy and has confusing
behaviours, and due to limitations of numpy's subclassing model, will
probably always be buggy and have confusing behaviours. And it's
marketed as being for new users, who are exactly the kind of users who
aren't sophisticated enough to recognize these dangers.

Maybe there should be a big warning to this effect in the np.matrix docstring?

Maybe using np.matrix should raise a DeprecationWarning?
(DeprecationWarning doesn't have to mean that something will be
disappearing -- e.g. the Python stdlib deprecates stuff all the time,
but never actually removes it. It's just a warning flag that there are
better options available.)
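
For illustration, a minimal sketch of a class that warns on use without being removed. The class name `LegacyMatrix` and the message text are hypothetical; this is not numpy's actual implementation.

```python
import warnings


class LegacyMatrix:
    """Hypothetical stand-in for a class that is discouraged but not removed."""

    def __init__(self, data):
        # Warn at construction time; stacklevel=2 attributes the warning
        # to the caller rather than to this constructor.
        warnings.warn(
            "LegacyMatrix is discouraged; consider using plain arrays instead",
            DeprecationWarning,
            stacklevel=2,
        )
        self.data = data


# DeprecationWarning is often filtered out by default, so record it explicitly.
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    m = LegacyMatrix([[1, 2], [3, 4]])

print(caught[0].category.__name__)
```

The object is still constructed and usable; the warning only flags that better options exist.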

Or what?

-- 
Nathaniel J. Smith
Postdoctoral researcher - Informatics - University of Edinburgh
http://vorpus.org
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion

Re: [Numpy-discussion] deprecate numpy.matrix

2014-02-10 Thread Alan G Isaac

On 2/9/2014 5:55 PM, alex wrote:
 I'm working on the same kinds of problems in scipy development
 (functions involving sparse matrices and abstract linear operators)


And how is numpy's matrix object getting in your way?
Your initial post simply treated the desirability of
deprecation as a given and did not lay out reasons.
A strong reason would be e.g. if the matrix object
is creating a serious maintenance headache.  Eliminating
this should be a big enough gain to offset any lost interest
in numpy from users of Matlab, GAUSS, IDL etc. from the
disappearance of a user-friendly notation.

I accept that a numpy matrix has some warts.  In the past,
I've proposed changes to address these.  E.g.,
https://www.mail-archive.com/numpy-discussion@scipy.org/msg06780.html
However these went nowhere, so effectively the status quo was
defended.  I can live with that.

A bit of the notational advantage of the `matrix` object was undercut
by the addition of the `dot` method to arrays. If `matrix` is deprecated,
I would hope that a matrix-power method would be added.  (One that works
correctly with boolean arrays and has a short name.)  Ideally an inverse
method would be added as well (with a short name).  I think adding the
Hermitian transpose as `.H()` already has some support, but I forget its current
status.

Right now, to give a simple example, students can write a simple projection
matrix as `X * (X.T * X).I * X.T` instead of 
`X.dot(la.inv(X.T.dot(X))).dot(X.T)`.
The advantage is obvious and even bigger with more complex expressions.
If we were to get `.I` for matrix inverse of an array (which I expect to be
vociferously resisted) it would be `X.dot(X.T.dot(X).I).dot(X.T)` which
at the moment I'm inclined to see as acceptable for teaching. (Not sure.)
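
For illustration, a small sketch comparing the two spellings of the projection matrix quoted above. The random test data is an assumption made for the example, and recent numpy versions may emit a PendingDeprecationWarning when `np.matrix` is constructed, which is silenced here.

```python
import warnings

import numpy as np
import numpy.linalg as la

rng = np.random.default_rng(0)
X = rng.standard_normal((6, 2))  # illustrative full-column-rank design matrix

# Array spelling, as quoted in the message.
P_array = X.dot(la.inv(X.T.dot(X))).dot(X.T)

# Matrix spelling; `*` is matrix multiplication and `.I` is the inverse.
with warnings.catch_warnings():
    warnings.simplefilter("ignore")
    Xm = np.matrix(X)
    P_matrix = Xm * (Xm.T * Xm).I * Xm.T

# Both spellings should agree up to floating-point error.
print(np.allclose(P_array, np.asarray(P_matrix)))
```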

Just to forestall the usual "just start them with arrays, eventually they'll
be grateful" reply, I would want to hear that suggestion only from someone
who has used it successfully with undergraduates in the social sciences.

Alan Isaac


Re: [Numpy-discussion] deprecate numpy.matrix

2014-02-10 Thread Alexander Belopolsky

On Sun, Feb 9, 2014 at 4:59 PM, alex argri...@ncsu.edu wrote:

 On the other hand, it really needs to be deprecated.


While numpy.matrix may have its problems, a NEP should list a better
rationale than the above to gain acceptance.

Personally, I decided not to use numpy.matrix in production code about 10
years ago and never looked back on that decision.  I've heard, however, that
some of the worst inheritance warts have been fixed over the years.  I also
resisted introducing inheritance  in the implementation of masked arrays,
but I lost that argument.  For better or worse, inheritance from ndarray is
here to stay and I would rather see numpy.matrix stay as a test-bed for
fixing inheritance issues rather than see it deprecated and have the same
issues pop up in ma or elsewhere.

Re: [Numpy-discussion] deprecate numpy.matrix

2014-02-10 Thread josef . pktd

On Mon, Feb 10, 2014 at 10:09 AM, Alan G Isaac alan.is...@gmail.com wrote:

 On 2/9/2014 5:55 PM, alex wrote:
  I'm working on the same kinds of problems in scipy development
  (functions involving sparse matrices and abstract linear operators)


 And how is numpy's matrix object getting in your way?
 Your initial post simply treated the desirability of
 deprecation as a given and did not lay out reasons.
 A strong reason would be e.g. if the matrix object
 is creating a serious maintenance headache.  Eliminating
 this should be a big enough gain to offset any lost interest
 in numpy from users of Matlab, GAUSS, IDL etc. from the
 disappearance of a user-friendly notation.

 I accept that a numpy matrix has some warts.  In the past,
 I've proposed changes to address these.  E.g.,
 https://www.mail-archive.com/numpy-discussion@scipy.org/msg06780.html
 However these went nowhere, so effectively the status quo was
 defended.  I can live with that.

 A bit of the notational advantage of the `matrix` object was undercut
 by the addition of the `dot` method to arrays. If `matrix` is deprecated,
 I would hope that a matrix-power method would be added.  (One that works
 correctly with boolean arrays and has a short name.)  I ideally an inverse
 method would be added as well (with a short name).  I think adding the
 hermitian transpose as `.H()` already has some support, but I forget its
 current
 status.

 Right now, to give a simple example, students can write a simple projection
 matrix as `X * (X.T * X).I * X.T` instead of
 `X.dot(la.inv(X.T.dot(X))).dot(X.T)`.


X.dot(la.pinv(X))

or even better assign pinv(X) to a name and reuse it.
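
A minimal sketch of the pinv suggestion, assuming X has full column rank (in which case pinv(X) equals inv(X.T X) X.T). The data and the names `coef` and `fitted` are illustrative only.

```python
import numpy as np
import numpy.linalg as la

rng = np.random.default_rng(1)
X = rng.standard_normal((8, 3))  # illustrative design matrix
y = rng.standard_normal(8)       # illustrative response

X_pinv = la.pinv(X)      # compute once, then reuse
P_pinv = X.dot(X_pinv)   # projection matrix via the pseudo-inverse
P_inv = X.dot(la.inv(X.T.dot(X))).dot(X.T)

coef = X_pinv.dot(y)     # least-squares coefficients from the same pinv
fitted = P_pinv.dot(y)   # fitted values via the projection matrix

print(np.allclose(P_pinv, P_inv), np.allclose(X.dot(coef), fitted))
```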

Josef
(I never taught statistics or econometrics to undergraduates in Social
Sciences.)
How do we calculate the diagonal of the hat matrix without using N by N
matrices?



 The advantage is obvious and even bigger with more complex expressions.
 If we were to get `.I` for matrix inverse of an array (which I expect to be
 vociferously resisted) it would be `X.dot(X.T.dot(X).I).dot(X.T)` which
 at the moment I'm inclined to see as acceptable for teaching. (Not sure.)

 Just to forestall the usual just start them with arrays, eventually
 they'll
 be grateful reply, I would want to hear that suggestion only from someone
 who has used it successfully with undergraduates in the social sciences.

 Alan Isaac



Re: [Numpy-discussion] deprecate numpy.matrix

2014-02-10 Thread Nathaniel Smith

On Mon, Feb 10, 2014 at 11:16 AM, Alexander Belopolsky ndar...@mac.com wrote:

 On Sun, Feb 9, 2014 at 4:59 PM, alex argri...@ncsu.edu wrote:

 On the other hand, it really needs to be deprecated.


 While numpy.matrix may have its problems, a NEP should list a better
 rationale than the above to gain acceptance.

 Personally, I decided not to use numpy.matrix in production code about 10
 years ago and never looked back to that decision.  I've heard however that
 some of the worst inheritance warts have been fixed over the years.  I also
 resisted introducing inheritance  in the implementation of masked arrays,
 but I lost that argument.  For better or worse, inheritance from ndarray is
 here to stay and I would rather see numpy.matrix stay as a test-bed for
 fixing inheritance issues rather than see it deprecated and have the same
 issues pop up in ma or elsewhere.

In practice, the existence of np.matrix doesn't seem to have any
effect on whether inheritance issues get fixed. And in the long run, I
think the goal is to move people away from inheriting from np.ndarray.
Really the only good reason to inherit from np.ndarray right now, is
if there's something you want to do that is impossible without using
inheritance. But we're working on fixing those issues (e.g.,
__numpy_ufunc__ in the next release). And AFAICT most of the remaining
issues with inheritance simply cannot be fixed, because the
requirements are ill-defined and contradictory.

-n

-- 
Nathaniel J. Smith
Postdoctoral researcher - Informatics - University of Edinburgh
http://vorpus.org

Re: [Numpy-discussion] deprecate numpy.matrix

2014-02-10 Thread Dinesh Vadhia

Scipy sparse uses matrices - I was under the impression that scipy sparse only 
works with matrices or have things moved on?


Re: [Numpy-discussion] deprecate numpy.matrix

2014-02-10 Thread alex

On Mon, Feb 10, 2014 at 11:27 AM,  josef.p...@gmail.com wrote:
 How do we calculate the diagonal of the hat matrix without using N by N
 matrices?

Not sure if this was a rhetorical question or not, but this seems to work:

    leverages = np.square(scipy.linalg.qr(X, mode='economic')[0]).sum(axis=1)

http://www4.ncsu.edu/~ipsen/ps/slides_CSE2013.pdf
Sorry for off-topic...
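
A sketch of the QR approach, checked against the explicit hat matrix. As an assumption for self-containment it uses `np.linalg.qr`, whose default reduced mode plays the role of scipy's mode='economic'; the data is made up for the example.

```python
import numpy as np

rng = np.random.default_rng(2)
X = rng.standard_normal((10, 3))  # illustrative N x k design matrix

# Reduced QR: Q is N x k, so nothing N x N is formed.
Q, _ = np.linalg.qr(X)
leverages = np.square(Q).sum(axis=1)

# Reference computation that does build the full N x N hat matrix,
# included only to check the result.
H = X.dot(np.linalg.inv(X.T.dot(X))).dot(X.T)

print(np.allclose(leverages, np.diag(H)))
```

The leverages sum to the number of columns of X when X has full column rank, which is a cheap sanity check.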

Re: [Numpy-discussion] deprecate numpy.matrix

2014-02-10 Thread Matthieu Brucher

Yes, but these will be scipy.sparse matrices, nothing to do with numpy
(dense) matrices.

Cheers,

Matthieu

2014-02-10 Dinesh Vadhia dineshbvad...@hotmail.com:
 Scipy sparse uses matrices - I was under the impression that scipy sparse
 only works with matrices or have things moved on?







-- 
Information System Engineer, Ph.D.
Blog: http://matt.eifelle.com
LinkedIn: http://www.linkedin.com/in/matthieubrucher
Music band: http://liliejay.com/

Re: [Numpy-discussion] deprecate numpy.matrix

2014-02-10 Thread Nathaniel Smith

On Mon, Feb 10, 2014 at 12:02 PM, Matthieu Brucher
matthieu.bruc...@gmail.com wrote:
 Yes, but these will be scipy.sparse matrices, nothing to do with numpy
 (dense) matrices.

Unfortunately when scipy.sparse matrices interact with dense ndarrays
(e.g., sparse matrix * dense vector), then you always get back
np.matrix objects instead of np.ndarray objects. So it's impossible to
avoid np.matrix entirely if using scipy.sparse.

-n

Re: [Numpy-discussion] deprecate numpy.matrix

2014-02-10 Thread alex

On Mon, Feb 10, 2014 at 12:16 PM, Nathaniel Smith n...@pobox.com wrote:
 On Mon, Feb 10, 2014 at 12:02 PM, Matthieu Brucher
 matthieu.bruc...@gmail.com wrote:
 Yes, but these will be scipy.sparse matrices, nothing to do with numpy
 (dense) matrices.

 Unfortunately when scipy.sparse matrices interact with dense ndarrays
 (e.g., sparse matrix * dense vector), then you always get back
 np.matrix objects


>>> csr_matrix([[1, 2], [3, 4]]) * array([5, 6])
array([17, 39])
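
A sketch that checks the returned types, assuming a SciPy version in which `csr_matrix` is the matrix-style sparse class. Behaviour may differ between SciPy releases: the product with a 1-D vector comes back as a plain ndarray, while `todense()` returns an `np.matrix`.

```python
import warnings

import numpy as np
from scipy.sparse import csr_matrix

A = csr_matrix([[1, 2], [3, 4]])
v = np.array([5, 6])

product = A * v          # sparse matrix times 1-D dense vector
as_array = A.toarray()   # dense conversion that yields an ndarray

# todense() yields np.matrix; silence any pending-deprecation warning.
with warnings.catch_warnings():
    warnings.simplefilter("ignore")
    dense = A.todense()

print(type(product).__name__, type(dense).__name__, type(as_array).__name__)
```

So np.matrix can be avoided for products like this one, but it still surfaces through other parts of the sparse API.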

Re: [Numpy-discussion] deprecate numpy.matrix

2014-02-10 Thread eat

On Mon, Feb 10, 2014 at 7:00 PM, alex argri...@ncsu.edu wrote:

 On Mon, Feb 10, 2014 at 11:27 AM,  josef.p...@gmail.com wrote:
  How do we calculate the diagonal of the hat matrix without using N by N
  matrices?

 Not sure if this was a rhetorical question or what, but this seems to work
 leverages = np.square(scipy.linalg.qr(X, mode='economic')[0]).sum(axis=1)
 http://www4.ncsu.edu/~ipsen/ps/slides_CSE2013.pdf

Rhetorical or not, FWIW I'd prefer to take the singular value
decomposition (u, s, vt = svd(x)) and then, based on the singular values
s, estimate a numerically feasible rank r. The diagonal of such a hat
matrix would then be (u[:, :r] ** 2).sum(1).


Regards,
-eat


 Sorry for off-topic...


Re: [Numpy-discussion] deprecate numpy.matrix

2014-02-10 Thread alex

On Mon, Feb 10, 2014 at 2:03 PM, eat e.antero.ta...@gmail.com wrote:
 Rhetorical or not, but FWIW I'll prefer to take singular value decomposition
 (u, s, vt= svd(x)) and then based on the singular values s I'll estimate a
 numerically feasible rank r. Thus the diagonal of such hat matrix would be
 (u[:, :r]** 2).sum(1).

It's a small detail but you probably want svd(x, full_matrices=False)
to avoid anything NxN.
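
A sketch of the SVD variant with full_matrices=False. The rank tolerance below is an illustrative choice, not one given in the thread, and the data is made up.

```python
import numpy as np

rng = np.random.default_rng(3)
x = rng.standard_normal((12, 4))  # illustrative N x k matrix

# Thin SVD: u is N x k instead of N x N.
u, s, vt = np.linalg.svd(x, full_matrices=False)

# Estimate a numerical rank from the singular values (assumed tolerance).
tol = s.max() * max(x.shape) * np.finfo(s.dtype).eps
r = int((s > tol).sum())

# Diagonal of the hat matrix from the first r left singular vectors.
leverages = (u[:, :r] ** 2).sum(1)

# Reference value from the explicit N x N hat matrix, for checking only.
H = x.dot(np.linalg.pinv(x))
print(u.shape, r, np.allclose(leverages, np.diag(H)))
```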

Re: [Numpy-discussion] deprecate numpy.matrix

2014-02-10 Thread eat

On Mon, Feb 10, 2014 at 9:08 PM, alex argri...@ncsu.edu wrote:

 On Mon, Feb 10, 2014 at 2:03 PM, eat e.antero.ta...@gmail.com wrote:
  Rhetorical or not, but FWIW I'll prefer to take singular value
 decomposition
  (u, s, vt= svd(x)) and then based on the singular values s I'll estimate
 a
  numerically feasible rank r. Thus the diagonal of such hat matrix
 would be
  (u[:, :r]** 2).sum(1).

 It's a small detail but you probably want svd(x, full_matrices=False)
 to avoid anything NxN.

Indeed.

Thanks,
-eat



Re: [Numpy-discussion] deprecate numpy.matrix

2014-02-10 Thread Matthew Brett

Hi,

On Mon, Feb 10, 2014 at 6:26 AM, Nathaniel Smith n...@pobox.com wrote:
 On Sun, Feb 9, 2014 at 4:59 PM, alex argri...@ncsu.edu wrote:
 Hello list,

 I wrote this mini-nep for numpy but I've been advised it is more
 appropriate for discussion on the list.

 
 The ``numpy.matrix`` API provides a low barrier to using Python
 for linear algebra, just as the pre-3 Python ``input`` function
 and ``print`` statement provided low barriers to using Python for
 automatically evaluating input and for printing output.

 On the other hand, it really needs to be deprecated.
 Let's deprecate ``numpy.matrix``.
 

 I understand that numpy.matrix will not be deprecated any time soon,
 but I hope this will register as a vote to help nudge its deprecation
 closer to the realm of acceptable discussion.

 To make this more productive, maybe it would be useful to elaborate on
 what exactly we should do here.

 I can't imagine we'll actually remove 'matrix' from the numpy
 namespace at any point in the near future.
{out of order paste}:
 Maybe there should be a big warning to this effect in the np.matrix docstring?

That seems reasonable to me.  Maybe, to avoid heat and fast changes
the NEP could lay out different options with advantages and
disadvantages.

 I do have the sense that when people choose to use it, they eventually
 come to regret this choice. It's a bit buggy and has confusing
 behaviours, and due to limitations of numpy's subclassing model, will
 probably always be buggy and have confusing behaviours. And it's
 marketed as being for new users, who are exactly the kind of users who
 aren't sophisticated enough to recognize these dangers.

This paragraph is a good summary of why the current situation of
np.matrix could cause harm.

It would be really useful to have some hard evidence of who's using it
though.  Are there projects that use np.matrix extensively?  If so,
maybe some code from these could serve as use-cases to see if (pseudo-)
deprecation is practical?

Alex - do you have time to lay this stuff out?  I bet the NEP would be
a good way of helping the discussion stay on track.  At the very least it
could be a reference point the next time this comes up.

Thanks for bringing this up,

Matthew

Re: [Numpy-discussion] deprecate numpy.matrix

2014-02-10 Thread josef . pktd

On Mon, Feb 10, 2014 at 2:12 PM, eat e.antero.ta...@gmail.com wrote:




 On Mon, Feb 10, 2014 at 9:08 PM, alex argri...@ncsu.edu wrote:

 On Mon, Feb 10, 2014 at 2:03 PM, eat e.antero.ta...@gmail.com wrote:
  Rhetorical or not, but FWIW I'll prefer to take singular value
 decomposition
  (u, s, vt= svd(x)) and then based on the singular values s I'll
 estimate a
  numerically feasible rank r. Thus the diagonal of such hat matrix
 would be
  (u[:, :r]** 2).sum(1).

 It's a small detail but you probably want svd(x, full_matrices=False)
 to avoid anything NxN.

 Indeed.


I meant the entire diagonal not the trace of the projection matrix.

My (not articulated) thought was that I use element wise multiplication
together with dot products instead of the three dot products, however
elementwise algebra is not very common in linear algebra based textbooks.
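
One possible reading of the elementwise idea, offered as an interpretation rather than the exact code the author had in mind: the i-th diagonal entry of the hat matrix is x_i' (X'X)^{-1} x_i, which can be computed for all rows with one elementwise product and a row sum.

```python
import numpy as np

rng = np.random.default_rng(4)
X = rng.standard_normal((9, 3))  # illustrative N x k design matrix

XtX_inv = np.linalg.inv(X.T.dot(X))

# Two dot products plus an elementwise product; no N x N matrix is formed.
hat_diag = (X.dot(XtX_inv) * X).sum(axis=1)

# Three-dot-product version that builds the full N x N matrix, for checking.
full_diag = np.diag(X.dot(XtX_inv).dot(X.T))

print(np.allclose(hat_diag, full_diag))
```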

The question is whether students and new users coming from `matrix`
languages can translate formulas into code, or just copy formulas to code.
(It took me a while to get used to numpy and take advantage of its
features coming from GAUSS and Matlab.)

OT since the presence or absence of matrix in numpy doesn't affect me.

Josef



 Thanks,
 -eat







Re: [Numpy-discussion] deprecate numpy.matrix

2014-02-10 Thread Matthew Brett

Hi,

On Mon, Feb 10, 2014 at 11:44 AM,  josef.p...@gmail.com wrote:


 On Mon, Feb 10, 2014 at 2:12 PM, eat e.antero.ta...@gmail.com wrote:




 On Mon, Feb 10, 2014 at 9:08 PM, alex argri...@ncsu.edu wrote:

 On Mon, Feb 10, 2014 at 2:03 PM, eat e.antero.ta...@gmail.com wrote:
  Rhetorical or not, but FWIW I'll prefer to take singular value
  decomposition
  (u, s, vt= svd(x)) and then based on the singular values s I'll
  estimate a
  numerically feasible rank r. Thus the diagonal of such hat matrix
  would be
  (u[:, :r]** 2).sum(1).

 It's a small detail but you probably want svd(x, full_matrices=False)
 to avoid anything NxN.

 Indeed.


 I meant the entire diagonal not the trace of the projection matrix.

 My (not articulated) thought was that I use element wise multiplication
 together with dot products instead of the three dot products, however
 elementwise algebra is not very common in linear algebra based textbooks.

 The question is whether students and new user coming from `matrix` languages
 can translate formulas into code, or just copy formulas to code.
 (It took me a while to get used to numpy and take advantage of it's features
 coming from GAUSS and Matlab.)

 OT since the precense or absence of matrix in numpy doesn't affect me.

Josef - as a data point - does statsmodels use np.matrix?

Cheers,

Matthew

Re: [Numpy-discussion] deprecate numpy.matrix

2014-02-10 Thread josef . pktd

On Mon, Feb 10, 2014 at 10:09 AM, Alan G Isaac alan.is...@gmail.com wrote:

 On 2/9/2014 5:55 PM, alex wrote:
  I'm working on the same kinds of problems in scipy development
  (functions involving sparse matrices and abstract linear operators)


 And how is numpy's matrix object getting in your way?
 Your initial post simply treated the desirability of
 deprecation as a given and did not lay out reasons.
 A strong reason would be e.g. if the matrix object
 is creating a serious maintenance headache.  Eliminating
 this should be a big enough gain to offset any lost interest
 in numpy from users of Matlab, GAUSS, IDL etc. from the
 disappearance of a user-friendly notation.

 I accept that a numpy matrix has some warts.  In the past,
 I've proposed changes to address these.  E.g.,
 https://www.mail-archive.com/numpy-discussion@scipy.org/msg06780.html
 However these went nowhere, so effectively the status quo was
 defended.  I can live with that.

 A bit of the notational advantage of the `matrix` object was undercut
 by the addition of the `dot` method to arrays.


Just another one that makes arrays nicer (although I'm on old versions and
don't use it yet):

the keepdims option for reduce operations, like mean.
Demean each row?
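
A short sketch of demeaning each row with keepdims, next to the equivalent spelling without it. The array contents are made up for the example.

```python
import numpy as np

x = np.arange(12, dtype=float).reshape(3, 4)

# keepdims=True keeps the reduced axis as length 1, so the result has
# shape (3, 1) and broadcasts against x directly.
row_means = x.mean(axis=1, keepdims=True)
demeaned = x - row_means

# Without keepdims the mean has shape (3,) and needs an explicit new axis.
demeaned_alt = x - x.mean(axis=1)[:, np.newaxis]

print(row_means.shape, np.array_equal(demeaned, demeaned_alt))
```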

Josef


 If `matrix` is deprecated,
 I would hope that a matrix-power method would be added.  (One that works
 correctly with boolean arrays and has a short name.)  I ideally an inverse
 method would be added as well (with a short name).  I think adding the
 hermitian transpose as `.H()` already has some support, but I forget its
 current
 status.

 Right now, to give a simple example, students can write a simple projection
 matrix as `X * (X.T * X).I * X.T` instead of
 `X.dot(la.inv(X.T.dot(X))).dot(X.T)`.
 The advantage is obvious and even bigger with more complex expressions.
 If we were to get `.I` for matrix inverse of an array (which I expect to be
 vociferously resisted) it would be `X.dot(X.T.dot(X).I).dot(X.T)` which
 at the moment I'm inclined to see as acceptable for teaching. (Not sure.)

 Just to forestall the usual just start them with arrays, eventually
 they'll
 be grateful reply, I would want to hear that suggestion only from someone
 who has used it successfully with undergraduates in the social sciences.

 Alan Isaac



Re: [Numpy-discussion] deprecate numpy.matrix

2014-02-10 Thread Skipper Seabold

On Mon, Feb 10, 2014 at 2:49 PM, Matthew Brett matthew.br...@gmail.com wrote:

 Hi,

 On Mon, Feb 10, 2014 at 11:44 AM,  josef.p...@gmail.com wrote:
 
 
  On Mon, Feb 10, 2014 at 2:12 PM, eat e.antero.ta...@gmail.com wrote:
 
 
 
 
  On Mon, Feb 10, 2014 at 9:08 PM, alex argri...@ncsu.edu wrote:
 
  On Mon, Feb 10, 2014 at 2:03 PM, eat e.antero.ta...@gmail.com wrote:
   Rhetorical or not, but FWIW I'll prefer to take singular value
   decomposition
   (u, s, vt= svd(x)) and then based on the singular values s I'll
   estimate a
   numerically feasible rank r. Thus the diagonal of such hat matrix
   would be
   (u[:, :r]** 2).sum(1).
 
  It's a small detail but you probably want svd(x, full_matrices=False)
  to avoid anything NxN.
 
  Indeed.
 
 
  I meant the entire diagonal not the trace of the projection matrix.
 
  My (not articulated) thought was that I use element wise multiplication
  together with dot products instead of the three dot products, however
  elementwise algebra is not very common in linear algebra based textbooks.
 
  The question is whether students and new user coming from `matrix`
 languages
  can translate formulas into code, or just copy formulas to code.
  (It took me a while to get used to numpy and take advantage of it's
 features
  coming from GAUSS and Matlab.)
 
  OT since the precense or absence of matrix in numpy doesn't affect me.

 Josef - as a data point - does statsmodels use np.matrix?


No.

Skipper

Re: [Numpy-discussion] deprecate numpy.matrix

2014-02-10 Thread josef . pktd

On Mon, Feb 10, 2014 at 2:49 PM, Matthew Brett matthew.br...@gmail.com wrote:

 Hi,

 On Mon, Feb 10, 2014 at 11:44 AM,  josef.p...@gmail.com wrote:
 
 
  On Mon, Feb 10, 2014 at 2:12 PM, eat e.antero.ta...@gmail.com wrote:
 
 
 
 
  On Mon, Feb 10, 2014 at 9:08 PM, alex argri...@ncsu.edu wrote:
 
  On Mon, Feb 10, 2014 at 2:03 PM, eat e.antero.ta...@gmail.com wrote:
   Rhetorical or not, but FWIW I'll prefer to take singular value
   decomposition
   (u, s, vt= svd(x)) and then based on the singular values s I'll
   estimate a
   numerically feasible rank r. Thus the diagonal of such hat matrix
   would be
   (u[:, :r]** 2).sum(1).
 
  It's a small detail but you probably want svd(x, full_matrices=False)
  to avoid anything NxN.
 
  Indeed.
 
 
  I meant the entire diagonal not the trace of the projection matrix.
 
  My (not articulated) thought was that I use element wise multiplication
  together with dot products instead of the three dot products, however
  elementwise algebra is not very common in linear algebra based textbooks.
 
  The question is whether students and new user coming from `matrix`
 languages
  can translate formulas into code, or just copy formulas to code.
  (It took me a while to get used to numpy and take advantage of it's
 features
  coming from GAUSS and Matlab.)
 
  OT since the precense or absence of matrix in numpy doesn't affect me.

 Josef - as a data point - does statsmodels use np.matrix?


No (*). It's too much work to pay attention to whether something is an
array or a matrix.
Although, we have a few sparse matrices,
and pandas.DataFrames have a few tricky corners in between array and matrix.

(*) grep finds two cases of `np.matrix` in the sandbox. (old unused code)

Josef



 Cheers,

 Matthew


Re: [Numpy-discussion] deprecate numpy.matrix

2014-02-10 Thread Matthew Brett

Hi,

On Mon, Feb 10, 2014 at 7:09 AM, Alan G Isaac alan.is...@gmail.com wrote:
[snip]
 Just to forestall the usual just start them with arrays, eventually they'll
 be grateful reply, I would want to hear that suggestion only from someone
 who has used it successfully with undergraduates in the social sciences.

I teach psychologists and neuroscientists mainly - you can get an idea
of the level I'm teaching at from the notebook I posted earlier in the
thread.

I can't speak to my success in any objective way, but I didn't hear
the students complain about the X.dot(Y).  This may be because

a) only some of them have much experience of or liking for matlab
b) some of them have the impression that Python is the way to go, and
they accept that this will mean some changes
c) not much of the code they see is of the form: X * (X.T * X).I * X.T.
In fact, the notebook I posted was the closest to that stuff.  In
any case I personally found it easier to show the ideas using sympy.

Cheers,

Matthew

Re: [Numpy-discussion] deprecate numpy.matrix

2014-02-10 Thread Alan G Isaac

On 2/10/2014 3:04 PM, Matthew Brett wrote:
 I teach psychologists and neuroscientists mainly


I must suspect that notebook was not for
**undergraduate** psychology students.
At least, not the ones I usually meet.

SymPy is great but for those without background
it is at best awkward.  It certainly does not
offer an equivalent to the notational convenience
of numpy's matrix object.


As far as I have been able to discern, the underlying
motivation for eliminating the matrix class is that
some developers want to stop supporting in any form
the subclassing of numpy arrays.  Do I have that right?

So the real question is not about numpy's matrix class,
but about whether subclassing will be supported.
(If I'm correctly reading the tea leaves.)

Alan Isaac


Re: [Numpy-discussion] deprecate numpy.matrix

2014-02-10 Thread Matthew Brett

Hi,

On Mon, Feb 10, 2014 at 12:23 PM, Alan G Isaac alan.is...@gmail.com wrote:
 On 2/10/2014 3:04 PM, Matthew Brett wrote:
 I teach psychologists and neuroscientists mainly

 I must suspect that notebook was not for
 **undergraduate** psychology students.
 At least, not the ones I usually meet.

Well - in this case a mix.  The class was this one:

practical-neuroimaging.github.com

I realize I'm not sure what you are teaching that is less complicated
than the notebook, but nevertheless has a reasonable amount of stuff
like X * (X.T * X).I * X.T ?  Have you got any teaching materials to
hand that would help us understand what you mean?

 SymPy is great but for those without background
 it is at best awkward.  It certainly does not
 offer an equivalent to the notational convenience
 of numpy's matrix object.


 As far as I have been able to discern, the underlying
 motivation for eliminating the matrix class is that
 some developers want to stop supporting in any form
 the subclassing of numpy arrays.  Do I have that right?

No I don't think so, and I believe that line would be distracting.

The question as I understand it, is very directly about whether the
benefit of the notational convenience of np.matrix might be outweighed
by the later cost of switching to np.array, and the confusion that
comes up when a new user has to choose between them.  But - I guess
this will be stuff that has to go into the NEP.

Cheers,

Matthew

Re: [Numpy-discussion] deprecate numpy.matrix

2014-02-10 Thread josef . pktd

On Mon, Feb 10, 2014 at 3:04 PM, Matthew Brett matthew.br...@gmail.com wrote:

 Hi,

 On Mon, Feb 10, 2014 at 7:09 AM, Alan G Isaac alan.is...@gmail.com
 wrote:
 [snip]
  Just to forestall the usual just start them with arrays, eventually
 they'll
  be grateful reply, I would want to hear that suggestion only from
 someone
  who has used it successfully with undergraduates in the social sciences.

 I teach psychologists and neuroscientists mainly - you can get an idea
 of the level I'm teaching at from the notebook I posted earlier in the
 thread.

 I can't speak to my success in any objective way, but I didn't hear
 the students complain about the X.dot(Y).  This may be because

 a) only some of them have much experience of or liking for matlab
 b) some of them have the impression that Python is the way to go, and
 they accept that this will mean some changes
 c) not much of the code they see is of the form: X * (X.T * X).I * X.T
 .  In fact, the notebook I posted was the closest to that stuff.  In
 any  case I personally found it easier show the ideas using sympy.


In support of Alan's view:

Linear models in econometrics are all linear algebra, and GAUSS is still
popular among econometricians because you can write a lot of code just like
in the paper. (GAUSS isn't as popular as it was some time ago, but
matlab is not much different.)

https://github.com/statsmodels/statsmodels/blob/master/statsmodels/sandbox/regression/gmm.py#L1194

statsmodels doesn't use masked arrays; structured dtypes and recarrays are
only used for input, and might be replaced by pandas.DataFrames,  pandas is
creeping into more core areas of statsmodels.

I'm not voting in favor of removing everything in numpy that I'm not using.

Josef



 Cheers,

 Matthew


Re: [Numpy-discussion] deprecate numpy.matrix

2014-02-10 Thread Charles R Harris

On Mon, Feb 10, 2014 at 1:23 PM, Alan G Isaac alan.is...@gmail.com wrote:

 On 2/10/2014 3:04 PM, Matthew Brett wrote:
  I teach psychologists and neuroscientists mainly


 I must suspect that notebook was not for
 **undergraduate** psychology students.
 At least, not the ones I usually meet.

 SymPy is great but for those without background
 it is at best awkward.  It certainly does not
 offer an equivalent to the notational convenience
 of numpy's matrix object.


 As far as I have been able to discern, the underlying
 motivation for eliminating the matrix class is that
 some developers want to stop supporting in any form
 the subclassing of numpy arrays.  Do I have that right?

 So the real question is not about numpy's matrix class,
 but about whether subclassing will be supported.
 (If I'm correctly reading the tea leaves.)


I don't see any reason to remove the Matrix object. It has its limitations,
I don't use it myself, but it costs little and I don't see the value of
forcing users to change.

As to subclassing ndarray, it is not recommended because it seldom saves
much work (see masked arrays), and can have side effects that are difficult
to deal with. The result of the latter is that numpy itself is called upon
to support array method overrides, of sum and mean for example. That makes
for a mess. That said, there is no movement to forbid subclassing ndarray,
but there will probably be more resistance to accommodating and fixing
problems arising from that design choice. At least that is my own feeling.

Chuck

Re: [Numpy-discussion] deprecate numpy.matrix

2014-02-10 Thread alex

On Mon, Feb 10, 2014 at 2:36 PM, Matthew Brett matthew.br...@gmail.com wrote:
 Hi,

 On Mon, Feb 10, 2014 at 6:26 AM, Nathaniel Smith n...@pobox.com wrote:
 On Sun, Feb 9, 2014 at 4:59 PM, alex argri...@ncsu.edu wrote:
 Hello list,

 I wrote this mini-nep for numpy but I've been advised it is more
 appropriate for discussion on the list.

 
 The ``numpy.matrix`` API provides a low barrier to using Python
 for linear algebra, just as the pre-3 Python ``input`` function
 and ``print`` statement provided low barriers to using Python for
 automatically evaluating input and for printing output.

 On the other hand, it really needs to be deprecated.
 Let's deprecate ``numpy.matrix``.
 

 I understand that numpy.matrix will not be deprecated any time soon,
 but I hope this will register as a vote to help nudge its deprecation
 closer to the realm of acceptable discussion.

 To make this more productive, maybe it would be useful to elaborate on
 what exactly we should do here.

 I can't imagine we'll actually remove 'matrix' from the numpy
 namespace at any point in the near future.
 {out of order paste}:
 Maybe there should be a big warning to this effect in the np.matrix 
 docstring?

 That seems reasonable to me.  Maybe, to avoid heat and fast changes
 the NEP could lay out different options with advantages and
 disadvantages.

 I do have the sense that when people choose to use it, they eventually
 come to regret this choice. It's a bit buggy and has confusing
 behaviours, and due to limitations of numpy's subclassing model, will
 probably always be buggy and have confusing behaviours. And it's
 marketed as being for new users, who are exactly the kind of users who
 aren't sophisticated enough to recognize these dangers.

 This paragraph is a good summary of why the current situation of
 np.matrix could cause harm.

 It would be really useful to have some hard evidence of who's using it
 though.  Are there projects that use np.matrix extensively?  If so,
 maybe some code from these could be use-cases to see if (pseudo-)
 deprecation is practical?

 Alex - do you have time to lay this stuff out?  I bet the NEP would be
 a good way of helping the discussion stay on track.  At the very least it
 could be a reference point the next time this comes up.

I don't think I have enough perspective to write a real NEP, but maybe
as a starting point we could begin a list somewhere, like on a wiki or
possibly in the numpy github repo, surveying an early 2014 snapshot of
the linear algebra APIs used by various Python projects.  For example
according to the responses in this thread, statsmodels seems to avoid
using numpy.matrix except possibly for interfacing with pandas, and at
least one professor relies on the numpy.matrix interface for classroom
teaching.  The list could include short quotes from people involved in
the projects, if they want to share an opinion.

It wouldn't be my intention to treat such a list as a vote, but rather
as data and as an excuse to make a list of cool projects; I suspect
that members of most projects would say "we don't use numpy.matrix but
we don't mind if other people use it" and that most teachers or
students who benefit from the gentler syntax of numpy.matrix would not
even be reached by such a survey.

Alex

Re: [Numpy-discussion] deprecate numpy.matrix

2014-02-10 Thread Matthew Brett

Hi,

On Mon, Feb 10, 2014 at 12:44 PM, Charles R Harris
charlesr.har...@gmail.com wrote:



 On Mon, Feb 10, 2014 at 1:23 PM, Alan G Isaac alan.is...@gmail.com wrote:

 On 2/10/2014 3:04 PM, Matthew Brett wrote:
  I teach psychologists and neuroscientists mainly


 I must suspect that notebook was not for
 **undergraduate** psychology students.
 At least, not the ones I usually meet.

 SymPy is great but for those without background
 it is at best awkward.  It certainly does not
 offer an equivalent to the notational convenience
 of numpy's matrix object.


 As far as I have been able to discern, the underlying
 motivation for eliminating the matrix class is that
 some developers want to stop supporting in any form
 the subclassing of numpy arrays.  Do I have that right?

 So the real question is not about numpy's matrix class,
 but about whether subclassing will be supported.
 (If I'm correctly reading the tea leaves.)


 I don't see any reason to remove the Matrix object. It has its limitations,
 I don't use it myself, but it costs little and I don't see the value of
 forcing users to change.

Maybe it would help to take 'remove the Matrix object' off the table
so we don't get side-tracked.  Does anyone disagree with the proposal
to take that off the table?

Cheers,

Matthew

Re: [Numpy-discussion] deprecate numpy.matrix

2014-02-10 Thread Matthew Brett

Hi,

On Mon, Feb 10, 2014 at 12:39 PM,  josef.p...@gmail.com wrote:


 On Mon, Feb 10, 2014 at 3:04 PM, Matthew Brett matthew.br...@gmail.com
 wrote:

 Hi,

 On Mon, Feb 10, 2014 at 7:09 AM, Alan G Isaac alan.is...@gmail.com
 wrote:
 [snip]
  Just to forestall the usual just start them with arrays, eventually
  they'll
  be grateful reply, I would want to hear that suggestion only from
  someone
  who has used it successfully with undergraduates in the social sciences.

 I teach psychologists and neuroscientists mainly - you can get an idea
 of the level I'm teaching at from the notebook I posted earlier in the
 thread.

 I can't speak to my success in any objective way, but I didn't hear
 the students complain about the X.dot(Y).  This may be because

 a) only some of them have much experience of or liking for matlab
 b) some of them have the impression that Python is the way to go, and
 they accept that this will mean some changes
 c) not much of the code they see is of the form: X * (X.T * X).I * X.T.
 In fact, the notebook I posted was the closest to that stuff.  In
 any case I personally found it easier to show the ideas using sympy.


 In support of Alan's view:

 Linear models in econometrics is all linear algebra, and GAUSS is still
 popular among econometricians because you can write a lot of code just like
 in the paper. (although GAUSS isn't as popular as it was some time ago, but
 matlab is not much different.)

 https://github.com/statsmodels/statsmodels/blob/master/statsmodels/sandbox/regression/gmm.py#L1194

Maybe it would be helpful to draw the distinction between

1) Teaching people to do numerical coding
2) Using code to demonstrate mathematical concepts

For 1) - it looks like people writing serious code don't generally use
np.matrix - but maybe we're missing some code-bases.
For 2) - I personally think sympy is better for this.

There might be some middle-ground (1.5) where the idea is to get
people comfortable with writing 10-50 line scripts to do linear
algebra-type things.   I guess these people will be particularly
difficult to persuade that it's a good idea to switch computer
languages.

Cheers,

Matthew

Re: [Numpy-discussion] deprecate numpy.matrix

2014-02-10 Thread josef . pktd

On Mon, Feb 10, 2014 at 3:45 PM, alex argri...@ncsu.edu wrote:

 On Mon, Feb 10, 2014 at 2:36 PM, Matthew Brett matthew.br...@gmail.com
 wrote:
  Hi,
 
  On Mon, Feb 10, 2014 at 6:26 AM, Nathaniel Smith n...@pobox.com wrote:
  On Sun, Feb 9, 2014 at 4:59 PM, alex argri...@ncsu.edu wrote:
  Hello list,
 
  I wrote this mini-nep for numpy but I've been advised it is more
  appropriate for discussion on the list.
 
  
  The ``numpy.matrix`` API provides a low barrier to using Python
  for linear algebra, just as the pre-3 Python ``input`` function
  and ``print`` statement provided low barriers to using Python for
  automatically evaluating input and for printing output.
 
  On the other hand, it really needs to be deprecated.
  Let's deprecate ``numpy.matrix``.
  
 
  I understand that numpy.matrix will not be deprecated any time soon,
  but I hope this will register as a vote to help nudge its deprecation
  closer to the realm of acceptable discussion.
 
  To make this more productive, maybe it would be useful to elaborate on
  what exactly we should do here.
 
  I can't imagine we'll actually remove 'matrix' from the numpy
  namespace at any point in the near future.
  {out of order paste}:
  Maybe there should be a big warning to this effect in the np.matrix
 docstring?
 
  That seems reasonable to me.  Maybe, to avoid heat and fast changes
  the NEP could lay out different options with advantages and
  disadvantages.
 
  I do have the sense that when people choose to use it, they eventually
  come to regret this choice. It's a bit buggy and has confusing
  behaviours, and due to limitations of numpy's subclassing model, will
  probably always be buggy and have confusing behaviours. And it's
  marketed as being for new users, who are exactly the kind of users who
  aren't sophisticated enough to recognize these dangers.
 
  This paragraph is a good summary of why the current situation of
  np.matrix could cause harm.
 
  It would be really useful to have some hard evidence of who's using it
  though.  Are there projects that use np.matrix extensively?  If so,
  maybe some code from these could be use-cases to see if (pseudo-)
  deprecation is practical?

  Alex - do you have time to lay this stuff out?  I bet the NEP would be
  a good way of helping the discussion stay on track.  At the very least it
  could be a reference point the next time this comes up.

 I don't think I have enough perspective to write a real NEP, but maybe
 as a starting point we could begin a list somewhere, like on a wiki or
 possibly in the numpy github repo, surveying an early 2014 snapshot of
 the linear algebra APIs used by various Python projects.  For example
 according to the responses in this thread, statsmodels seems to avoid
 using numpy.matrix except possibly for interfacing with pandas, and at
 least one professor relies on the numpy.matrix interface for classroom
 teaching.  The list could include short quotes from people involved in
 the projects, if they want to share an opinion.

 It wouldn't be my intention to treat such a list as a vote, but rather
 as data and as an excuse to make a list of cool projects; I suspect
 that members of most projects would say we don't use numpy.matrix but
 we don't mind if other people use it and that most teachers or
 students who benefit from the gentler syntax of numpy.matrix would not
 even be reached by such a survey.


My impression:

As long as there is no big maintenance cost (which there isn't), I don't
see any reason to remove matrix and to debate it every few years.
All the users that are participating or reading the mailing list have been
indoctrinated for years not to use matrix.

Alan is one of the few active proponents.

What about the silent hundred thousand users of numpy?  I have no idea what
they are doing.


stage 1 use loops
stage 2 use matrix
stage 3 use arrays

stage 1 and stage 2 are how undergraduate econometrics starts out.

Josef
Let's remove loops, users should vectorize.




 Alex


Re: [Numpy-discussion] deprecate numpy.matrix

2014-02-10 Thread josef . pktd

On Mon, Feb 10, 2014 at 3:39 PM, josef.p...@gmail.com wrote:



 On Mon, Feb 10, 2014 at 3:04 PM, Matthew Brett matthew.br...@gmail.com wrote:

 Hi,

 On Mon, Feb 10, 2014 at 7:09 AM, Alan G Isaac alan.is...@gmail.com
 wrote:
 [snip]
  Just to forestall the usual just start them with arrays, eventually
 they'll
  be grateful reply, I would want to hear that suggestion only from
 someone
  who has used it successfully with undergraduates in the social sciences.

 I teach psychologists and neuroscientists mainly - you can get an idea
 of the level I'm teaching at from the notebook I posted earlier in the
 thread.

 I can't speak to my success in any objective way, but I didn't hear
 the students complain about the X.dot(Y).  This may be because

 a) only some of them have much experience of or liking for matlab
 b) some of them have the impression that Python is the way to go, and
 they accept that this will mean some changes
 c) not much of the code they see is of the form: X * (X.T * X).I * X.T.
 In fact, the notebook I posted was the closest to that stuff.  In
 any case I personally found it easier to show the ideas using sympy.


 In support of Alan's view:

 Linear models in econometrics is all linear algebra, and GAUSS is still
 popular among econometricians because you can write a lot of code just like
 in the paper. (although GAUSS isn't as popular as it was some time ago, but
 matlab is not much different.)





 https://github.com/statsmodels/statsmodels/blob/master/statsmodels/sandbox/regression/gmm.py#L1194


I should have added this
http://en.wikipedia.org/wiki/Generalized_method_of_moments#Asymptotic_normality
covariance of the estimator, second equation line
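
To make the "just like in the paper" point concrete, here is a rough
sketch of that sandwich covariance, (G'WG)^-1 G'W Omega W'G (G'W'G)^-1 as
I read the Wikipedia line, written once with np.matrix and once with plain
arrays. G, W and Omega are made-up random inputs for illustration only, not
anything taken from statsmodels, and W is chosen symmetric so the two
outer inverses coincide.

```python
import warnings
import numpy as np

# np.matrix emits a PendingDeprecationWarning on newer numpy releases.
warnings.simplefilter("ignore", PendingDeprecationWarning)

rng = np.random.default_rng(0)
G = rng.normal(size=(5, 2))        # hypothetical Jacobian of the moments
S = rng.normal(size=(5, 5))
W = S.dot(S.T) + 5 * np.eye(5)     # symmetric positive definite weights
Omega = np.eye(5)                  # placeholder moment covariance

# np.matrix spelling: reads almost like the formula on paper.
Gm, Wm, Om = np.matrix(G), np.matrix(W), np.matrix(Omega)
cov_matrix = (Gm.T * Wm * Gm).I * Gm.T * Wm * Om * Wm.T * Gm * (Gm.T * Wm.T * Gm).I

# ndarray spelling: same arithmetic with explicit dot and inv calls.
# W is symmetric here, so one inverse serves for both outer factors.
bread = np.linalg.inv(G.T.dot(W).dot(G))
cov_array = bread.dot(G.T).dot(W).dot(Omega).dot(W.T).dot(G).dot(bread)
```

Both spellings give the same 2x2 result; the difference is only in how
close the code stays to the notation.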

Josef




 statsmodels doesn't use masked arrays; structured dtypes and recarrays are
 only used for input, and might be replaced by pandas.DataFrames,  pandas is
 creeping into more core areas of statsmodels.

 I'm not voting in favor of removing everything in numpy that I'm not using.

 Josef



 Cheers,

 Matthew




Re: [Numpy-discussion] deprecate numpy.matrix

2014-02-10 Thread Pauli Virtanen

10.02.2014 22:23, Alan G Isaac wrote:
[clip]
 As far as I have been able to discern, the underlying
 motivation for eliminating the matrix class is that
 some developers want to stop supporting in any form
 the subclassing of numpy arrays.  Do I have that right?

What sparked this discussion (on Github) is that it is not possible to
write duck-typed code that works correctly for:

- ndarrays
- matrices
- scipy.sparse sparse matrices

The semantics of all three are different; scipy.sparse is somewhere
between matrices and ndarrays with some things working randomly like
matrices and others not.

With some hyperbole added, one could say that from the developer point
of view, np.matrix is doing and has already done evil just by existing,
by messing up the unstated rules of ndarray semantics in Python.

-- 
Pauli Virtanen


Re: [Numpy-discussion] deprecate numpy.matrix

2014-02-10 Thread Matthew Brett

Hi,

On Mon, Feb 10, 2014 at 12:58 PM,  josef.p...@gmail.com wrote:


 On Mon, Feb 10, 2014 at 3:45 PM, alex argri...@ncsu.edu wrote:

 On Mon, Feb 10, 2014 at 2:36 PM, Matthew Brett matthew.br...@gmail.com
 wrote:
  Hi,
 
  On Mon, Feb 10, 2014 at 6:26 AM, Nathaniel Smith n...@pobox.com wrote:
  On Sun, Feb 9, 2014 at 4:59 PM, alex argri...@ncsu.edu wrote:
  Hello list,
 
  I wrote this mini-nep for numpy but I've been advised it is more
  appropriate for discussion on the list.
 
  
  The ``numpy.matrix`` API provides a low barrier to using Python
  for linear algebra, just as the pre-3 Python ``input`` function
  and ``print`` statement provided low barriers to using Python for
  automatically evaluating input and for printing output.
 
  On the other hand, it really needs to be deprecated.
  Let's deprecate ``numpy.matrix``.
  
 
  I understand that numpy.matrix will not be deprecated any time soon,
  but I hope this will register as a vote to help nudge its deprecation
  closer to the realm of acceptable discussion.
 
  To make this more productive, maybe it would be useful to elaborate on
  what exactly we should do here.
 
  I can't imagine we'll actually remove 'matrix' from the numpy
  namespace at any point in the near future.
  {out of order paste}:
  Maybe there should be a big warning to this effect in the np.matrix
  docstring?
 
  That seems reasonable to me.  Maybe, to avoid heat and fast changes
  the NEP could lay out different options with advantages and
  disadvantages.
 
  I do have the sense that when people choose to use it, they eventually
  come to regret this choice. It's a bit buggy and has confusing
  behaviours, and due to limitations of numpy's subclassing model, will
  probably always be buggy and have confusing behaviours. And it's
  marketed as being for new users, who are exactly the kind of users who
  aren't sophisticated enough to recognize these dangers.
 
  This paragraph is a good summary of why the current situation of
  np.matrix could cause harm.
 
  It would be really useful to have some hard evidence of who's using it
  though.  Are there projects that use np.matrix extensively?  If so,
  maybe some code from these could be use-cases to see if (pseudo-)
  deprecation is practical?

  Alex - do you have time to lay this stuff out?  I bet the NEP would be
  a good way of helping the discussion stay on track.  At the very least it
  could be a reference point the next time this comes up.

 I don't think I have enough perspective to write a real NEP, but maybe
 as a starting point we could begin a list somewhere, like on a wiki or
 possibly in the numpy github repo, surveying an early 2014 snapshot of
 the linear algebra APIs used by various Python projects.  For example
 according to the responses in this thread, statsmodels seems to avoid
 using numpy.matrix except possibly for interfacing with pandas, and at
 least one professor relies on the numpy.matrix interface for classroom
 teaching.  The list could include short quotes from people involved in
 the projects, if they want to share an opinion.

 It wouldn't be my intention to treat such a list as a vote, but rather
 as data and as an excuse to make a list of cool projects; I suspect
 that members of most projects would say we don't use numpy.matrix but
 we don't mind if other people use it and that most teachers or
 students who benefit from the gentler syntax of numpy.matrix would not
 even be reached by such a survey.


 My impression:

 As long as there is no big maintenance cost (which there isn't), I don't see
 any reason to remove matrix and to debate it every few years.

No, all agree I think - let's not remove it.

 All the users that are participating or reading the mailing list have been
 indoctrinated for years not to use matrix.

Here is the rub.  This discussion does come up - 'np.array or
np.matrix'.  It came up in a Software Carpentry boot camp I was
teaching on - and the instructors disagreed.

I think the active questions here are:

* Should we collect the discussion in coherent form somewhere?
* Should we add something to the np.matrix docstring and if so what?
* (Pauli's point): to what extent should we try to emulate the np.matrix API.

Best,

Matthew

Re: [Numpy-discussion] deprecate numpy.matrix

2014-02-10 Thread Alan G Isaac

On 2/10/2014 4:03 PM, Pauli Virtanen wrote:
 What sparked this discussion (on Github) is that it is not possible to
 write duck-typed code that works correctly for:

Do you mean one must start out with an 'asarray'?
Or more than that?

As I detailed in past discussion, the one thing
I really do not like about the `matrix` design
is that indexing always returns a matrix.
I speculate this is the primary problem you're running into?
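
A small sketch of the indexing behaviour meant here, assuming a numpy
that still ships np.matrix. The variable names are just for illustration.

```python
import warnings
import numpy as np

# np.matrix emits a PendingDeprecationWarning on newer numpy releases.
warnings.simplefilter("ignore", PendingDeprecationWarning)

A = np.arange(6).reshape(2, 3)
M = np.matrix(A)

row_a = A[0]          # 1-D array, shape (3,)
row_m = M[0]          # still a 2-D matrix, shape (1, 3)

elem = A[0][1]        # chained indexing on an array reaches an element
same_row = M[0][0]    # on a matrix it just returns the same 1x3 row again

# M[0][1] asks for a second row of a 1x3 matrix, so it fails.
try:
    M[0][1]
    chained_fails = False
except IndexError:
    chained_fails = True
```

So code that walks rows with A[i][j] works on arrays and breaks on
matrices, while M[i, j] works on both.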

Thanks,
Alan Isaac


Re: [Numpy-discussion] deprecate numpy.matrix

2014-02-10 Thread Alan G Isaac

On 2/10/2014 4:08 PM, Matthew Brett wrote:
 I think the active questions here are:
 * Should we collect the discussion in coherent form somewhere?
 * Should we add something to the np.matrix docstring and if so what?
 * (Pauli's point): to what extent should we try to emulate the np.matrix API.


Somewhat related to that last point:
could an array grow an `inv` method?
(Perhaps returning a pinv for ill conditioned cases.)

Here are the primary things that make matrices convenient
(particularly in a teaching setting):

*   (partly addressed when `dot` method added)
**  (could be partly addressed with an `mpow` method)
.I  (could be partly addressed with an `inv` method)
.H  (currently too controversial for ndarray)

Some might also add the behavior of indexing,
but I could only give qualified agreement to that.
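
For reference, a sketch of how the four conveniences above can be spelled
on a plain ndarray with functions that already exist in numpy. The `inv`
and `mpow` methods mentioned above are proposals and are not used here,
and nothing below falls back to pinv for ill-conditioned input.

```python
import warnings
import numpy as np

# np.matrix emits a PendingDeprecationWarning on newer numpy releases.
warnings.simplefilter("ignore", PendingDeprecationWarning)

A = np.array([[2.0, 1.0], [1.0, 3.0]])
M = np.matrix(A)

prod = A.dot(A)                        # matrix spelling: M * M
power = np.linalg.matrix_power(A, 3)   # matrix spelling: M ** 3
inverse = np.linalg.inv(A)             # matrix spelling: M.I
hermitian = A.conj().T                 # matrix spelling: M.H
```

The array versions are longer to type, which is the notational cost being
discussed, but they compute the same values.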

Alan Isaac

Re: [Numpy-discussion] deprecate numpy.matrix

2014-02-10 Thread Pauli Virtanen

10.02.2014 23:13, Alan G Isaac wrote:
 On 2/10/2014 4:03 PM, Pauli Virtanen wrote:
 What sparked this discussion (on Github) is that it is not
 possible to write duck-typed code that works correctly for:
 
 Do you mean one must start out with an 'asarray'? Or more than
 that?

Starting with asarray won't work: sparse matrices are not subclasses
of ndarray. Matrix-free linear operators are not such either.

In Python code, you very seldom coerce your inputs to a
specific type. The situation here is a bit as if there were two
different stream object types in Python, and their .write() methods
did completely different things, so that code doing I/O would need to
always be careful with which type of a stream was in question.

 As I detailed in past discussion, the one thing I really do not
 like about the `matrix` design is that indexing always returns a
 matrix. I speculate this is the primary problem you're running
 into?

The fact that reductions to 1D return 2D objects is also a problem,
and so is matrix multiplication vs. elementwise multiplication and
division.

-- 
Pauli Virtanen


Re: [Numpy-discussion] deprecate numpy.matrix

2014-02-10 Thread Alan G Isaac

On 2/10/2014 4:28 PM, Pauli Virtanen wrote:
 Starting with asarray won't work: sparse matrices are not subclasses
 of ndarray.


I was focused on the `matrix` object.
For this object, an initial asarray is all it takes to use array code.
(Or ... not?)  And it is a view, not a copy.
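
A quick check of the view claim, as a sketch assuming a numpy that still
provides np.matrix:

```python
import warnings
import numpy as np

# np.matrix emits a PendingDeprecationWarning on newer numpy releases.
warnings.simplefilter("ignore", PendingDeprecationWarning)

M = np.matrix([[1.0, 2.0], [3.0, 4.0]])
A = np.asarray(M)

is_plain_ndarray = type(A) is np.ndarray   # the matrix subclass is dropped
shares_buffer = np.shares_memory(A, M)     # no copy was made

A[0, 0] = 10.0            # writing through the view changes M as well

elementwise = A * A       # array semantics: elementwise product
matrix_product = M * M    # matrix semantics: linear algebra product
```

After the asarray call, `*` on A is elementwise again, so array code can
run on the data without copying it.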

I don't have the background to know how scipy ended up with
a sparse matrix object instead of a sparse array object.
In any case, it seems like a different question.

Alan Isaac


Re: [Numpy-discussion] deprecate numpy.matrix

2014-02-10 Thread alex

On Mon, Feb 10, 2014 at 3:47 PM, Matthew Brett matthew.br...@gmail.com wrote:
 Hi,

 On Mon, Feb 10, 2014 at 12:44 PM, Charles R Harris
 charlesr.har...@gmail.com wrote:



 On Mon, Feb 10, 2014 at 1:23 PM, Alan G Isaac alan.is...@gmail.com wrote:

 On 2/10/2014 3:04 PM, Matthew Brett wrote:
  I teach psychologists and neuroscientists mainly


 I must suspect that notebook was not for
 **undergraduate** psychology students.
 At least, not the ones I usually meet.

 SymPy is great but for those without background
 it is at best awkward.  It certainly does not
 offer an equivalent to the notational convenience
 of numpy's matrix object.


 As far as I have been able to discern, the underlying
 motivation for eliminating the matrix class is that
 some developers want to stop supporting in any form
 the subclassing of numpy arrays.  Do I have that right?

 So the real question is not about numpy's matrix class,
 but about whether subclassing will be supported.
 (If I'm correctly reading the tea leaves.)


 I don't see any reason to remove the Matrix object. It has its limitations,
 I don't use it myself, but it costs little and I don't see the value of
 forcing users to change.

 Maybe it would help to take 'remove the Matrix object' off the table
 so we don't get side-tracked.  Does anyone disagree with the proposal
 to take that off the table?

No I really want to remove it :)  If a non-frivolous NEP is written,
this can be a token extreme opinion to be immediately discounted as
not a practical solution.

Re: [Numpy-discussion] deprecate numpy.matrix

2014-02-10 Thread Alan G Isaac

On 2/10/2014 4:40 PM, alex wrote:
 I really want to remove it


Can you articulate the problem created by its existence
that leads you to this view?

Alan Isaac


Re: [Numpy-discussion] deprecate numpy.matrix

2014-02-10 Thread Charles R Harris

On Mon, Feb 10, 2014 at 2:28 PM, Pauli Virtanen p...@iki.fi wrote:

 10.02.2014 23:13, Alan G Isaac wrote:
  On 2/10/2014 4:03 PM, Pauli Virtanen wrote:
  What sparked this discussion (on Github) is that it is not
  possible to write duck-typed code that works correctly for:
 
  Do you mean one must start out with an 'asarray'? Or more than
  that?

 Starting with asarray won't work: sparse matrices are not subclasses
 of ndarray. Matrix-free linear operators are not such either.

 In Python code, you very seldom coerce your inputs to a
 specific type. The situation here is a bit as if there were two
 different stream object types in Python, and their .write() methods
 did completely different things, so that code doing I/O would need to
 always be careful with which type of a stream was in question.

  As I detailed in past discussion, the one thing I really do not
  like about the `matrix` design is that indexing always returns a
  matrix. I speculate this is the primary problem you're running
  into?

 The fact that reductions to 1D return 2D objects is also a problem,
 but the matrix multiplication vs. elementwise multiplication and
 division is also an issue.


Is there a need for every package in numpy/scipy to support matrices? I can
see leaving in the Matrix object for basic teaching/linear algebra, but
perhaps it would be reasonable for more advanced applications to forgo
support. That would fall into the class of not going out of the way to
accommodate subclasses of ndarray that override methods. I support that
approach in the long run because trying to keep all subclasses happy is
extra effort that could be better spent elsewhere.

Chuck

Re: [Numpy-discussion] deprecate numpy.matrix

2014-02-10 Thread Matthew Brett

Hi,

On Mon, Feb 10, 2014 at 1:52 PM, Charles R Harris
charlesr.har...@gmail.com wrote:



 On Mon, Feb 10, 2014 at 2:28 PM, Pauli Virtanen p...@iki.fi wrote:

 10.02.2014 23:13, Alan G Isaac wrote:
  On 2/10/2014 4:03 PM, Pauli Virtanen wrote:
  What sparked this discussion (on Github) is that it is not
  possible to write duck-typed code that works correctly for:
 
  Do you mean one must start out with an 'asarray'? Or more than
  that?

 Starting with asarray won't work: sparse matrices are not subclasses
 of ndarray. Matrix-free linear operators are not such either.

 In Python code, you very seldom coerce your inputs to a
 specific type. The situation here is a bit as if there were two
 different stream object types in Python, and their .write() methods
 did completely different things, so that code doing I/O would need to
 always be careful with which type of a stream was in question.

  As I detailed in past discussion, the one thing I really do not
  like about the `matrix` design is that indexing always returns a
  matrix. I speculate this is the primary problem you're running
  into?

 The fact that reductions to 1D return 2D objects is also a problem,
 but the matrix multiplication vs. elementwise multiplication and
 division is also an issue.


 Is there a need for every package in numpy/scipy to support matrices? I can
 see leaving in the Matrix object for basic teaching/linear algebra, but
 perhaps it would be reasonable for more advanced applications to forgo
 support. That would fall into the class of not going out of the way to
 accommodate subclasses of ndarray that override methods. I support that
 approach in the long run because trying to keep all subclasses happy is
 extra effort that could be better spent elsewhere.

Yes, I bet there is a solution in that direction that everyone could live with.

Alex - yes - I think it would be hugely useful to write up this
discussion as a wiki page or a NEP or a wiki page that might become a
NEP.  It seems to me there is a great deal of agreement here which
could fruitfully be recorded.

Cheers,

Matthew

Re: [Numpy-discussion] deprecate numpy.matrix

2014-02-10 Thread Pauli Virtanen

10.02.2014 23:40, Alan G Isaac kirjoitti:
 On 2/10/2014 4:28 PM, Pauli Virtanen wrote:
 Starting with asarray won't work: sparse matrices are not
 subclasses of ndarray.
 
 I was focused on the `matrix` object. For this object, an initial
 asarray is all it takes to use array code. (Or ... not?)  And it is
 a view, not a copy.
 
 I don't have the background to know how scipy ended up with a
 sparse matrix object instead of a sparse array object. In any case,
 it seems like a different question.

I think this is a very relevant question, and I believe it is one of the
main motivations for the continuous reappearance of this discussion.

The existence of np.matrix messes up the general agreement on ndarray
semantics in Python. The meaning of very basic code such as

A * B
A.sum(0)
A[0]

where A and B are NxN matrices of some sort now depends on the types
of A and B. This makes writing duck typed code impossible when both
semantics are in play.
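
A minimal sketch of the point above, assuming NumPy is installed and still
ships np.matrix. The helper name `describe` is made up for this
illustration. The same three expressions are run on an ndarray and on a
matrix holding the same numbers:

```python
import warnings

import numpy as np

# Recent NumPy releases warn when np.matrix is used; silence that for the demo.
warnings.simplefilter("ignore", PendingDeprecationWarning)


def describe(A, B):
    """Evaluate the three expressions from the text and record the results."""
    return {
        "product": (A * B).tolist(),   # elementwise or matrix product?
        "sum_shape": A.sum(0).shape,   # 1-D or 2-D result?
        "row_shape": A[0].shape,       # 1-D or 2-D result?
    }


arr = np.array([[1, 2], [3, 4]])
mat = np.matrix(arr)

as_array = describe(arr, arr)
as_matrix = describe(mat, mat)

print(as_array)
print(as_matrix)
```

The function body is identical for both calls; only the input type changes
what comes back.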

This is more a community and ecosystem question than one about
np.matrix and asarray().

I think the existence of np.matrix and its influence has set back the
development of a way to express generic linear algebra (dense, sparse,
matrix-free) algorithms in Python.

-- 
Pauli Virtanen


Re: [Numpy-discussion] deprecate numpy.matrix

2014-02-10 Thread alex

On Mon, Feb 10, 2014 at 4:42 PM, Alan G Isaac alan.is...@gmail.com wrote:
 On 2/10/2014 4:40 PM, alex wrote:
 I really want to remove it

 Can you articulate the problem created by its existence
 that leads you to this view?

In my opinion, Pauli has articulated these problems well in this thread.

Re: [Numpy-discussion] deprecate numpy.matrix

2014-02-10 Thread Chris Barker

On Mon, Feb 10, 2014 at 1:13 PM, Alan G Isaac alan.is...@gmail.com wrote:

 Do you mean one must start out with an 'asarray'?
  Or more than that?


maybe np.asanyarray()

It's nice, at least in principle, for duck-typed functions to return the
type they were handed. And this really is the primary issue with np.matrix
-- it takes some real effort to write code that preserves your objects as
the matrix type.

As I detailed in past discussion, the one thing
 I really do not like about the `matrix` design
 is that indexing always returns a matrix.


And that's the other one -- to be really nice and useful, I think we'd need
a row_vector and column_vector type, i.e., if you iterate through a matrix,
you get a bunch of row_vector instances -- not a bunch of 1xN matrixes.

But anyway -- there was a big ol' discussion about this a few years back --
my summary of that is:

1) not very many people use matrix
  1a) Those that do, often end up dropping it as their experience develops
  1b) It is a source of confusion -- some argue more confusion than it's
worth.
2) It might be more useful if it were substantially improved
  - some of the subclassing issues
  - vector types
  - ???
3) A number of people had some great ideas how to improve it.
4) Not a single person with both the skill set and the bandwidth to
actually do it has shown any interest for a long time.

Given (1) and (4) -- I can see that deprecation might seem to make sense.

However, I am perfectly willing to accept Alan's assurance that it's a
useful teaching tool in some cases as it is. Note that I would argue that
it's NOT for newbies, but rather, useful if you want to provide
a computational environment where matrixes make sense, and the point is to
teach and work with those concepts, rather than to learn numpy in the
broader sense.

If the goal is to teach numpy for general use, I don't think you should
introduce the matrix object.

-Chris


-- 

Christopher Barker, Ph.D.
Oceanographer

Emergency Response Division
NOAA/NOS/ORR(206) 526-6959   voice
7600 Sand Point Way NE   (206) 526-6329   fax
Seattle, WA  98115   (206) 526-6317   main reception

chris.bar...@noaa.gov

Re: [Numpy-discussion] deprecate numpy.matrix

2014-02-10 Thread Matthew Brett

Hi,

On Mon, Feb 10, 2014 at 2:11 PM, Pauli Virtanen p...@iki.fi wrote:
 10.02.2014 23:40, Alan G Isaac kirjoitti:
 On 2/10/2014 4:28 PM, Pauli Virtanen wrote:
 Starting with asarray won't work: sparse matrices are not
 subclasses of ndarray.

 I was focused on the `matrix` object. For this object, an initial
 asarray is all it takes to use array code. (Or ... not?)  And it is
 a view, not a copy.

 I don't have the background to know how scipy ended up with a
 sparse matrix object instead of a sparse array object. In any case,
 it seems like a different question.

 I think this is very relevant question, and I believe one of the main
 motivations for the continuous reappearance of this discussion.

 The existence of np.matrix messes up the general agreement on ndarray
 semantics in Python. The meaning of very basic code such as

 A * B
 A.sum(0)
 A[0]

 where A and B are NxN matrices of some sort now depends on the types
 of A and B. This makes writing duck typed code impossible when both
 semantics are in play.

That is a very convincing argument.

What would be the problems (apart from code compatibility) in making
scipy.sparse use the ndarray semantics?

Thanks,

Matthew

Re: [Numpy-discussion] deprecate numpy.matrix

2014-02-10 Thread Alan G Isaac

On 2/10/2014 5:11 PM, Pauli Virtanen wrote:
 The existence of np.matrix messes up the general agreement on ndarray
 semantics in Python. The meaning of very basic code such as

   A * B
   A.sum(0)
   A[0]

 where A and B are NxN matrices of some sort now depends on the types
 of A and B. This makes writing duck typed code impossible when both
 semantics are in play.


I'm just missing the point here; sorry.
Why isn't the right approach to require that
any object that wants to work with scipy
can be called  by `asarray` to guarantee
the core semantics? (And the matrix
object passes this test.)  For some objects
we can agree that `asarray` will coerce them.
(E.g., lists.)

I just do not see why scipy should care about
the semantics an object uses for interacting
with other objects of the same type.

Alan Isaac


Re: [Numpy-discussion] deprecate numpy.matrix

2014-02-10 Thread Pauli Virtanen

11.02.2014 00:17, Matthew Brett kirjoitti:
[clip]
 That is a very convincing argument.
 
 What would be the problems (apart from code compatibility) in making
 scipy.sparse use the ndarray semantics?

I'd estimate the effort it would take to convert scipy.sparse to ndarray
semantics is about a couple of afternoon hacks (normal, not
Ipython-size), so it should be doable.

Also, a shorthand for right-multiplication is probably necessary, as

A.T.dot(B.T).T

is unwieldy.
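
For reference, a small check of what that expression computes, assuming
"right-multiplication" means forming B times A while only calling A's own
dot method. The helper name `right_multiply` is made up here, and plain
ndarrays stand in for whatever type A would really be:

```python
import numpy as np


def right_multiply(A, B):
    """Return B @ A using only A's dot method: (A^T B^T)^T = B A."""
    return A.T.dot(B.T).T


A = np.array([[1.0, 2.0], [3.0, 4.0]])
B = np.array([[0.0, 1.0], [5.0, 6.0]])

right_product = right_multiply(A, B)
print(right_product)
print(np.allclose(right_product, B.dot(A)))
```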

As far as backward compatibility goes: change from * to .dot would break
everyone's code. I suspect the rest of the changes have smaller impacts.

The code breakage is such that I don't think it can be easily done by
changing the behavior of csr_matrix. I've previously proposed adding
csr_array et al., and deprecating csr_matrix et al. Not sure if the
*_matrix can ever be removed, but it would be useful to point new users
to use the interface with the ndarray convention.

-- 
Pauli Virtanen


Re: [Numpy-discussion] deprecate numpy.matrix

2014-02-10 Thread Matthew Brett

Hi,

On Mon, Feb 10, 2014 at 2:33 PM, Pauli Virtanen p...@iki.fi wrote:
 11.02.2014 00:17, Matthew Brett kirjoitti:
 [clip]
 That is a very convincing argument.

 What would be the problems (apart from code compatibility) in making
 scipy.sparse use the ndarray semantics?

 I'd estimate the effort it would take to convert scipy.sparse to ndarray
 semantics is about a couple of afternoon hacks (normal, not
 Ipython-size), so it should be doable.

 Also, a shorthand for right-multiplication is probably necessary, as

 A.T.dot(B.T).T

 is unwieldy.

 As far as backward compatibility goes: change from * to .dot would break
 everyone's code. I suspect the rest of the changes have smaller impacts.

 The code breakage is such that I don't think it can be easily done by
 changing the behavior of csr_matrix. I've previously proposed adding
 csr_array et al., and deprecating csr_matrix et al.. Not sure if the
 *_matrix can ever be removed, but it would be useful to point new users
 to use the interface with the ndarray convention.

Yes, that seems very sensible.

Then what about Chuck's suggestion - np.matrix stays but it is
effectively an independent project that other parts of numpy or scipy
are not required to support.   Scipy.sparse switches to the ndarray
semantics and future subclasses of ndarray should also use ndarray
semantics?

Matthew

Re: [Numpy-discussion] deprecate numpy.matrix

2014-02-10 Thread Pauli Virtanen

11.02.2014 00:31, Alan G Isaac kirjoitti:
 On 2/10/2014 5:11 PM, Pauli Virtanen wrote:
 The existence of np.matrix messes up the general agreement on ndarray
 semantics in Python. The meaning of very basic code such as

  A * B
  A.sum(0)
  A[0]

 where A and B are NxN matrices of some sort now depends on the types
 of A and B. This makes writing duck typed code impossible when both
 semantics are in play.

 I'm just missing the point here; sorry.
 Why isn't the right approach to require that
 any object that wants to work with scipy
 can be called  by `asarray` to guarantee
 the core semantics? (And the matrix
 object passes this test.)  For some objects
 we can agree that `asarray` will coerce them.
 (E.g., lists.)
 
 I just do not see why scipy should care about
 the semantics an object uses for interacting
 with other objects of the same type.

I have a couple of points:

(A)

asarray() coerces the input to a dense array. This you do not want to do
to sparse matrices or matrix-free linear operators, as many linear
algebra algorithms don't need to know the matrix entries.

(B)

Coercing input types is something that is seldom done in Python code,
since it breaks duck typing.

Usually, the interface is specified by assumed semantics of the input
objects. The user is then free to pass in mock objects that fulfill the
necessary subsection of the assumed interface.

(C)

This is not only about Scipy, but also a language design question:

Suppose someone, who is not a Python expert, wants to implement a
linear algebra algorithm in Python.

Will they write it using matrix or ndarray? (Note: np.matrix is not
uncommon on stackoverflow.)

Will someone who reads the code easily understand what it does (does *
stand for elementwise or matrix product etc)?

Can they easily make it work both with sparse and dense matrices?
Matrix-free operators? Does it work both for ndarray and np.matrix inputs?

(D)

The presence of np.matrix invites people to write code using the
np.matrix semantics. This can further lead to the code spitting out
dense results as np.matrix, and then it becomes difficult to follow
what sort of an object you have.

(E)

Some examples of the above semantics diaspora on scipy.sparse:

* Implementation of GMRES et al in Scipy. The implementation reinvents
  yet another set of semantics that it uses internally.

* scipy.sparse has mostly matrix semantics, but not completely, and the
  return values vary between matrix and ndarray
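
To make points (A) and (B) concrete, here is a rough sketch of a
duck-typed routine that only ever calls A.dot(x) and never looks at the
matrix entries. The names `ScaleOperator` and `power_iteration` are
invented for this illustration, and the operator is a toy stand-in for a
real matrix-free linear operator:

```python
import numpy as np


class ScaleOperator:
    """Toy matrix-free operator: acts like c * identity, stores no entries."""

    def __init__(self, n, c):
        self.shape = (n, n)
        self.c = c

    def dot(self, x):
        return self.c * np.asarray(x, dtype=float)


def power_iteration(A, x0, steps=50):
    """Estimate the dominant eigenvalue of A using nothing but A.dot(x)."""
    x = np.asarray(x0, dtype=float)
    for _ in range(steps):
        y = A.dot(x)
        x = y / np.linalg.norm(y)
    # Rayleigh quotient for the normalised vector x.
    return float(x.dot(A.dot(x)))


dense_estimate = power_iteration(np.diag([1.0, 3.0]), [1.0, 1.0])
operator_estimate = power_iteration(ScaleOperator(2, 3.0), [1.0, 1.0])
print(dense_estimate, operator_estimate)
```

Calling asarray() on the operator would defeat the purpose, since it has
no entries to densify; the routine works because it assumes an interface
rather than a type.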


-- 
Pauli Virtanen



Re: [Numpy-discussion] deprecate numpy.matrix

2014-02-10 Thread josef . pktd

On Mon, Feb 10, 2014 at 6:29 PM, Pauli Virtanen p...@iki.fi wrote:

 11.02.2014 00:31, Alan G Isaac kirjoitti:
  On 2/10/2014 5:11 PM, Pauli Virtanen wrote:
  The existence of np.matrix messes up the general agreement on ndarray
  semantics in Python. The meaning of very basic code such as
 
   A * B
   A.sum(0)
   A[0]
 
  where A and B are NxN matrices of some sort now depends on the types
  of A and B. This makes writing duck typed code impossible when both
  semantics are in play.
 
  I'm just missing the point here; sorry.
  Why isn't the right approach to require that
  any object that wants to work with scipy
  can be called  by `asarray` to guarantee
  the core semantics? (And the matrix
  object passes this test.)  For some objects
  we can agree that `asarray` will coerce them.
  (E.g., lists.)
 
  I just do not see why scipy should care about
  the semantics an object uses for interacting
  with other objects of the same type.

 I have a couple of points:

 (A)

 asarray() coerces the input to a dense array. This you do not want to do
 to sparse matrices or matrix-free linear operators, as many linear
 algebra algorithms don't need to know the matrix entries.

 (B)

 Coercing input types is something that is seldom done in Python code,
 since it breaks duck typing.

 Usually, the interface is specified by assumed semantics of the input
 objects. The user is then free to pass in mock objects that fulfill the
 necessary subsection of the assumed interface.


Almost all the code in scipy.stats and statsmodels starts with np.asarray.
The numpy doc standard has the term `array_like` to indicate things that
can be converted to a usable object by np.asarray.

ducktyping could be restricted to a very narrow category of ducks.
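
A minimal sketch of that pattern, assuming nothing beyond NumPy itself.
The function `mean_center` is a made-up example, not code taken from
scipy.stats or statsmodels:

```python
import numpy as np


def mean_center(x):
    """Subtract the column means from an array_like input."""
    x = np.asarray(x, dtype=float)  # lists, tuples and ndarrays all end up as ndarray
    return x - x.mean(axis=0)


centered = mean_center([[1, 2], [3, 4]])
print(centered)
```

After the first line of the function, everything below it can rely on
ndarray semantics, whatever the caller passed in.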

What about masked arrays and structured dtypes?
Because we cannot usefully convert them by asarray, we have to tell users
that they don't work with a function.
Our ducks that quack in the wrong way. ?

How do you handle list and other array_likes in sparse?

Josef



 (C)

 This is not only about Scipy, but also a language design question:

 Suppose someone, who is not a Python expert, wants to implement a
 linear algebra algorithm in Python.

 Will they write it using matrix or ndarray? (Note: np.matrix is not
 uncommon on stackoverflow.)

 Will someone who reads the code easily understand what it does (does *
 stand for elementwise or matrix product etc)?

 Can they easily make it work both with sparse and dense matrices?
 Matrix-free operators? Does it work both for ndarray and np.matrix inputs?

 (D)

 The presence of np.matrix invites people to write code using the
 np.matrix semantics. This can further lead to the code spitting out
 dense results as np.matrix, and then it becomes difficult to follow
 what sort of an object you have.

 (E)

 Some examples of the above semantics diaspora on scipy.sparse:

 * Implementation of GMRES et al in Scipy. The implementation reinvents
   yet another set of semantics that it uses internally.

 * scipy.sparse has mostly matrix semantics, but not completely, and the
   return values vary between matrix and ndarray


 --
 Pauli Virtanen




Re: [Numpy-discussion] deprecate numpy.matrix

2014-02-10 Thread josef . pktd

On Mon, Feb 10, 2014 at 6:39 PM, josef.p...@gmail.com wrote:



 On Mon, Feb 10, 2014 at 6:29 PM, Pauli Virtanen p...@iki.fi wrote:

 11.02.2014 00:31, Alan G Isaac kirjoitti:
  On 2/10/2014 5:11 PM, Pauli Virtanen wrote:
  The existence of np.matrix messes up the general agreement on ndarray
  semantics in Python. The meaning of very basic code such as
 
   A * B
   A.sum(0)
   A[0]
 
  where A and B are NxN matrices of some sort now depends on the types
  of A and B. This makes writing duck typed code impossible when both
  semantics are in play.
 
  I'm just missing the point here; sorry.
  Why isn't the right approach to require that
  any object that wants to work with scipy
  can be called  by `asarray` to guarantee
  the core semantics? (And the matrix
  object passes this test.)  For some objects
  we can agree that `asarray` will coerce them.
  (E.g., lists.)
 
  I just do not see why scipy should care about
  the semantics an object uses for interacting
  with other objects of the same type.

 I have a couple of points:

 (A)

 asarray() coerces the input to a dense array. This you do not want to do
 to sparse matrices or matrix-free linear operators, as many linear
 algebra algorithms don't need to know the matrix entries.

 (B)

 Coercing input types is something that is seldom done in Python code,
 since it breaks duck typing.

 Usually, the interface is specified by assumed semantics of the input
 objects. The user is then free to pass in mock objects that fulfill the
 necessary subsection of the assumed interface.


 Almost all the code in scipy.stats and statsmodels starts with np.asarray.
 The numpy doc standard has the term `array_like` to indicate things that
 can be converted to a usable object by np.asarray.

 ducktyping could be restricted to a very narrow category of ducks.


I thought once it would be nice to have a flag on the classes that indicate
`array_semantic` versus `matrix_semantic` so it would be easy to check the
quack instead of the duck.
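
Roughly what I had in mind, as a sketch only. The attributes
`array_semantic` and `matrix_semantic` do not exist in numpy or scipy; the
classes and the helper `semantics_of` are made up for illustration:

```python
class ElementwiseThing:
    # Hypothetical class-level flag: `*` means elementwise multiplication.
    array_semantic = True


class MatmulThing:
    # Hypothetical class-level flag: `*` means matrix multiplication.
    matrix_semantic = True


def semantics_of(obj):
    """Report which convention obj's class declares, or 'unknown'."""
    cls = type(obj)
    if getattr(cls, "array_semantic", False):
        return "array"
    if getattr(cls, "matrix_semantic", False):
        return "matrix"
    return "unknown"


print(semantics_of(ElementwiseThing()))
print(semantics_of(MatmulThing()))
print(semantics_of([1, 2, 3]))
```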

Josef




 What about masked arrays and structured dtypes?
 Because we cannot usefully convert them by asarray, we have to tell users
 that they don't work with a function.
 Our ducks that quack in the wrong way. ?

 How do you handle list and other array_likes in sparse?

 Josef



 (C)

 This is not only about Scipy, but also a language design question:

 Suppose someone, who is not a Python expert, wants to implement a
 linear algebra algorithm in Python.

 Will they write it using matrix or ndarray? (Note: np.matrix is not
 uncommon on stackoverflow.)

 Will someone who reads the code easily understand what it does (does *
 stand for elementwise or matrix product etc)?

 Can they easily make it work both with sparse and dense matrices?
 Matrix-free operators? Does it work both for ndarray and np.matrix inputs?

 (D)

 The presence of np.matrix invites people to write code using the
 np.matrix semantics. This can further lead to the code spitting out
 dense results as np.matrix, and then it becomes difficult to follow
 what sort of an object you have.

 (E)

 Some examples of the above semantics diaspora on scipy.sparse:

 * Implementation of GMRES et al in Scipy. The implementation reinvents
   yet another set of semantics that it uses internally.

 * scipy.sparse has mostly matrix semantics, but not completely, and the
   return values vary between matrix and ndarray


 --
 Pauli Virtanen






Re: [Numpy-discussion] deprecate numpy.matrix

2014-02-10 Thread Pauli Virtanen

11.02.2014 01:39, josef.p...@gmail.com kirjoitti:
[clip]
 Almost all the code in scipy.stats and statsmodels starts with np.asarray.
 The numpy doc standard has the term `array_like` to indicate things that
 can be converted to a usable object by np.asarray.
 
 ducktyping could be restricted to a very narrow category of ducks.

 What about masked arrays and structured dtypes?
 Because we cannot usefully convert them by asarray, we have to tell users
 that they don't work with a function.
 Our ducks that quack in the wrong way.?

The issue here is semantics for basic linear algebra operations, such as
matrix multiplication, that work for different matrix objects, including
ndarrays.

What is there now in scipy.sparse is influenced by np.matrix, and this
is proving to be sub-optimal, as it is incompatible with ndarrays.

 How do you handle list and other array_likes in sparse?

if isinstance(t, (list, tuple)): asarray(...)

Sure, np.matrix can be dealt with as an input too.
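
Spelled out a little more, as a sketch rather than the actual scipy.sparse
code (the helper name `as_operand` is made up): plain sequences are
coerced, and anything else is passed through untouched so that sparse
matrices and linear operators are never densified.

```python
import numpy as np


def as_operand(t):
    """Coerce plain Python sequences to ndarray; leave other objects alone."""
    if isinstance(t, (list, tuple)):
        return np.asarray(t)
    return t


coerced = as_operand([1, 2, 3])
sentinel = object()          # stands in for a sparse matrix or operator
untouched = as_operand(sentinel)
print(type(coerced).__name__, untouched is sentinel)
```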

But as said, I'm not arguing so much about asarray'in np.matrices as
input, but the fact that agreement on the meaning of * in linear
algebra code in Python is muddled. This should be fixed, and deprecating
np.matrix would point the way.

(I also suspect that this argument has been raised before, but as long
as there's no canonical write-up...)

-- 
Pauli Virtanen



Re: [Numpy-discussion] deprecate numpy.matrix

2014-02-10 Thread Charles R Harris

On Mon, Feb 10, 2014 at 5:39 PM, Pauli Virtanen p...@iki.fi wrote:

 11.02.2014 01:39, josef.p...@gmail.com kirjoitti:
 [clip]
  Almost all the code in scipy.stats and statsmodels starts with
 np.asarray.
  The numpy doc standard has the term `array_like` to indicate things that
  can be converted to a usable object by np.asarray.
 
  ducktyping could be restricted to a very narrow category of ducks.
 
  What about masked arrays and structured dtypes?
  Because we cannot usefully convert them by asarray, we have to tell users
  that they don't work with a function.
  Our ducks that quack in the wrong way.?

 The issue here is semantics for basic linear algebra operations, such as
 matrix multiplication, that work for different matrix objects, including
 ndarrays.

 What is there now in scipy.sparse is influenced by np.matrix, and this
 is proving to be sub-optimal, as it is incompatible with ndarrays.

  How do you handle list and other array_likes in sparse?

 if isinstance(t, (list, tuple)): asarray(...)

 Sure, np.matrix can be dealt with as an input too.

 But as said, I'm not arguing so much about asarray'in np.matrices as
 input, but the fact that agreement on the meaning of * in linear
 algebra code in Python is muddled. This should be fixed, and deprecating
 np.matrix would point the way.

 (I also suspect that this argument has been raised before, but as long
 as there's no canonical write-up...)


This would require deprecating current sparse as well, no?

I could be convinced to follow this route if there were a pedagogic version
of a matrix type that was restricted to linear algebra available as a
separate project. It could even have some improvements, row and column
vectors, inv, etc, but would not be as full featured as numpy arrays. The
idea is that it would serve for teaching matrices rather than numerical
programming in python.  Hopefully that would satisfy Alan's teaching use
case. There is the danger of students getting tied to that restricted
implementation, but that may not be something to worry about for the sort
of students Alan is talking about.

Chuck

Re: [Numpy-discussion] deprecate numpy.matrix

2014-02-10 Thread Charles R Harris

On Mon, Feb 10, 2014 at 6:11 PM, Charles R Harris charlesr.har...@gmail.com
 wrote:




 On Mon, Feb 10, 2014 at 5:39 PM, Pauli Virtanen p...@iki.fi wrote:

 11.02.2014 01:39, josef.p...@gmail.com kirjoitti:
 [clip]
  Almost all the code in scipy.stats and statsmodels starts with
 np.asarray.
  The numpy doc standard has the term `array_like` to indicate things that
  can be converted to a usable object by np.asarray.
 
  ducktyping could be restricted to a very narrow category of ducks.
 
  What about masked arrays and structured dtypes?
  Because we cannot usefully convert them by asarray, we have to tell
 users
  that they don't work with a function.
  Our ducks that quack in the wrong way.?

 The issue here is semantics for basic linear algebra operations, such as
 matrix multiplication, that work for different matrix objects, including
 ndarrays.

 What is there now in scipy.sparse is influenced by np.matrix, and this
 is proving to be sub-optimal, as it is incompatible with ndarrays.

  How do you handle list and other array_likes in sparse?

 if isinstance(t, (list, tuple)): asarray(...)

 Sure, np.matrix can be dealt with as an input too.

 But as said, I'm not arguing so much about asarray'in np.matrices as
 input, but the fact that agreement on the meaning of * in linear
 algebra code in Python is muddled. This should be fixed, and deprecating
 np.matrix would point the way.

 (I also suspect that this argument has been raised before, but as long
 as there's no canonical write-up...)


 This would require deprecating current sparse as well, no?

 I could be convinced to follow this route if there were a pedagogic
 version of a matrix type that was restricted to linear algebra available as
 a separate project. It could even have some improvements, row and column
 vectors, inv, etc, but would not be as full featured as numpy arrays. The
 idea is that it would serve for teaching matrices rather than numerical
 programming in python.  Hopefully that would satisfy Alan's teaching use
 case. There is the danger of students getting tied to that restricted
 implementation, but that may not be something to worry about for the sort
 of students Alan is talking about.


Another possibility is to provide an infix matrix multiplication operator.

from sage.misc.decorators import infix_operator
@infix_operator('multiply')
def dot(a, b):
    return a.dot_product(b)
u = vector([1, 2, 3])
v = vector([5, 4, 3])
print(u *dot* v)  # => 22
@infix_operator('or')
def plus(x, y):
    return x + y
print(2 |plus| 4)  # => 6
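
For anyone without Sage at hand, the trick can be imitated in plain
Python. This is only a sketch of the general idea, not Sage's
implementation; the class name `Infix` is made up, and only the `*`
spelling is supported:

```python
class Infix:
    """Wrap a two-argument function so it can be written as `a *op* b`."""

    def __init__(self, func, left=None):
        self.func = func
        self.left = left

    def __rmul__(self, left):
        # `a * op` runs first and remembers the left operand.
        return Infix(self.func, left)

    def __mul__(self, right):
        # `(a * op) * b` then applies the wrapped function.
        return self.func(self.left, right)


@Infix
def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


result = [1, 2, 3] *dot* [5, 4, 3]
print(result)
```

It reads tolerably, but the precedence is that of `*`, so mixing it with
other operators needs parentheses.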

Chuck

Re: [Numpy-discussion] deprecate numpy.matrix

2014-02-10 Thread Alan G Isaac

On 2/10/2014 7:39 PM, Pauli Virtanen wrote:
 The issue here is semantics for basic linear algebra operations, such as
 matrix multiplication, that work for different matrix objects, including
 ndarrays.


I'll see if I can restate my suggestion in another way,
because I do not feel you are responding to it.
(I might be wrong.)

What is a duck?  If you ask it to quack, it quacks.
OK, but what is it to quack?

Here, quacking is behaving like an ndarray (in your view,
as I understand it) when asked.  But how do we ask?
Your view (if I understand) is we ask via the operations
supported by ndarrays.  But maybe that is the wrong way
for the library to ask this question.

If so, then scipy libraries could ask an object
to behave like an ndarray by calling, e.g.,
__asarray__ on it. It becomes the responsibility
of the object to return something appropriate
when __asarray__ is called. Objects that know how to do
this will provide __asarray__ and respond
appropriately.  Other types can be coerced if
that is the documented behavior (e.g., lists).
The libraries will then be written for a single
type of behavior.  What it means to quack is
pretty easily documented, and a matrix object
already knows how (e.g., m.A).  Presumably in
this scenario __asarray__ would return an object
that behaves like an ndarray and a converter for
turning the final result into the desired object
type (e.g., into a `matrix` if necessary).
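
A rough sketch of what I mean. Note that `__asarray__` is the hypothetical
protocol proposed here, not something numpy defines, and `TinyMatrix` and
`column_sums` are made-up names:

```python
import numpy as np


class TinyMatrix:
    """Toy matrix-like type that opts in to the proposed protocol."""

    def __init__(self, rows):
        self._data = np.atleast_2d(np.array(rows, dtype=float))

    def __asarray__(self):
        # Hand back an ndarray plus a converter for the final result.
        return self._data, TinyMatrix


def column_sums(obj):
    """Library code written once, against ndarray behaviour only."""
    if hasattr(obj, "__asarray__"):
        arr, wrap = obj.__asarray__()
    else:
        # Documented fallback: coerce lists and the like, return ndarray.
        arr, wrap = np.asarray(obj, dtype=float), (lambda a: a)
    return wrap(arr.sum(axis=0))


from_matrix = column_sums(TinyMatrix([[1, 2], [3, 4]]))
from_list = column_sums([[1, 2], [3, 4]])
print(type(from_matrix).__name__, from_matrix._data.tolist())
print(type(from_list).__name__, from_list.tolist())
```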

Hope that is clearer, even if it proves a terrible idea.

Alan Isaac


[Numpy-discussion] Inheriting from ndarray Was: deprecate numpy.matrix

2014-02-10 Thread Alexander Belopolsky

On Mon, Feb 10, 2014 at 11:31 AM, Nathaniel Smith n...@pobox.com wrote:

 And in the long run, I
 think the goal is to move people away from inheriting from np.ndarray.


This is music to my ears, but what is the future of numpy.ma?  I understand
that numpy.oldnumeric.ma (the older version written without inheritance)
has been deprecated and slated to be removed in 1.9.  I also have seen some
attempts to bring ma functionality into the core ndarray object, but those
have not been successful as far as I can tell.

In general, what is the future of inheriting from np.ndarray?

Re: [Numpy-discussion] Inheriting from ndarray Was: deprecate numpy.matrix

2014-02-10 Thread Charles R Harris

On Mon, Feb 10, 2014 at 9:01 PM, Alexander Belopolsky ndar...@mac.com wrote:


 On Mon, Feb 10, 2014 at 11:31 AM, Nathaniel Smith n...@pobox.com wrote:

 And in the long run, I
 think the goal is to move people away from inheriting from np.ndarray.


 This is music to my ears, but what is the future of numpy.ma?  I
 understand that numpy.oldnumeric.ma (the older version written without
 inheritance) has been deprecated and slated to be removed in 1.9.  I also
 have seen some attempts to bring ma functionality into the core ndarray
 object, but those have not been successful as far as I can tell.


numpy.ma is pretty much unmaintained at the moment, but it is pretty stable
and there are no plans to remove it. I'm kinda sad that moving the
functionality into numpy came to naught, but the time was short and the
disagreements were long. Hopefully we learned something in the attempt. I
don't know of any plans for masked arrays at the moment apart from waiting
to see what happens with dynd. I don't know what the chances of an overhaul
might be, or even if it could be made without disturbing current code. I
think we would have to offer something special to motivate folks to even
think of switching.



 In general, what is the future of inheriting from np.ndarray?


Well, we can't do much about it except discourage it. It is often a bad
design decision that people get sucked into because they want to borrow
some functionality. OTOH, there hasn't been an easy way to make use of
ndarray functionality for non-subclasses, and there is a *lot* to implement
to make an ndarray-like object. Hopefully the new `__numpy_ufunc__`
attribute will make that easier.
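
As a sketch of the kind of thing this enables: in released NumPy versions
the hook is spelled `__array_ufunc__`, and the example below assumes that
spelling. The `Tagged` class is made up for illustration; it wraps an
ndarray without inheriting from it and handles only plain ufunc calls:

```python
import numpy as np


class Tagged:
    """Toy container: an ndarray plus a label, with no ndarray subclassing."""

    def __init__(self, values, tag):
        self.values = np.asarray(values)
        self.tag = tag

    def __array_ufunc__(self, ufunc, method, *inputs, **kwargs):
        if method != "__call__":
            return NotImplemented  # reductions etc. are out of scope here
        # Unwrap any Tagged inputs, run the ufunc, and re-wrap the result.
        raw = [x.values if isinstance(x, Tagged) else x for x in inputs]
        return Tagged(ufunc(*raw, **kwargs), self.tag)


t = Tagged([1.0, 4.0, 9.0], tag="metres")
root = np.sqrt(t)
total = np.add(t, 1.0)
print(type(root).__name__, root.values.tolist(), root.tag)
```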

If you have suggestions we'd like to hear them.

Chuck

[Numpy-discussion] record arrays with char*?

2014-02-10 Thread Christopher Jordan-Squire

I'm trying to wrap some C code using cython. The C code can take
inputs in two modes: dense inputs and sparse inputs. For dense inputs
the array indexing is naive. I have wrappers for that. In the sparse
case the matrix entries are typically indexed via names. So, for
example, the library documentation includes this as input you could
give:

struct
{
 char* ind;
 double val, wght;
} data[] = { {"camera", 15, 2}, {"necklace", 100, 20}, {"vase", 90, 20},
 {"pictures", 60, 30}, {"tv", 40, 40}, {"video", 15, 30}};

At the C level, data is passed to the function by directly giving its
address. (i.e. the C function takes as an argument (unsigned long)
data, casting the data pointer to an int)

I'd like to create something similar using record arrays, such as

np.array([('camera', 15, 2), ('necklace', 100, 20), ... ],
dtype='object,f8,f8').

Unfortunately this fails because
(1) In cython I need to determine the address of the first element and
I can't take the address of an input whose type I don't know (the
exact type will vary with the application, so more or fewer fields may
be in the C struct)
(2) I don't think a python object type is what I want--I need a char*
representation of the string. (Unfortunately I can't test this because
I haven't solved (1) -- how do you pass a record array around in
cython and/or take its address?)

Any suggestions?
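
For comparison, here is a sketch of the same layout built with the
standard-library ctypes module instead of a record array. It is not a
Cython solution and I have not tried it against the wrapped library; the
class name `Item` is made up, and the field list would have to be changed
per application:

```python
import ctypes


class Item(ctypes.Structure):
    # Mirrors: struct { char* ind; double val, wght; }
    _fields_ = [
        ("ind", ctypes.c_char_p),
        ("val", ctypes.c_double),
        ("wght", ctypes.c_double),
    ]


rows = [(b"camera", 15.0, 2.0), (b"necklace", 100.0, 20.0), (b"vase", 90.0, 20.0)]

# A contiguous C array of Item structs, like `data[]` in the C example.
ItemArray = Item * len(rows)
data = ItemArray(*(Item(name, val, wght) for name, val, wght in rows))

# Address of the first element, as a plain Python int.
address = ctypes.addressof(data)
print(len(data), data[1].ind, data[1].val, hex(address))
```

Here the strings really are char* pointers, which is what (2) asks for,
and the address is available without knowing the struct type at compile
time.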