date:20140211

Re: [Numpy-discussion] Overlapping time series

2014-02-11 Thread Sturla Molden

Daniele Nicolodi  wrote:

> Sure, I realize that, thank for the clarification.  The arrays are quite
> small, then the three loops and the temporary take negligible time and
> memory in the overall processing.

If they are small, a Python loop would do the job as well. And if it
doesn't, it is just a matter of adding a couple of decorators to use Numba
JIT. 

Sturla

___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion

Re: [Numpy-discussion] Overlapping time series

2014-02-11 Thread Daniele Nicolodi

On 11/02/2014 18:18, Sturla Molden wrote:
> Daniele Nicolodi  wrote:
> 
>> That's more or less my current approach (except that I use the fact that
>> the data is evenly samples to use np.where(np.diff(t1) != dt) to detect
>> the regions of continuous data, to avoid the loop.
> 
> I hope you realize that np.where(np.diff(t1) != dt) generates three loops,
> as well as two temporary arrays and one output array. If you do what I
> suggested, you get one loop and no temporaries. But you will need Numba or
> Cython to get full speed.

Sure, I realize that, thank for the clarification.  The arrays are quite
small, then the three loops and the temporary take negligible time and
memory in the overall processing.  I was lazy and I did not want to
write in Cython, however I didn't benchmark the exact solution you
propose written in pure python...

Cheers,
Daniele

___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion

Re: [Numpy-discussion] Geometrically defined masking arrays; how to optimize?

2014-02-11 Thread Daπid

A small improvement:

# Dimensions
N = 100
M = 200

# Coordinates of the centre
x0 = 40
y0 = 70

x, y = np.meshgrid(np.arange(N) - x0, np.arange(M) - y0, sparse=True)

# Center at 40, 70, radius 20
out_circle = (x * x + y * y < 20**2)

out_ring = out_circle - (x * x + y * y < 10**2)

plt.imshow(out_circle)
plt.figure()
plt.imshow(out_ring)
plt.show()

Using sparse you can avoid the repetition of the arrays, getting the same
functionality. Also, if the image is big, you can delegate the computations
of out_circle and out_ring to numexpr for speed.

/David.


On 11 February 2014 22:16, Daπid  wrote:

> Here it is an example:
>
> import numpy as np
> import pylab as plt
>
> N = 100
> M = 200
> x, y = np.meshgrid(np.arange(N), np.arange(M))
>
> # Center at 40, 70, radius 20
> x -= 40
> y -= 70
> out_circle = (x * x + y * y < 20**2)
>
> out_ring = out_circle - (x * x + y * y < 10**2)
>
> plt.imshow(out_circle)
> plt.figure()
> plt.imshow(out_ring)
> plt.show()
>
> Note that I have avoided taking the costly square root of each element by
> just taking the square of the radius. It can also be generalised to
> ellipses or rectangles, if you need it.
>
> Also, don't use 0 as a False value, and don't force it to be a 0. Instead,
> use "if not ring:"
>
>
> /David
>
>
>
>
>
> On 11 February 2014 21:56, Wolfgang Draxinger <
> wolfgang.draxin...@physik.uni-muenchen.de> wrote:
>
>> Hi,
>>
>> I implemented the following helper function to create masking arrays:
>>
>> def gen_mask(self, ring, sector):
>> in_segment = None
>> if ring == 0:
>> radius = self.scales()[0]
>> def in_center_circle(xy):
>> dx = xy[0] - self.xy[0]
>> dy = xy[1] - self.xy[1]
>> r = math.sqrt( dx**2 + dy**2 )
>> return r < radius
>> in_segment = in_center_circle
>> else:
>> angle = ( self.a_sector(sector, ring),
>>   self.a_sector( (sector+1) % self.n_sectors(),
>> ring) )
>> radii = self.scales()[ring:ring+1]
>> def in_segment_ring(xy):
>> dx = xy[0] - self.xy[0]
>> dy = xy[1] - self.xy[1]
>> r = math.sqrt( dx**2 + dy**2 )
>> a = math.atan2( dy, dx )
>> return r >= radii[0] and r < radii[1] \
>>and a >= angle[0] and a < angle[1]
>> in_segment = in_segment_ring
>>
>> width,height = self.arr.shape
>>
>> mask = numpy.zeros(shape=(width, height), dtype=numpy.bool)
>> for y in range(height):
>> for x in range(width):
>> mask[x][y] = in_segment((x,y))
>> return mask
>>
>> self.scales() generates a list of radius scaling factors.
>> self.a_sector() gives the dividing angle between sectors "sector" and
>> "sector+1" on the given ring.
>>
>> The function works, it generates the masks as I need it. The problem is
>> - of course - that it's quite slow due to the inner loops the perform
>> the geometrical test if the given element of the array self.arr is
>> withing the bounds of the ring-sector for which the mask is generated.
>>
>> I wonder if you guys have some ideas, on how I could accelerate this.
>> This can be perfectly well constructed by boolean combination of
>> elementary geometrical objects. For example a ring would be
>>
>> ring(p, r0, r1): disk(p, r1) xor disk(p, r0) # where r0 < r1
>>
>> The sector would then be
>>
>> ring_sector(p, r, s): ring(p, r, r + ...) and sector(p, s, s+1)
>>
>> I'd like to avoid doing this using some C helper routine. I'm looking
>> for something that is the most efficient when it comes to
>> "speedup"/"time invested to develop this".
>>
>>
>> Cheers,
>>
>> Wolfgang
>> ___
>> NumPy-Discussion mailing list
>> NumPy-Discussion@scipy.org
>> http://mail.scipy.org/mailman/listinfo/numpy-discussion
>>
>
>
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion

Re: [Numpy-discussion] Geometrically defined masking arrays; how to optimize?

2014-02-11 Thread Daπid

Here it is an example:

import numpy as np
import pylab as plt

N = 100
M = 200
x, y = np.meshgrid(np.arange(N), np.arange(M))

# Center at 40, 70, radius 20
x -= 40
y -= 70
out_circle = (x * x + y * y < 20**2)

out_ring = out_circle - (x * x + y * y < 10**2)

plt.imshow(out_circle)
plt.figure()
plt.imshow(out_ring)
plt.show()

Note that I have avoided taking the costly square root of each element by
just taking the square of the radius. It can also be generalised to
ellipses or rectangles, if you need it.

Also, don't use 0 as a False value, and don't force it to be a 0. Instead,
use "if not ring:"


/David





On 11 February 2014 21:56, Wolfgang Draxinger <
wolfgang.draxin...@physik.uni-muenchen.de> wrote:

> Hi,
>
> I implemented the following helper function to create masking arrays:
>
> def gen_mask(self, ring, sector):
> in_segment = None
> if ring == 0:
> radius = self.scales()[0]
> def in_center_circle(xy):
> dx = xy[0] - self.xy[0]
> dy = xy[1] - self.xy[1]
> r = math.sqrt( dx**2 + dy**2 )
> return r < radius
> in_segment = in_center_circle
> else:
> angle = ( self.a_sector(sector, ring),
>   self.a_sector( (sector+1) % self.n_sectors(),
> ring) )
> radii = self.scales()[ring:ring+1]
> def in_segment_ring(xy):
> dx = xy[0] - self.xy[0]
> dy = xy[1] - self.xy[1]
> r = math.sqrt( dx**2 + dy**2 )
> a = math.atan2( dy, dx )
> return r >= radii[0] and r < radii[1] \
>and a >= angle[0] and a < angle[1]
> in_segment = in_segment_ring
>
> width,height = self.arr.shape
>
> mask = numpy.zeros(shape=(width, height), dtype=numpy.bool)
> for y in range(height):
> for x in range(width):
> mask[x][y] = in_segment((x,y))
> return mask
>
> self.scales() generates a list of radius scaling factors.
> self.a_sector() gives the dividing angle between sectors "sector" and
> "sector+1" on the given ring.
>
> The function works, it generates the masks as I need it. The problem is
> - of course - that it's quite slow due to the inner loops the perform
> the geometrical test if the given element of the array self.arr is
> withing the bounds of the ring-sector for which the mask is generated.
>
> I wonder if you guys have some ideas, on how I could accelerate this.
> This can be perfectly well constructed by boolean combination of
> elementary geometrical objects. For example a ring would be
>
> ring(p, r0, r1): disk(p, r1) xor disk(p, r0) # where r0 < r1
>
> The sector would then be
>
> ring_sector(p, r, s): ring(p, r, r + ...) and sector(p, s, s+1)
>
> I'd like to avoid doing this using some C helper routine. I'm looking
> for something that is the most efficient when it comes to
> "speedup"/"time invested to develop this".
>
>
> Cheers,
>
> Wolfgang
> ___
> NumPy-Discussion mailing list
> NumPy-Discussion@scipy.org
> http://mail.scipy.org/mailman/listinfo/numpy-discussion
>
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion

[Numpy-discussion] Geometrically defined masking arrays; how to optimize?

2014-02-11 Thread Wolfgang Draxinger

Hi,

I implemented the following helper function to create masking arrays:

def gen_mask(self, ring, sector):
in_segment = None
if ring == 0:
radius = self.scales()[0]
def in_center_circle(xy):
dx = xy[0] - self.xy[0]
dy = xy[1] - self.xy[1]
r = math.sqrt( dx**2 + dy**2 )
return r < radius
in_segment = in_center_circle
else:
angle = ( self.a_sector(sector, ring),
  self.a_sector( (sector+1) % self.n_sectors(), ring) )
radii = self.scales()[ring:ring+1]
def in_segment_ring(xy):
dx = xy[0] - self.xy[0]
dy = xy[1] - self.xy[1]
r = math.sqrt( dx**2 + dy**2 )
a = math.atan2( dy, dx )
return r >= radii[0] and r < radii[1] \
   and a >= angle[0] and a < angle[1]
in_segment = in_segment_ring

width,height = self.arr.shape

mask = numpy.zeros(shape=(width, height), dtype=numpy.bool)
for y in range(height):
for x in range(width):
mask[x][y] = in_segment((x,y))
return mask

self.scales() generates a list of radius scaling factors.
self.a_sector() gives the dividing angle between sectors "sector" and 
"sector+1" on the given ring.

The function works, it generates the masks as I need it. The problem is 
- of course - that it's quite slow due to the inner loops the perform 
the geometrical test if the given element of the array self.arr is 
withing the bounds of the ring-sector for which the mask is generated.

I wonder if you guys have some ideas, on how I could accelerate this. 
This can be perfectly well constructed by boolean combination of 
elementary geometrical objects. For example a ring would be

ring(p, r0, r1): disk(p, r1) xor disk(p, r0) # where r0 < r1

The sector would then be

ring_sector(p, r, s): ring(p, r, r + ...) and sector(p, s, s+1)

I'd like to avoid doing this using some C helper routine. I'm looking 
for something that is the most efficient when it comes to 
"speedup"/"time invested to develop this".


Cheers,

Wolfgang
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion

Re: [Numpy-discussion] Suggestions for GSoC Projects

2014-02-11 Thread Pauli Virtanen

Hi,

08.02.2014 06:16, Stéfan van der Walt kirjoitti:
> On 8 Feb 2014 04:51, "Ralf Gommers"  wrote:
>>
>>>  Members of the dipy team would also be interested.
>>
>> That's specifically for the spherical harmonics topic right?
> 
> Right. Spherical harmonics are used as bases in many of DiPy's
> reconstruction algorithms.

If help is needed with a GSoC project for scipy.special, I'm in
principle available to chip in co-mentoring, or just trying to help
answer questions.

Best,
Pauli

___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion

Re: [Numpy-discussion] Suggestions for GSoC Projects

2014-02-11 Thread Pauli Virtanen

-BEGIN PGP SIGNED MESSAGE-
Hash: SHA1

Hi,

04.02.2014 20:30, jennifer stone kirjoitti:
> 3. As stated earlier, we have spherical harmonic functions (with
> much scope
>>> for dev) we are yet to have elliptical and cylindrical harmonic
>>> function, which may be developed.
>> 
>> This sounds very doable. How much work do you think would be
>> involved?
> 
> As Stefan so rightly pointed out, the function for spherical
> harmonic function, sph_harm at present calls lpmn thus evaluating
> all orders  gives me a feeling that it would be very well possible to avoid
> that by maybe avoiding the dependence on lpmn.
> 
> Further, we can introduce ellipsoidal harmonic functions of first
> kind and the second kind. I am confident about about the
> implementation of ellipsoidal H function of first kind but don't
> know much about the second kind. But I believe we can work it out
> in due course.And cylindrical harmonics can be carried out using
> Bessel functions.

It's not so often someone wants to work on scipy.special, so you'd be
welcome to improve it :)

The general structure of work on special functions goes as follows:

- - Check if there is a license-compatible implementation that someone
  has already written. This is usually not the case.

- - Find formulas for evaluating the function in terms of more primitive
  operations. (Ie. power series, asymptotic series, continued fractions,
  expansions in terms of other special functions, ...)

- - Determine the parameter region where the expansions converge
  in a floating point implementation, and select algorithms
  appropriately.

  Here it helps if you find a research paper where the author has
  already thought about what sort of an approach works best.

- - Life is usually made *much* easier thanks to Fredrik Johansson's
  prior work on arbitrary-precision arithmetic library mpmath

  http://code.google.com/p/mpmath/

  It can usually be used to check the "true" values of the functions.
  Also it contains implementations of algorithms for evaluating special
  functions, but because mpmath works with arbitrary precision numbers,
  these algorithms are not directly usable for floating-point
  calculations, as in floating point you cannot adjust the precision of
  the calculation dynamically.

  Moreover, the arbitrary-precision arithmetic can be slow compared
  to a more optimized floating point implementations.


Spherical harmonics might be a reasonable part of a GSoC proposal.
However, note that there exists also a *second* Legendre polynomial
function `lpmv`, which doesn't store the values of the previous N
functions. There's one numerical problem in the current way of
evaluation via ~Pmn(cos(theta)), which is that this approach seems to
lose relatively much precision at large orders for certain values of
theta. I don't recall now exactly how imprecise it becomes at large
orders, but it may be necessary to check.


Adding new special functions also sounds like an useful project. Here,
it helps if they are something that you expect you will need later on :)


There's also the case that several of the functions in Scipy have only
implementations for real-valued inputs, although the functions would
be defined on the whole complex plane. A list of the situation is here:

https://github.com/scipy/scipy/blob/master/scipy/special
/generate_ufuncs.py#L85

Lowercase d correspond to real-valued implementations, uppercase D to
complex-valued. I'm not at the moment completely sure which would have
the highest priority --- whether you need this or not really depends
on the application.


If you want additional ideas about possible things to fix in
scipy.special, take a look at this file:

https://github.com/scipy/scipy/blob/master/scipy/special/tests
/test_mpmath.py#L648

The entries marked @knownfailure* have some undiagnosed issues in the
implementation, which might be useful to look into. However: most of
these have to do with corner cases in hypergeometric functions. Trying
to address those is likely a risky GSoC topic, as the multi-argument
hyp* functions are challenging to evaluate in floating point. (mpmath
and Mathematica can evaluate them in most parameter regimes, but AFAIK
both require arbitrary-precision methods for this.)


So I think there would be a large number of possible things to do
here, and help would be appreciated.

- -- 
Pauli Virtanen
-BEGIN PGP SIGNATURE-
Version: GnuPG v1.4.14 (GNU/Linux)

iEYEARECAAYFAlL6iwAACgkQ6BQxb7O0pWBfOgCfYHAB12N4FWDmrqx8/ORTBRps
pXYAoL3ufAiShe+0qTEGfEvrmDgr1X0p
=kAwF
-END PGP SIGNATURE-

___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion

Re: [Numpy-discussion] deprecate numpy.matrix

2014-02-11 Thread Matthew Brett

Hi,

On Tue, Feb 11, 2014 at 11:40 AM, Pauli Virtanen  wrote:
> 11.02.2014 21:20, alex kirjoitti:
> [clip]
>> In the spirit of offsetting this bias and because this thread is
>> lacking in examples of projects that use numpy.matrix, here's another
>> data point: cvxpy (https://github.com/cvxgrp/cvxpy) is a serious
>> active project that supports the numpy.matrix interface, for example
>> as in 
>> https://github.com/cvxgrp/cvxpy/tree/master/cvxpy/interface/numpy_interface.
>
> Here's some more data:
>
> http://nullege.com/codes/search?cq=numpy.matrix

These are quite revealing.  Some of the top hits are from old nipy
code.  We don't have any use of np.matrix in the current nipy trunk.

Another couple of hits are from the 'Pylon' package:
https://github.com/rwl/pylon - of this form:

scipy.io.mmwrite("./data/fDC.mtx", numpy.matrix(f_dc))

When I ran the code without the numpy.matrix call it was easy to see
how that happened:

In [4]: scipy.io.mmwrite('fDc.mtx', f_dc)
---
ValueErrorTraceback (most recent call last)
...
ValueError: expected matrix

Of course, what the message means is that you should pass a 2D array:

scipy.io.mmwrite('fDc.mtx', f_dc[None])

works fine.  So I think this is another example of confusion caused by
np.matrix.

https://github.com/scipy/scipy/pull/3310

Matthew
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion

Re: [Numpy-discussion] deprecate numpy.matrix

2014-02-11 Thread Pauli Virtanen

11.02.2014 21:20, alex kirjoitti:
[clip]
> In the spirit of offsetting this bias and because this thread is
> lacking in examples of projects that use numpy.matrix, here's another
> data point: cvxpy (https://github.com/cvxgrp/cvxpy) is a serious
> active project that supports the numpy.matrix interface, for example
> as in 
> https://github.com/cvxgrp/cvxpy/tree/master/cvxpy/interface/numpy_interface.

Here's some more data:

http://nullege.com/codes/search?cq=numpy.matrix

http://nullege.com/codes/search?cq=numpy.array

-- 
Pauli Virtanen

___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion

Re: [Numpy-discussion] deprecate numpy.matrix

2014-02-11 Thread alex

On Tue, Feb 11, 2014 at 12:05 PM, Matthew Brett  wrote:
> Hi,
>
> On Tue, Feb 11, 2014 at 8:55 AM,   wrote:
>>
>>
>> On Tue, Feb 11, 2014 at 11:25 AM, Matthew Brett 
>> wrote:
>>>
>>> Hi,
>>>
>>> On Tue, Feb 11, 2014 at 4:16 AM, Jacco Hoekstra - LR
>>>  wrote:
>>> > For our students, the matrix class is really appealing as we use a lot
>>> > of linear algebra and expressions with matrices simply look better with an
>>> > operator instead of a function:
>>> >
>>> > x=A.I*b
>>> >
>>> > looks much better than
>>> >
>>> > x = np.dot(np.linalg.inv(A),b)
>>>
>>> Yes, but:
>>>
>>> 1)  as Alan has mentioned, the dot method helps a lot.
>>>
>>> import numpy.linalg as npl
>>>
>>> x = npl.inv(A).dot(b)
>>>
>>> 2) Overloading the * operator means that you've lost * to do
>>> element-wise operations.  MATLAB has a different operator for that,
>>> '.*' - and it's very easy to forget the dot.  numpy makes this more
>>> explicit - you read 'dot' as 'dot'.
>>>
>>> > And this gets worse when the expression is longer:
>>> >
>>> > x = R.I*A*R*b
>>> >
>>> > becomes:
>>> >
>>> > x = np.dot( np.linalg.inv(R), np.dot(A, np.dot(R, b)))
>>>
>>> x = npl.inv(R).dot(A.dot(R.dot(b))
>>>
>>> > Actually, not being involved in earlier discussions on this topic, I was
>>> > a bit surprised by this and do not see the problem of having the matrix
>>> > class as nobody is obliged to use it. I tried to find the reasons, but did
>>> > not find it in the thread mentioned. Maybe someone could summarize the 
>>> > main
>>> > problem with keeping this class for newbies on this list like me?
>>> >
>>> > Anyway, I would say that there is a clear use for the matrix class:
>>> > readability of linear algebra code and hence a lower chance of errors, so
>>> > higher productivity.
>>>
>>> Yes, but it looks like there are not any developers on this list who
>>> write substantial code with the np.matrix class, so, if there is a
>>> gain in productivity, it is short-lived, soon to be replaced by a
>>> cost.
>>
>>
>> selection bias !
>>
>> I have seen lots of Matlab and GAUSS code written by academic
>> econometricians that have been programming for years but are not
>> "developers", code that is "inefficient" and numerically not very stable but
>> looks just like the formulas.
>
> Yes, I used to use matlab myself.
>
> There is certainly biased sampling on this list, so it's very
> difficult to say whether there is a large constituency of np.matrix
> users out there, it's possible.  I hope not, because I think they
> would honestly be better served with ndarray even if some of the lines
> in their script don't look quite as neat.

In the spirit of offsetting this bias and because this thread is
lacking in examples of projects that use numpy.matrix, here's another
data point: cvxpy (https://github.com/cvxgrp/cvxpy) is a serious
active project that supports the numpy.matrix interface, for example
as in 
https://github.com/cvxgrp/cvxpy/tree/master/cvxpy/interface/numpy_interface.

This project is from a somewhat famous Stanford lab
(http://www.stanford.edu/~boyd/index.html) and they are currently
running a MOOC (http://en.wikipedia.org/wiki/Massive_open_online_course)
for convex optimization
(https://class.stanford.edu/courses/Engineering/CVX101/Winter2014/about).
 This course currently uses a MATLAB-based modeling system
(http://cvxr.com/cvx/) but they are trying to switch to, or at least
support, Python.  But they have not yet been able to get cvxpy to a
mature enough state to use for their course.  Maybe the simple
numpy.matrix syntax has accelerated their progress so that they have
been able to reach cvxpy's currently somewhat usable state, or maybe
the extra work to deal with numpy.matrix has slowed their progress so
that we are using MATLAB instead of Python as the standard for
training Stanford undergrads and random internet MOOC participants, or
maybe their progress has little to do with numpy.matrix.  I'm not sure
which is the case.  But in any case, they use numpy.matrix explicitly
in their project.

Alex
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion

Re: [Numpy-discussion] deprecate numpy.matrix

2014-02-11 Thread alex

On Tue, Feb 11, 2014 at 12:14 PM, Chris Barker  wrote:
> On Tue, Feb 11, 2014 at 8:25 AM, Matthew Brett 
> wrote:
>>
>> > Anyway, I would say that there is a clear use for the matrix class:
>> > readability of linear algebra code and hence a lower chance of errors, so
>> > higher productivity.
>>
>> Yes, but it looks like there are not any developers on this list who
>> write substantial code with the np.matrix class, so, if there is a
>> gain in productivity, it is short-lived, soon to be replaced by a
>> cost.
>
>
> to re-iterate:
>
> matrix is NOT for newbies, nor is it for higher productivity or fewer errors
> in production code -- the truth is, the ratio of linear algebra expressions
> like the above to element-wise, etc. operations that ndarray is well suited
> to is tiny  in "real" code. Did anyone that used MATLAB for significant
> problems get not get really really annoyed by all the typing of ".*" ?
>
> What matrix is good for is what someone here described as "domain specific
> language" -- i.e. that little bit of code that really is doing mostly linear
> algebra.

This point would suggest that the "domain specific language" defined
by the numpy.matrix semantics would be particularly useful for
representations of linear operators for which elementwise modification
might be less efficient (for example as in some implementations of
sparse matrices) or essentially unavailable (for example as in
matrix-free linear operators).

Alex
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion

Re: [Numpy-discussion] deprecate numpy.matrix

2014-02-11 Thread josef . pktd

On Tue, Feb 11, 2014 at 12:14 PM, Chris Barker wrote:

> On Tue, Feb 11, 2014 at 8:25 AM, Matthew Brett wrote:
>
>> > Anyway, I would say that there is a clear use for the matrix class:
>> readability of linear algebra code and hence a lower chance of errors, so
>> higher productivity.
>>
>> Yes, but it looks like there are not any developers on this list who
>> write substantial code with the np.matrix class, so, if there is a
>> gain in productivity, it is short-lived, soon to be replaced by a
>> cost.
>>
>
> to re-iterate:
>
> matrix is NOT for newbies, nor is it for higher productivity or fewer
> errors in production code -- the truth is, the ratio of linear algebra
> expressions like the above to element-wise, etc. operations that ndarray is
> well suited to is tiny  in "real" code. Did anyone that used MATLAB for
> significant problems get not get really really annoyed by all the typing of
> ".*" ?
>
> What matrix is good for is what someone here described as "domain specific
> language" -- i.e. that little bit of code that really is
> doing mostly linera algebra. So it's a nice tool for teaching and
> experimenting with linear-algebra-based concepts.
>

http://www.mathworks.com/matlabcentral/fileexchange/27095-tsls-2sls/content/tsls.m
https://github.com/statsmodels/statsmodels/blob/master/statsmodels/sandbox/regression/gmm.py#L134

If I were not strict ndarray coder, I might prefer to wrap an ndarray and
use only matrices inside a function and ndarrays outside for the code that
is not linear algebra.



>
> To address Alan's question about duck-typing -- one of the things folks
> like to do with duck-typed functions and method is return the type that is
> passed in when possible: i.e use asanyarray(), rather than asarray() -- but
> this is really going to be broken with matrix, as the symantics have
> changed. So we could say "don't expect that to work with matrix", but that
> requires one of:
>
> 1) folks always use asarray() and return an array, rather than a matrix to
> the caller -- not too bad, folks that want matrix can use np.matrix to get
> it back (a bit ugly, though..) however, this means that any other array
> subclass will get mangled as well...
>

scipy.linalg has an arraywrap on input and output. (at least when I looked
a few years ago)
(statsmodels has a pandas wrapper that converts arguments and returns to
have ndarrays internally)

some packages have helper functions to make a consistent interface to
ndarrays and sparce "matrices"

scipy.stats still doesn't protect against masked arrays and nans.

IMO: that's life. Removing matrices from numpy doesn't make the problem go
away. Although the problem could be pushed to other packages.
But if nobody uses matrices, then we would have at least **one** problem
less.



>
> 2) folks use asanyarray(), and it will break, maybe painfully, when a
> matrix is passed in -- folks using matrixes would need to use .A when
> calling such functions. This really seems ripe for confusion.
>

There are many ndarray subclasses out there, and I have no idea how they
behave.
pandas.Series was until recently a ndarray subclass, that didn't quite
behave like one.
We had to fix some bugs in statsmodels where we accidentially use
asanyarray instead of asarray.

Josef


>
> The truth is that the "right" way to do all this is to
> have different operators, rather than different objects, but that's been
> tried and did not fly.
>
> -Chris
>
>
>
>
>
>
>
>
>
>
>
>
> Christopher Barker, Ph.D.
> Oceanographer
>
> Emergency Response Division
> NOAA/NOS/OR&R(206) 526-6959   voice
> 7600 Sand Point Way NE   (206) 526-6329   fax
> Seattle, WA  98115   (206) 526-6317   main reception
>
> chris.bar...@noaa.gov
>
> ___
> NumPy-Discussion mailing list
> NumPy-Discussion@scipy.org
> http://mail.scipy.org/mailman/listinfo/numpy-discussion
>
>
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion

Re: [Numpy-discussion] record arrays with char*?

2014-02-11 Thread Chris Barker

On Tue, Feb 11, 2014 at 8:57 AM, Christopher Jordan-Squire
wrote:

> Thanks for the answers! My responses are inline.


my C is a bit weak, so forgive my misunderstanding, but:
>> {
>>  char* ind;
>>  double val, wght;
>> } data[] = { {"camera", 15, 2}, {"necklace", 100, 20}, {"vase", 90, 20},
>>  {"pictures", 60, 30}, {"tv", 40, 40}, {"video", 15,
30}};

Here is my C weakness -- what does this struct look like in memory? i.e is
that first element a pointer, and the actually string data is somewhere
else in memory? In which case, where? and how does that memory get managed?

Or are the bytes at the begining of the struct with the string data in it?

in the "real" case, you say the char* is a string representation of a hex
value for a memory address -- so you would know a priory exactly what the
length of that sting is, so you could use a numpy struct like:

('S8,f8,f8')

i.e 8 bytes to store the string at the begining of the struct.

then & the_array[i] would be the address of the beggining of the string --
which may be what you want.

HTH,
  -Chris







> >>
> >> At the C level, data is passed to the function by directly giving its
> >> address. (i.e. the C function takes as an argument (unsigned long)
> >> data, casting the data pointer to an int)
> >
> >
> > wow -- that's prone to error! but I"m still not sure which pointer you're
> > talking about -- a pointer to this struct?
> >
>
> I mean data. data is an array of structs, but in C that's
> (essentially) the same as a pointer to a struct. So data is the
> pointer I'm referring to.
>
> >> I'd like to create something similar using record arrays, such as
> >>
> >> np.array([("camera", 15, 2), ("necklace", 100, 20), ... ],
> >> dtype='object,f8,f8').
> >
> >
> >
> >>
> >> Unfortunately this fails because
> >> (1) In cython I need to determine the address of the first element and
> >> I can't take the address of a an input whose type I don't know (the
> >> exact type will vary on the application, so more or fewer fields may
> >> be in the C struct)
> >
> >
> > still a bit confused, but if this is types as an array in Cython, you
> should
> > be abel to do somethign like:
> >
> > &the_array[i]
> >
> > to get the address of the ith element.
> >
>
> To do that I need to be able to tell cython the type of the memory
> view I give. There are very few examples for non-primitive arrays in
> cython, so I'm not sure what that would look like. Or at least I
> *think* I need to do that, based on the cython errors I'm getting.
>
> >> (2) I don't think a python object type is what I want--I need a char*
> >> representation of the string. (Unfortunately I can't test this because
> >> I haven't solved (1) -- how do you pass a record array around in
> >> cython and/or take its address?)
> >
> >
> > well, and object type will give you a pointer to a pyobject. If you know
> for
> > sure that that pyobject is a string object (probably want a bytes object
> --
> > you son't want unicode here), then you should be abel to get the address
> of
> > the underlying char array. But that would require passing something
> > different off to the C code that the address of that element.
> >
> > You could use an unsigned long for that first field, as you are assuming
> > that in the C code anyway  but I don't hink there is a way in numpy to
> set
> > that to a pointer to a char allocated elsewhere -- where would it be
> > allocated?
> >
>
> The strings are char*, the unsigned long cast was simply to cast data
> to a memory address. (The specific setup of the library was passing a
> string with the memory location in hexadecimal. This seems weird, but
> it's because the C functions being called are an intermediary to a
> simple (third-party) virtual machine running a program compiled from
> another language. It doesn't deal with pointers directly, so the other
> language is just passed the memory address, in hex and as a string,
> for where the data resides. Along with the number of elements in the
> block of memory pointed to.)
>
> > So I would give up on expecting to store the struct directly in numpy
> array,
> > and rather, put something reasonable (maybe what you have above) in the
> > numpy array, and build the C struct you need from that rather than
> passing a
> > pointer in directly.
> >
>
> That sounds reasonable. I really wanted to avoid this because, as I
> mentioned above, I'm just trying to generate the data in numpy and
> pass it to this virtual machine running a program compiled from
> another language. The form of the struct depends on what was done in
> the other language. It could easily have more or fewer fields, have
> the fields reordered, etc.. I wanted to avoid having to write wrappers
> for all such possibilities and instead use numpy record arrays as
> mapping exactly to C structs. But if that's really the only way to go
> then I guess I'll have to write struct wrappers in cython.
>
> > -Chris
> >
> >
> >
> > --
> >
> > Christopher Barker, Ph.

Re: [Numpy-discussion] Overlapping time series

2014-02-11 Thread Sturla Molden

Daniele Nicolodi  wrote:

> That's more or less my current approach (except that I use the fact that
> the data is evenly samples to use np.where(np.diff(t1) != dt) to detect
> the regions of continuous data, to avoid the loop.

I hope you realize that np.where(np.diff(t1) != dt) generates three loops,
as well as two temporary arrays and one output array. If you do what I
suggested, you get one loop and no temporaries. But you will need Numba or
Cython to get full speed.

Sturla

___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion

Re: [Numpy-discussion] deprecate numpy.matrix

2014-02-11 Thread Chris Barker

On Tue, Feb 11, 2014 at 8:25 AM, Matthew Brett wrote:

> > Anyway, I would say that there is a clear use for the matrix class:
> readability of linear algebra code and hence a lower chance of errors, so
> higher productivity.
>
> Yes, but it looks like there are not any developers on this list who
> write substantial code with the np.matrix class, so, if there is a
> gain in productivity, it is short-lived, soon to be replaced by a
> cost.
>

to re-iterate:

matrix is NOT for newbies, nor is it for higher productivity or fewer
errors in production code -- the truth is, the ratio of linear algebra
expressions like the above to element-wise, etc. operations that ndarray is
well suited to is tiny  in "real" code. Did anyone that used MATLAB for
significant problems get not get really really annoyed by all the typing of
".*" ?

What matrix is good for is what someone here described as "domain specific
language" -- i.e. that little bit of code that really is
doing mostly linera algebra. So it's a nice tool for teaching and
experimenting with linear-algebra-based concepts.

To address Alan's question about duck-typing -- one of the things folks
like to do with duck-typed functions and method is return the type that is
passed in when possible: i.e use asanyarray(), rather than asarray() -- but
this is really going to be broken with matrix, as the symantics have
changed. So we could say "don't expect that to work with matrix", but that
requires one of:

1) folks always use asarray() and return an array, rather than a matrix to
the caller -- not too bad, folks that want matrix can use np.matrix to get
it back (a bit ugly, though..) however, this means that any other array
subclass will get mangled as well...

2) folks use asanyarray(), and it will break, maybe painfully, when a
matrix is passed in -- folks using matrixes would need to use .A when
calling such functions. This really seems ripe for confusion.

The truth is that the "right" way to do all this is to
have different operators, rather than different objects, but that's been
tried and did not fly.

-Chris

Christopher Barker, Ph.D.
Oceanographer

Emergency Response Division
NOAA/NOS/OR&R(206) 526-6959   voice
7600 Sand Point Way NE   (206) 526-6329   fax
Seattle, WA  98115   (206) 526-6317   main reception

chris.bar...@noaa.gov
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion

Re: [Numpy-discussion] deprecate numpy.matrix

2014-02-11 Thread Matthew Brett

Hi,

On Tue, Feb 11, 2014 at 8:55 AM,   wrote:
>
>
> On Tue, Feb 11, 2014 at 11:25 AM, Matthew Brett 
> wrote:
>>
>> Hi,
>>
>> On Tue, Feb 11, 2014 at 4:16 AM, Jacco Hoekstra - LR
>>  wrote:
>> > For our students, the matrix class is really appealing as we use a lot
>> > of linear algebra and expressions with matrices simply look better with an
>> > operator instead of a function:
>> >
>> > x=A.I*b
>> >
>> > looks much better than
>> >
>> > x = np.dot(np.linalg.inv(A),b)
>>
>> Yes, but:
>>
>> 1)  as Alan has mentioned, the dot method helps a lot.
>>
>> import numpy.linalg as npl
>>
>> x = npl.inv(A).dot(b)
>>
>> 2) Overloading the * operator means that you've lost * to do
>> element-wise operations.  MATLAB has a different operator for that,
>> '.*' - and it's very easy to forget the dot.  numpy makes this more
>> explicit - you read 'dot' as 'dot'.
>>
>> > And this gets worse when the expression is longer:
>> >
>> > x = R.I*A*R*b
>> >
>> > becomes:
>> >
>> > x = np.dot( np.linalg.inv(R), np.dot(A, np.dot(R, b)))
>>
>> x = npl.inv(R).dot(A.dot(R.dot(b))
>>
>> > Actually, not being involved in earlier discussions on this topic, I was
>> > a bit surprised by this and do not see the problem of having the matrix
>> > class as nobody is obliged to use it. I tried to find the reasons, but did
>> > not find it in the thread mentioned. Maybe someone could summarize the main
>> > problem with keeping this class for newbies on this list like me?
>> >
>> > Anyway, I would say that there is a clear use for the matrix class:
>> > readability of linear algebra code and hence a lower chance of errors, so
>> > higher productivity.
>>
>> Yes, but it looks like there are not any developers on this list who
>> write substantial code with the np.matrix class, so, if there is a
>> gain in productivity, it is short-lived, soon to be replaced by a
>> cost.
>
>
> selection bias !
>
> I have seen lots of Matlab and GAUSS code written by academic
> econometricians that have been programming for years but are not
> "developers", code that is "inefficient" and numerically not very stable but
> looks just like the formulas.

Yes, I used to use matlab myself.

There is certainly biased sampling on this list, so it's very
difficult to say whether there is a large constituency of np.matrix
users out there, it's possible.  I hope not, because I think they
would honestly be better served with ndarray even if some of the lines
in their script don't look quite as neat.

But in any case, I still don't think that dropping np.matrix is an
option in the short or even the medium term.  The discussion - I think
- is whether we should move towards standardizing to ndarray semantics
for clarity and simplicity of future development (and teaching),

Cheers,

Matthew
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion

Re: [Numpy-discussion] record arrays with char*?

2014-02-11 Thread Christopher Jordan-Squire

Thanks for the answers! My responses are inline.

On Tue, Feb 11, 2014 at 8:36 AM, Chris Barker  wrote:
> On Mon, Feb 10, 2014 at 10:43 PM, Christopher Jordan-Squire
>  wrote:
>>
>> I'm trying to wrap some C code using cython. The C code can take
>> inputs in two modes: dense inputs and sparse inputs. For dense inputs
>> the array indexing is naive. I have wrappers for that. In the sparse
>> case the matrix entries are typically indexed via names. So, for
>> example, the library documentation includes this as input you could
>> give:
>>
>> struct
>> {
>>  char* ind;
>>  double val, wght;
>> } data[] = { {"camera", 15, 2}, {"necklace", 100, 20}, {"vase", 90, 20},
>>  {"pictures", 60, 30}, {"tv", 40, 40}, {"video", 15, 30}};
>>
>> At the C level, data is passed to the function by directly giving its
>> address. (i.e. the C function takes as an argument (unsigned long)
>> data, casting the data pointer to an int)
>
>
> wow -- that's prone to error! but I"m still not sure which pointer you're
> talking about -- a pointer to this struct?
>

I mean data. data is an array of structs, but in C that's
(essentially) the same as a pointer to a struct. So data is the
pointer I'm referring to.

>> I'd like to create something similar using record arrays, such as
>>
>> np.array([("camera", 15, 2), ("necklace", 100, 20), ... ],
>> dtype='object,f8,f8').
>
>
>
>>
>> Unfortunately this fails because
>> (1) In cython I need to determine the address of the first element and
>> I can't take the address of a an input whose type I don't know (the
>> exact type will vary on the application, so more or fewer fields may
>> be in the C struct)
>
>
> still a bit confused, but if this is types as an array in Cython, you should
> be abel to do somethign like:
>
> &the_array[i]
>
> to get the address of the ith element.
>

To do that I need to be able to tell cython the type of the memory
view I give. There are very few examples for non-primitive arrays in
cython, so I'm not sure what that would look like. Or at least I
*think* I need to do that, based on the cython errors I'm getting.

>> (2) I don't think a python object type is what I want--I need a char*
>> representation of the string. (Unfortunately I can't test this because
>> I haven't solved (1) -- how do you pass a record array around in
>> cython and/or take its address?)
>
>
> well, and object type will give you a pointer to a pyobject. If you know for
> sure that that pyobject is a string object (probably want a bytes object --
> you son't want unicode here), then you should be abel to get the address of
> the underlying char array. But that would require passing something
> different off to the C code that the address of that element.
>
> You could use an unsigned long for that first field, as you are assuming
> that in the C code anyway  but I don't hink there is a way in numpy to set
> that to a pointer to a char allocated elsewhere -- where would it be
> allocated?
>

The strings are char*, the unsigned long cast was simply to cast data
to a memory address. (The specific setup of the library was passing a
string with the memory location in hexadecimal. This seems weird, but
it's because the C functions being called are an intermediary to a
simple (third-party) virtual machine running a program compiled from
another language. It doesn't deal with pointers directly, so the other
language is just passed the memory address, in hex and as a string,
for where the data resides. Along with the number of elements in the
block of memory pointed to.)

> So I would give up on expecting to store the struct directly in numpy array,
> and rather, put something reasonable (maybe what you have above) in the
> numpy array, and build the C struct you need from that rather than passing a
> pointer in directly.
>

That sounds reasonable. I really wanted to avoid this because, as I
mentioned above, I'm just trying to generate the data in numpy and
pass it to this virtual machine running a program compiled from
another language. The form of the struct depends on what was done in
the other language. It could easily have more or fewer fields, have
the fields reordered, etc.. I wanted to avoid having to write wrappers
for all such possibilities and instead use numpy record arrays as
mapping exactly to C structs. But if that's really the only way to go
then I guess I'll have to write struct wrappers in cython.

> -Chris
>
>
>
> --
>
> Christopher Barker, Ph.D.
> Oceanographer
>
> Emergency Response Division
> NOAA/NOS/OR&R(206) 526-6959   voice
> 7600 Sand Point Way NE   (206) 526-6329   fax
> Seattle, WA  98115   (206) 526-6317   main reception
>
> chris.bar...@noaa.gov
>
> ___
> NumPy-Discussion mailing list
> NumPy-Discussion@scipy.org
> http://mail.scipy.org/mailman/listinfo/numpy-discussion
>
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.or

Re: [Numpy-discussion] deprecate numpy.matrix

2014-02-11 Thread josef . pktd

On Tue, Feb 11, 2014 at 11:25 AM, Matthew Brett wrote:

> Hi,
>
> On Tue, Feb 11, 2014 at 4:16 AM, Jacco Hoekstra - LR
>  wrote:
> > For our students, the matrix class is really appealing as we use a lot
> of linear algebra and expressions with matrices simply look better with an
> operator instead of a function:
> >
> > x=A.I*b
> >
> > looks much better than
> >
> > x = np.dot(np.linalg.inv(A),b)
>
> Yes, but:
>
> 1)  as Alan has mentioned, the dot method helps a lot.
>
> import numpy.linalg as npl
>
> x = npl.inv(A).dot(b)
>
> 2) Overloading the * operator means that you've lost * to do
> element-wise operations.  MATLAB has a different operator for that,
> '.*' - and it's very easy to forget the dot.  numpy makes this more
> explicit - you read 'dot' as 'dot'.
>
> > And this gets worse when the expression is longer:
> >
> > x = R.I*A*R*b
> >
> > becomes:
> >
> > x = np.dot( np.linalg.inv(R), np.dot(A, np.dot(R, b)))
>
> x = npl.inv(R).dot(A.dot(R.dot(b))
>
> > Actually, not being involved in earlier discussions on this topic, I was
> a bit surprised by this and do not see the problem of having the matrix
> class as nobody is obliged to use it. I tried to find the reasons, but did
> not find it in the thread mentioned. Maybe someone could summarize the main
> problem with keeping this class for newbies on this list like me?
> >
> > Anyway, I would say that there is a clear use for the matrix class:
> readability of linear algebra code and hence a lower chance of errors, so
> higher productivity.
>
> Yes, but it looks like there are not any developers on this list who
> write substantial code with the np.matrix class, so, if there is a
> gain in productivity, it is short-lived, soon to be replaced by a
> cost.
>

selection bias !

I have seen lots of Matlab and GAUSS code written by academic
econometricians that have been programming for years but are not
"developers", code that is "inefficient" and numerically not very
stable but looks just like the formulas.

fast prototyping for code that is, at most used, for a few papers.

(just to avoid misunderstanding: there are also econometricians that are
"developers", and write code that is intended for reuse.)

(but maybe numpy users are all "developers")

Josef


>
> Cheers,
>
> Matthew
> ___
> NumPy-Discussion mailing list
> NumPy-Discussion@scipy.org
> http://mail.scipy.org/mailman/listinfo/numpy-discussion
>
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion

Re: [Numpy-discussion] record arrays with char*?

2014-02-11 Thread Chris Barker

On Mon, Feb 10, 2014 at 10:43 PM, Christopher Jordan-Squire  wrote:

> I'm trying to wrap some C code using cython. The C code can take
> inputs in two modes: dense inputs and sparse inputs. For dense inputs
> the array indexing is naive. I have wrappers for that. In the sparse
> case the matrix entries are typically indexed via names. So, for
> example, the library documentation includes this as input you could
> give:
>
> struct
> {
>  char* ind;
>  double val, wght;
> } data[] = { {"camera", 15, 2}, {"necklace", 100, 20}, {"vase", 90, 20},
>  {"pictures", 60, 30}, {"tv", 40, 40}, {"video", 15, 30}};
>
> At the C level, data is passed to the function by directly giving its
> address. (i.e. the C function takes as an argument (unsigned long)
> data, casting the data pointer to an int)
>

wow -- that's prone to error! but I"m still not sure which pointer you're
talking about -- a pointer to this struct?

I'd like to create something similar using record arrays, such as
>
> np.array([("camera", 15, 2), ("necklace", 100, 20), ... ],
> dtype='object,f8,f8').
>

> Unfortunately this fails because
> (1) In cython I need to determine the address of the first element and
> I can't take the address of a an input whose type I don't know (the
> exact type will vary on the application, so more or fewer fields may
> be in the C struct)
>

still a bit confused, but if this is types as an array in Cython, you
should be abel to do somethign like:

&the_array[i]

to get the address of the ith element.

(2) I don't think a python object type is what I want--I need a char*
> representation of the string. (Unfortunately I can't test this because
> I haven't solved (1) -- how do you pass a record array around in
> cython and/or take its address?)
>

well, and object type will give you a pointer to a pyobject. If you know
for sure that that pyobject is a string object (probably want a bytes
object -- you son't want unicode here), then you should be abel to get the
address of the underlying char array. But that would require passing
something different off to the C code that the address of that element.

You could use an unsigned long for that first field, as you are assuming
that in the C code anyway  but I don't hink there is a way in numpy to set
that to a pointer to a char allocated elsewhere -- where would it
be allocated?

So I would give up on expecting to store the struct directly in numpy
array, and rather, put something reasonable (maybe what you have above) in
the numpy array, and build the C struct you need from that rather than
passing a pointer in directly.

-Chris

-- 

Christopher Barker, Ph.D.
Oceanographer

Emergency Response Division
NOAA/NOS/OR&R(206) 526-6959   voice
7600 Sand Point Way NE   (206) 526-6329   fax
Seattle, WA  98115   (206) 526-6317   main reception

chris.bar...@noaa.gov
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion

Re: [Numpy-discussion] deprecate numpy.matrix

2014-02-11 Thread Matthew Brett

Hi,

On Tue, Feb 11, 2014 at 4:16 AM, Jacco Hoekstra - LR
 wrote:
> For our students, the matrix class is really appealing as we use a lot of 
> linear algebra and expressions with matrices simply look better with an 
> operator instead of a function:
>
> x=A.I*b
>
> looks much better than
>
> x = np.dot(np.linalg.inv(A),b)

Yes, but:

1)  as Alan has mentioned, the dot method helps a lot.

import numpy.linalg as npl

x = npl.inv(A).dot(b)

2) Overloading the * operator means that you've lost * to do
element-wise operations.  MATLAB has a different operator for that,
'.*' - and it's very easy to forget the dot.  numpy makes this more
explicit - you read 'dot' as 'dot'.

> And this gets worse when the expression is longer:
>
> x = R.I*A*R*b
>
> becomes:
>
> x = np.dot( np.linalg.inv(R), np.dot(A, np.dot(R, b)))

x = npl.inv(R).dot(A.dot(R.dot(b))

> Actually, not being involved in earlier discussions on this topic, I was a 
> bit surprised by this and do not see the problem of having the matrix class 
> as nobody is obliged to use it. I tried to find the reasons, but did not find 
> it in the thread mentioned. Maybe someone could summarize the main problem 
> with keeping this class for newbies on this list like me?
>
> Anyway, I would say that there is a clear use for the matrix class: 
> readability of linear algebra code and hence a lower chance of errors, so 
> higher productivity.

Yes, but it looks like there are not any developers on this list who
write substantial code with the np.matrix class, so, if there is a
gain in productivity, it is short-lived, soon to be replaced by a
cost.

Cheers,

Matthew
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion

Re: [Numpy-discussion] Overlapping time series

2014-02-11 Thread Daniele Nicolodi

On 11/02/2014 15:38, Sturla Molden wrote:
> Daniele Nicolodi  wrote:
> 
>> I was probably not that clear: I have two 2xN arrays, one for each data
>> recording, one column for time (taken from the same clock for both
>> measurements) and one with data values.  Each array has some gaps.
> 
> If you want all subarrays where both timeseries are sampled, a single loop
> through the data
> can fix that. First find the smallest common timestamp. This is the first
> starting point. Then loop and follow the timestamps. As long as they are in
> synch, do just continue. If one timeseries suddenly skips forward (i.e.
> there is a gap), you have an end point. Then slice between the start and
> the end point, and append the view array to a list. Follow the timeseries
> that did not skip until timestamps are synchronous again, and you have the
> next starting point. Then just continue like this until the the two
> timeseries are exhausted. It is an O(n) strategy, so it will not be
> inefficient. If you are worried about the loop and performance, both Numba
> and Cython can transform this into C speed. Numba takes less effort to use.
> But Python loops are actually much faster than most scientists used to
> Matlab and the like would expect. 

Thanks Sturla.

That's more or less my current approach (except that I use the fact that
the data is evenly samples to use np.where(np.diff(t1) != dt) to detect
the regions of continuous data, to avoid the loop.

Cheers,
Daniele

___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion

Re: [Numpy-discussion] Overlapping time series

2014-02-11 Thread Joris Van den Bossche

2014-02-11 14:55 GMT+01:00 Andreas Hilboll :

> On 11.02.2014 14:47, Daniele Nicolodi wrote:
> > On 11/02/2014 14:41, Andreas Hilboll wrote:
> >> On 11.02.2014 14:22, Daniele Nicolodi wrote:
> >>> On 11/02/2014 14:10, Andreas Hilboll wrote:
>  On 11.02.2014 14:08, Daniele Nicolodi wrote:
> > Hello,
> >
> > I have two time series (2xN dimensional arrays) recorded on the same
> > time basis, but each with it's own dead times (and start and end
> > recording times).  I would like to obtain two time series containing
> > only the time overlapping segments of the data.
> >
> > Does numpy or scipy offer something that may help in this?
> >
> > I can imagine strategies about how to approach the problem, but none
> > that would be efficient.  Ideas?
> 
>  Take a look at pandas.  It has built-in time series functionality.
> >>>
> >>> Even using Pandas (and I would like to avoid to have to depend on it)
> it
> >>> is not clear to me how I would achieve what I want.  Am I missing
> something?
> >>
> >> If the two time series are pandas.Series objects and are called s1 and
> s2:
> >>
> >> new1 = s1.ix[s2.dropna().index].dropna()
> >> new2 = s2.ix[s1.dropna().index].dropna()
> >> new1 = new1.ix[s2.dropna().index].dropna()
> >>
> >> Looks hackish, so there might be a more elegant solution.  For further
> >> questions about how to use pandas, please look at the pydata mailing
> >> list or stackoverflow.
> >
> > Correct me if I'm wrong, but this assumes that missing data points are
> > represented with Nan.  In my case missing data points are just missing.
>
> pandas doesn't care.
>
> In pandas, you could simply do something like this (assuming the time is
set as the index):

pd.concat([s1, s2], axis=1)

and then remove the nan's (where the index was not overlapping) or use
`join='inner'`

Joris


> Andreas.
> ___
> NumPy-Discussion mailing list
> NumPy-Discussion@scipy.org
> http://mail.scipy.org/mailman/listinfo/numpy-discussion
>
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion

Re: [Numpy-discussion] Overlapping time series

2014-02-11 Thread Sturla Molden

Daniele Nicolodi  wrote:

> I was probably not that clear: I have two 2xN arrays, one for each data
> recording, one column for time (taken from the same clock for both
> measurements) and one with data values.  Each array has some gaps.

If you want all subarrays where both timeseries are sampled, a single loop
through the data
can fix that. First find the smallest common timestamp. This is the first
starting point. Then loop and follow the timestamps. As long as they are in
synch, do just continue. If one timeseries suddenly skips forward (i.e.
there is a gap), you have an end point. Then slice between the start and
the end point, and append the view array to a list. Follow the timeseries
that did not skip until timestamps are synchronous again, and you have the
next starting point. Then just continue like this until the the two
timeseries are exhausted. It is an O(n) strategy, so it will not be
inefficient. If you are worried about the loop and performance, both Numba
and Cython can transform this into C speed. Numba takes less effort to use.
But Python loops are actually much faster than most scientists used to
Matlab and the like would expect. 

Sturla

___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion

Re: [Numpy-discussion] deprecate numpy.matrix

2014-02-11 Thread Alan G Isaac

On 2/11/2014 5:25 AM, Pauli Virtanen wrote:
>   the ndarray is also lacking some useful things, as
> you point out. But I think the right solution would be to stuff
> the required additions into ndarray, rather than retaining the
> otherwise incompatible np.matrix as a crutch.

Given that we now have the `dot` method,
if we could get the other convenience attributes
even as methods (say .I(), .H(), and .mpow())
that would greatly reduce the need for `matrix`.

Just to be clear, the `matrix` object is not *really*
the issue in the discussion of the scipy library,
right? Since it already knows how to be seen as an ndarray,
the library can always work with m.A when doing any
linear algebra.  From what I've read in this thread,
the real issues for scipy seem to lie with the sparse
matrix objects... ?

Alan

___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion

Re: [Numpy-discussion] Overlapping time series

2014-02-11 Thread RayS

At 06:07 AM 2/11/2014, you wrote:
>On 11/02/2014 14:56, Sturla Molden wrote:
> > Daniele Nicolodi  wrote:
> >
> >> Correct me if I'm wrong, but this assumes that missing data points are
> >> represented with Nan.  In my case missing data points are just missing.
> >
> > Then your data cannot be stored in a 2 x N array as you indicated.
>
>I was probably not that clear: I have two 2xN arrays, one for each data
>recording, one column for time (taken from the same clock for both
>measurements) and one with data values.  Each array has some gaps.

gaps at the ends I assume...
use numpy.where() with the time channel as the condition

- Ray



___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion

Re: [Numpy-discussion] Overlapping time series

2014-02-11 Thread Daniele Nicolodi

On 11/02/2014 14:56, Sturla Molden wrote:
> Daniele Nicolodi  wrote:
> 
>> Correct me if I'm wrong, but this assumes that missing data points are
>> represented with Nan.  In my case missing data points are just missing.
> 
> Then your data cannot be stored in a 2 x N array as you indicated. 

I was probably not that clear: I have two 2xN arrays, one for each data
recording, one column for time (taken from the same clock for both
measurements) and one with data values.  Each array has some gaps.

Cheers,
Daniele

___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion

Re: [Numpy-discussion] Overlapping time series

2014-02-11 Thread RayS




> On 11.02.2014 14:08, Daniele Nicolodi wrote:
>> Hello,
>>
>> I have two time series (2xN dimensional arrays) recorded on the same
>> time basis, but each with it's own dead times (and start and end
>> recording times).  I would like to obtain two time series containing
>> only the time overlapping segments of the data.
>>
>> Does numpy or scipy offer something that may help in this?
>>
>> I can imagine strategies about how to approach the problem, but none
>> that would be efficient.  Ideas?



or?

st1 = 5
>>> np.concatenate((a1[:,st1:],a2[:,:a1.shape[1]-st1]))
array([[ 5,  6,  7,  8,  9],
   [15, 16, 17, 18, 19],
   [ 5,  6,  7,  8,  9],
   [15, 16, 17, 18, 19]])

- Ray
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion

Re: [Numpy-discussion] Overlapping time series

2014-02-11 Thread Sturla Molden

Daniele Nicolodi  wrote:

> Correct me if I'm wrong, but this assumes that missing data points are
> represented with Nan.  In my case missing data points are just missing.

Then your data cannot be stored in a 2 x N array as you indicated. 

Sturla

___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion

Re: [Numpy-discussion] Overlapping time series

2014-02-11 Thread RayS




> On 11.02.2014 14:08, Daniele Nicolodi wrote:
>> Hello,
>>
>> I have two time series (2xN dimensional arrays) recorded on the same
>> time basis, but each with it's own dead times (and start and end
>> recording times).  I would like to obtain two time series containing
>> only the time overlapping segments of the data.
>>
>> Does numpy or scipy offer something that may help in this?
>>
>> I can imagine strategies about how to approach the problem, but none
>> that would be efficient.  Ideas?


What is the gate/tach, ie pointer to start/stop?
I work with both tachometers and EKGs and do similar windowing, 
usually just using gates as slices so as not to make copies.
I also found this interesting and bookmarked it 
http://www.johnvinyard.com/blog/?p=268 which you might like.
Just to be clear, you have 2 2D arrays and want a 4x(N-m) shape, like 
mixing two stereo tracks?


>>> a1 = np.arange(0,20).reshape((2,-1))
>>> a2 = np.arange(5,25).reshape((2,-1))
>>> np.concatenate((a1,a2))
array([[ 0,  1,  2,  3,  4,  5,  6,  7,  8,  9],
   [10, 11, 12, 13, 14, 15, 16, 17, 18, 19],
   [ 5,  6,  7,  8,  9, 10, 11, 12, 13, 14],
   [15, 16, 17, 18, 19, 20, 21, 22, 23, 24]])
>>> st1 = 2
>>> end1 = 7
>>> st2 = 1
>>> end2 = 6
>>> np.concatenate((a1[:,st1:end1],a2[:,st2:end2]))
array([[ 2,  3,  4,  5,  6],
   [12, 13, 14, 15, 16],
   [ 6,  7,  8,  9, 10],
   [16, 17, 18, 19, 20]])


- Ray
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion

Re: [Numpy-discussion] Overlapping time series

2014-02-11 Thread Andreas Hilboll

On 11.02.2014 14:47, Daniele Nicolodi wrote:
> On 11/02/2014 14:41, Andreas Hilboll wrote:
>> On 11.02.2014 14:22, Daniele Nicolodi wrote:
>>> On 11/02/2014 14:10, Andreas Hilboll wrote:
 On 11.02.2014 14:08, Daniele Nicolodi wrote:
> Hello,
>
> I have two time series (2xN dimensional arrays) recorded on the same
> time basis, but each with it's own dead times (and start and end
> recording times).  I would like to obtain two time series containing
> only the time overlapping segments of the data.
>
> Does numpy or scipy offer something that may help in this?
>
> I can imagine strategies about how to approach the problem, but none
> that would be efficient.  Ideas?

 Take a look at pandas.  It has built-in time series functionality.
>>>
>>> Even using Pandas (and I would like to avoid to have to depend on it) it
>>> is not clear to me how I would achieve what I want.  Am I missing something?
>>
>> If the two time series are pandas.Series objects and are called s1 and s2:
>>
>> new1 = s1.ix[s2.dropna().index].dropna()
>> new2 = s2.ix[s1.dropna().index].dropna()
>> new1 = new1.ix[s2.dropna().index].dropna()
>>
>> Looks hackish, so there might be a more elegant solution.  For further
>> questions about how to use pandas, please look at the pydata mailing
>> list or stackoverflow.
> 
> Correct me if I'm wrong, but this assumes that missing data points are
> represented with Nan.  In my case missing data points are just missing.

pandas doesn't care.

Andreas.
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion

Re: [Numpy-discussion] Overlapping time series

2014-02-11 Thread Sturla Molden

Daniele Nicolodi  wrote:

> I can imagine strategies about how to approach the problem, but none
> that would be efficient.  Ideas?

I would just loop from the start and loop from the end and find out where
to clip. Then slice in between. 

If Python loops take too much time, JIT compile them with Numba. 

Sturla

___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion

Re: [Numpy-discussion] Overlapping time series

2014-02-11 Thread Daniele Nicolodi

On 11/02/2014 14:41, Andreas Hilboll wrote:
> On 11.02.2014 14:22, Daniele Nicolodi wrote:
>> On 11/02/2014 14:10, Andreas Hilboll wrote:
>>> On 11.02.2014 14:08, Daniele Nicolodi wrote:
 Hello,

 I have two time series (2xN dimensional arrays) recorded on the same
 time basis, but each with it's own dead times (and start and end
 recording times).  I would like to obtain two time series containing
 only the time overlapping segments of the data.

 Does numpy or scipy offer something that may help in this?

 I can imagine strategies about how to approach the problem, but none
 that would be efficient.  Ideas?
>>>
>>> Take a look at pandas.  It has built-in time series functionality.
>>
>> Even using Pandas (and I would like to avoid to have to depend on it) it
>> is not clear to me how I would achieve what I want.  Am I missing something?
> 
> If the two time series are pandas.Series objects and are called s1 and s2:
> 
> new1 = s1.ix[s2.dropna().index].dropna()
> new2 = s2.ix[s1.dropna().index].dropna()
> new1 = new1.ix[s2.dropna().index].dropna()
> 
> Looks hackish, so there might be a more elegant solution.  For further
> questions about how to use pandas, please look at the pydata mailing
> list or stackoverflow.

Correct me if I'm wrong, but this assumes that missing data points are
represented with Nan.  In my case missing data points are just missing.

Cheers,
Daniele

___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion

Re: [Numpy-discussion] Overlapping time series

2014-02-11 Thread Andreas Hilboll

On 11.02.2014 14:22, Daniele Nicolodi wrote:
> On 11/02/2014 14:10, Andreas Hilboll wrote:
>> On 11.02.2014 14:08, Daniele Nicolodi wrote:
>>> Hello,
>>>
>>> I have two time series (2xN dimensional arrays) recorded on the same
>>> time basis, but each with it's own dead times (and start and end
>>> recording times).  I would like to obtain two time series containing
>>> only the time overlapping segments of the data.
>>>
>>> Does numpy or scipy offer something that may help in this?
>>>
>>> I can imagine strategies about how to approach the problem, but none
>>> that would be efficient.  Ideas?
>>
>> Take a look at pandas.  It has built-in time series functionality.
> 
> Even using Pandas (and I would like to avoid to have to depend on it) it
> is not clear to me how I would achieve what I want.  Am I missing something?

If the two time series are pandas.Series objects and are called s1 and s2:

new1 = s1.ix[s2.dropna().index].dropna()
new2 = s2.ix[s1.dropna().index].dropna()
new1 = new1.ix[s2.dropna().index].dropna()

Looks hackish, so there might be a more elegant solution.  For further
questions about how to use pandas, please look at the pydata mailing
list or stackoverflow.

HTH,
Andreas.
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion

Re: [Numpy-discussion] Overlapping time series

2014-02-11 Thread Daniele Nicolodi

On 11/02/2014 14:10, Andreas Hilboll wrote:
> On 11.02.2014 14:08, Daniele Nicolodi wrote:
>> Hello,
>>
>> I have two time series (2xN dimensional arrays) recorded on the same
>> time basis, but each with it's own dead times (and start and end
>> recording times).  I would like to obtain two time series containing
>> only the time overlapping segments of the data.
>>
>> Does numpy or scipy offer something that may help in this?
>>
>> I can imagine strategies about how to approach the problem, but none
>> that would be efficient.  Ideas?
> 
> Take a look at pandas.  It has built-in time series functionality.

Even using Pandas (and I would like to avoid to have to depend on it) it
is not clear to me how I would achieve what I want.  Am I missing something?

Cheers,
Daniele

___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion

Re: [Numpy-discussion] Overlapping time series

2014-02-11 Thread Andreas Hilboll

On 11.02.2014 14:08, Daniele Nicolodi wrote:
> Hello,
> 
> I have two time series (2xN dimensional arrays) recorded on the same
> time basis, but each with it's own dead times (and start and end
> recording times).  I would like to obtain two time series containing
> only the time overlapping segments of the data.
> 
> Does numpy or scipy offer something that may help in this?
> 
> I can imagine strategies about how to approach the problem, but none
> that would be efficient.  Ideas?

Take a look at pandas.  It has built-in time series functionality.

Andreas.
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion

[Numpy-discussion] Overlapping time series

2014-02-11 Thread Daniele Nicolodi

Hello,

I have two time series (2xN dimensional arrays) recorded on the same
time basis, but each with it's own dead times (and start and end
recording times).  I would like to obtain two time series containing
only the time overlapping segments of the data.

Does numpy or scipy offer something that may help in this?

I can imagine strategies about how to approach the problem, but none
that would be efficient.  Ideas?

Thank you.  Best,
Daniele
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion

Re: [Numpy-discussion] deprecate numpy.matrix

2014-02-11 Thread Sturla Molden

Pauli Virtanen  wrote:

> [1] http://fperez.org/py4science/numpy-pep225/numpy-pep225.html

I have to agree with Robert Kern here. One operator that we can (ab)use for
matrix multiplication would suffice.

Sturla

___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion

Re: [Numpy-discussion] deprecate numpy.matrix

2014-02-11 Thread Jacco Hoekstra - LR

For our students, the matrix class is really appealing as we use a lot of 
linear algebra and expressions with matrices simply look better with an 
operator instead of a function:

x=A.I*b

looks much better than 

x = np.dot(np.linalg.inv(A),b) 

And this gets worse when the expression is longer:

x = R.I*A*R*b 

becomes:

x = np.dot( np.linalg.inv(R), np.dot(A, np.dot(R, b)))

Actually, not being involved in earlier discussions on this topic, I was a bit 
surprised by this and do not see the problem of having the matrix class as 
nobody is obliged to use it. I tried to find the reasons, but did not find it 
in the thread mentioned. Maybe someone could summarize the main problem with 
keeping this class for newbies on this list like me?

Anyway, I would say that there is a clear use for the matrix class: readability 
of linear algebra code and hence a lower chance of errors, so higher 
productivity.  

Just my 2cts,
Jacco Hoekstra

P.S. Also:   A = np.mat("2 3 ; -1 0") is very handy!

-Original Message-
From: numpy-discussion-boun...@scipy.org 
[mailto:numpy-discussion-boun...@scipy.org] On Behalf Of Pauli Virtanen
Sent: dinsdag 11 februari 2014 12:47
To: numpy-discussion@scipy.org
Subject: Re: [Numpy-discussion] deprecate numpy.matrix

Sturla Molden  gmail.com> writes:
> Pauli Virtanen  iki.fi> wrote:
> > It is not a good thing that there is no well defined "domain 
> > specific language" for matrix algebra in Python.
> 
> Perhaps Python should get some new operators?

It might still be possible to advocate for this in core Python, even though the 
ship has sailed long ago.

Some previous discussion:

[1] http://fperez.org/py4science/numpy-pep225/numpy-pep225.html
[2] http://www.python.org/dev/peps/pep-0225/
[3] http://www.python.org/dev/peps/pep-0211/

(My own take would be that one extra operator is enough for most purposes, and 
would be easier to push for.)

[clip]
> On the serious side, I don't think there really is a good solution to 
> this problem at all.

This is true. However, I'd prefer to have one solution over several conflicting 
ones.

--
Pauli Virtanen

___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion

Re: [Numpy-discussion] deprecate numpy.matrix

2014-02-11 Thread Pauli Virtanen

Sturla Molden  gmail.com> writes:
> Pauli Virtanen  iki.fi> wrote:
> > It is not a good thing that there is no well defined
> > "domain specific language" for matrix algebra in Python.
> 
> Perhaps Python should get some new operators?

It might still be possible to advocate for this in core Python,
even though the ship has sailed long ago.

Some previous discussion:

[1] http://fperez.org/py4science/numpy-pep225/numpy-pep225.html
[2] http://www.python.org/dev/peps/pep-0225/
[3] http://www.python.org/dev/peps/pep-0211/

(My own take would be that one extra operator is enough for most
purposes, and would be easier to push for.)

[clip]
> On the serious side, I don't think there really is a good solution to this
> problem at all.

This is true. However, I'd prefer to have one solution over several
conflicting ones.

-- 
Pauli Virtanen

___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion

Re: [Numpy-discussion] deprecate numpy.matrix

2014-02-11 Thread Sturla Molden

Pauli Virtanen  wrote:

> It is not a good thing that there is no well defined
> "domain specific language" for matrix algebra in Python.

Perhaps Python should get some new operators?

__dot__
__cross__
__kronecker__

Obviously, using them in real Python code would require unicode.

;-)

On the serious side, I don't think there really is a good solution to this
problem at all.

Sturla

___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion

Re: [Numpy-discussion] deprecate numpy.matrix

2014-02-11 Thread Eelco Hoogendoorn

My 2pc:

I personally hardly ever use matrix, even in linear algebra dense code. It
can be nice though, to use matrix semantics within a restricted scope. When
I first came to numpy, the ability to choose linear algebra versus array
semantics seemed like a really neat thing to me; though in practice the
array semantics are so much more useful you really don't mind having to
write the occasional np.dot.

There seems to be some resistance form the developer side in having to
maintain this architecture, which I cannot comment on, but from a user
perspective, I am perfectly fine with the way things are.


On Tue, Feb 11, 2014 at 11:16 AM, Todd  wrote:

>
> On Feb 11, 2014 3:23 AM, "Alan G Isaac"  wrote:
> >
> > On 2/10/2014 7:39 PM, Pauli Virtanen wrote:
> > > The issue here is semantics for basic linear algebra operations, such
> as
> > > matrix multiplication, that work for different matrix objects,
> including
> > > ndarrays.
> >
> >
> > I'll see if I can restate my suggestion in another way,
> > because I do not feel you are responding to it.
> > (I might be wrong.)
> >
> > What is a duck?  If you ask it to quack, it quacks.
> > OK, but what is it to quack?
> >
> > Here, quacking is behaving like an ndarray (in your view,
> > as I understand it) when asked.  But how do we ask?
> > Your view (if I understand) is we ask via the operations
> > supported by ndarrays.  But maybe that is the wrong way
> > for the library to ask this question.
> >
> > If so, then scipy libraries could ask an object
> > to behave like an an ndarray by calling, e.g.,
> > __asarray__ on it. It becomes the responsibility
> > of the object to return something appropriate
> > when __asarray__ is called. Objects that know how to do
> > this will provide __asarray__ and respond
> > appropriately.  Other types can be coerced if
> > that is the documented behavior (e.g., lists).
> > The libraries will then be written for a single
> > type of behavior.  What it means to "quack" is
> > pretty easily documented, and a matrix object
> > already knows how (e.g., m.A).  Presumably in
> > this scenario __asarray__ would return an object
> > that behaves like an ndarray and a converter for
> > turning the final result into the desired object
> > type (e.g., into a `matrix` if necessary).
> >
> > Hope that clearer, even if it proves a terrible idea.
> >
> > Alan Isaac
>
> I don't currently use the matrix class, but having taken many linear
> algebra classes I can see the appeal, and if I end up teaching the subject
> I think I would appreciate having it available.
>
> On the other hand, I certainly can see the possibility for confusion, and
> I don't think it is something that should be used unless someone has a
> really good reason.
>
> So I come out somewhere in the middle here.  So, although this may end up
> being a terrible idea, I would like to purpose what I think is a
> compromise: instead of just removing matrix, split it out into a scikit.
> That way, it it's still available for those who need it, but will be less
> likely to be used accidentally, and won't be interfering with the rest of
> numpy and scipy development.
>
> Specifically, I would split matrix into a scikit, while in the same
> release deprecate np.matrix.  They can then exist in parallel for a few
> releases to allow code to be ported away from it.
>
> However, I would suggest that before the split, all linear algebra
> routines should be available as functions or methods in numpy proper.
>
> ___
> NumPy-Discussion mailing list
> NumPy-Discussion@scipy.org
> http://mail.scipy.org/mailman/listinfo/numpy-discussion
>
>
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion

Re: [Numpy-discussion] deprecate numpy.matrix

2014-02-11 Thread Pauli Virtanen

Alan G Isaac  gmail.com> writes:
[clip]
> Here, quacking is behaving like an ndarray (in your view,
> as I understand it) when asked.  But how do we ask?
> Your view (if I understand) is we ask via the operations
> supported by ndarrays.  But maybe that is the wrong way
> for the library to ask this question.

It is not a good thing that there is no well defined
"domain specific language" for matrix algebra in Python.

Rather, some code is written with one convention and other
code with a different convention. The conventions disagree
on how to express basic operations, such as matrix
multiplication.

Moreover, the ndarray is also lacking some useful things, as
you point out. But I think the right solution would be to stuff
the required additions into ndarray, rather than retaining the
otherwise incompatible np.matrix as a crutch.

> If so, then scipy libraries could ask an object
> to behave like an an ndarray by calling, e.g.,
> __asarray__ on it. It becomes the responsibility
> of the object to return something appropriate
> when __asarray__ is called. Objects that know how to do
> this will provide __asarray__ and respond
> appropriately.

Another way to achieve similar thing as your suggestion is to add
a coercion function in the vein of scipy.sparse.aslinearoperator.
It could deal with known-failure cases (np.matrix, scipy.sparse matrices)
and for the rest just assume the object satisfies the ndarray API
and pass them through.

-- 
Pauli Virtanen

___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion

Re: [Numpy-discussion] Inheriting from ndarray Was: deprecate numpy.matrix

2014-02-11 Thread Todd

On Feb 11, 2014 5:01 AM, "Alexander Belopolsky"  wrote:
>
>
> On Mon, Feb 10, 2014 at 11:31 AM, Nathaniel Smith  wrote:
>>
>> And in the long run, I
>> think the goal is to move people away from inheriting from np.ndarray.
>
>
> This is music to my ears,

There are a lot of units (meter, foot, built, etc.) packages that use
inheriting from numpy to handle their units.  I wouldn't call it a
shortcut, it is a common design decision.
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion

Re: [Numpy-discussion] deprecate numpy.matrix

2014-02-11 Thread Todd

On Feb 11, 2014 3:23 AM, "Alan G Isaac"  wrote:
>
> On 2/10/2014 7:39 PM, Pauli Virtanen wrote:
> > The issue here is semantics for basic linear algebra operations, such as
> > matrix multiplication, that work for different matrix objects, including
> > ndarrays.
>
>
> I'll see if I can restate my suggestion in another way,
> because I do not feel you are responding to it.
> (I might be wrong.)
>
> What is a duck?  If you ask it to quack, it quacks.
> OK, but what is it to quack?
>
> Here, quacking is behaving like an ndarray (in your view,
> as I understand it) when asked.  But how do we ask?
> Your view (if I understand) is we ask via the operations
> supported by ndarrays.  But maybe that is the wrong way
> for the library to ask this question.
>
> If so, then scipy libraries could ask an object
> to behave like an an ndarray by calling, e.g.,
> __asarray__ on it. It becomes the responsibility
> of the object to return something appropriate
> when __asarray__ is called. Objects that know how to do
> this will provide __asarray__ and respond
> appropriately.  Other types can be coerced if
> that is the documented behavior (e.g., lists).
> The libraries will then be written for a single
> type of behavior.  What it means to "quack" is
> pretty easily documented, and a matrix object
> already knows how (e.g., m.A).  Presumably in
> this scenario __asarray__ would return an object
> that behaves like an ndarray and a converter for
> turning the final result into the desired object
> type (e.g., into a `matrix` if necessary).
>
> Hope that clearer, even if it proves a terrible idea.
>
> Alan Isaac

I don't currently use the matrix class, but having taken many linear
algebra classes I can see the appeal, and if I end up teaching the subject
I think I would appreciate having it available.

On the other hand, I certainly can see the possibility for confusion, and I
don't think it is something that should be used unless someone has a really
good reason.

So I come out somewhere in the middle here.  So, although this may end up
being a terrible idea, I would like to purpose what I think is a
compromise: instead of just removing matrix, split it out into a scikit.
That way, it it's still available for those who need it, but will be less
likely to be used accidentally, and won't be interfering with the rest of
numpy and scipy development.

Specifically, I would split matrix into a scikit, while in the same release
deprecate np.matrix.  They can then exist in parallel for a few releases to
allow code to be ported away from it.

However, I would suggest that before the split, all linear algebra routines
should be available as functions or methods in numpy proper.
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion

44 matches

Mail list logo