Re: [Numpy-discussion] Matching 0-d arrays and NumPy scalars

2008-02-21 Thread Damian Eads
In MATLAB, scalars are 1x1 arrays, and thus they can be indexed. There 
have been situations in my use of Numpy when I would have liked to index 
scalars to make my code more general.

It's not a very pressing issue for me but it is an interesting issue. 
Whenever I index an array with a sequence or slice I'm guaranteed to get 
another array out. This consistency is nice.

In [1]: A=numpy.random.rand(10)

In [2]: A[range(0,1)]
Out[2]: array([ 0.88109759])

In [3]: A[slice(0,1)]
Out[3]: array([ 0.88109759])

In [3]: A[[0]]
Out[3]: array([ 0.88109759])

However, when I index an array with an integer, I can get either a 
sequence or a scalar out.

In [4]: c1=A[0]
Out[4]: 0.88109759

In [5]: B=numpy.random.rand(5,5)

In [5]: c2=B[0]
Out[5]: array([ 0.81589633,  0.9762584 ,  0.7231,  0.12700816, 
0.40653243])

Although c1 and c2 were derived by integer-indexing two different arrays 
of doubles, one is a sequence and the other is a scalar. This lack of 
consistency might be confusing to some people, and I'd imagine it 
occasionally results in programming errors.
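
In generic code this forces a defensive pattern like the following (a
hypothetical helper, purely to illustrate):

import numpy

def first_as_array(A):
    # A[0] may yield an array or a scalar depending on A.ndim, so
    # generic code has to re-wrap the result to stay consistent.
    out = A[0]
    if not isinstance(out, numpy.ndarray):
        out = numpy.asarray(out)
    return out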

Damian

Travis E. Oliphant wrote:
 Hi everybody,
 
 In writing some generic code, I've encountered situations where it would 
 reduce code complexity to allow NumPy scalars to be indexed in the 
 same limited set of ways that 0-d arrays support.
 
 For example, 0-d arrays can be indexed with
 
 * Boolean masks
 * Ellipses x[...]  and x[..., newaxis]
 * Empty tuple x[()]
 
 I think that numpy scalars should be indexable in these particular 
 cases as well (read-only, of course; i.e., no setting of the value 
 would be possible).
 
 This is an easy change to implement, and I don't think it would cause 
 any backward compatibility issues.
 
 Any opinions from the list?
 
 
 Best regards,
 
 -Travis O.
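
For reference, a quick sketch of the 0-d behaviours in question, with
the results as I understand them:

import numpy

x = numpy.array(3.0)       # a 0-d array
x[...]                     # -> array(3.0), still a 0-d array
x[()]                      # -> 3.0, the contained scalar
x[..., numpy.newaxis]      # -> array([ 3.]), shape (1,)

s = numpy.float64(3.0)     # a NumPy scalar
# s[...], s[()], etc. currently raise an error; under the proposal they
# would return the same results as for the 0-d array x (read-only).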



Re: [Numpy-discussion] Matching 0-d arrays and NumPy scalars

2008-02-21 Thread Travis E. Oliphant
Konrad Hinsen wrote:
 On 21.02.2008, at 08:41, Francesc Altet wrote:

   
 Well, it seems like a non-intrusive modification, but I like the  
 scalars
 to remain un-indexable, mainly because it would be useful to raise an
 error when you are trying to index them.  In fact, I thought that when
 you want a kind of scalar but indexable, you should use a 0-d array.
 

 I agree. In fact, I'd rather see NumPy scalars move towards Python  
 scalars rather than towards NumPy arrays in behaviour.
A good balance should be sought.  I agree that improvements are needed, 
especially because much behavior is still just a side-effect of how 
things were implemented rather than specifically intentional.
 In particular,  
 their nasty habit of coercing everything they are combined with into  
 arrays is still my #1 source of compatibility problems with porting  
 code from Numeric to NumPy. I end up converting NumPy scalars to  
 Python scalars explicitly in lots of places.
   
This bit, for example, comes from the fact that most of the math on 
scalars still uses ufuncs for their implementation.  The numpy scalars 
could definitely use some improvements.  

However, I think my proposal for limited indexing capabilities should be 
considered separately from coercion behavior of NumPy scalars.  NumPy 
scalars are intentionally different from Python scalars, and I see this 
difference growing due to where Python itself is going.  For example, 
the int/long unification is going to change the ability for numpy.int to 
inherit from int.   I could also foresee the Python float being an 
instance of a Decimal object or some other infinite-precision float at 
some point, which would prevent inheritance for the numpy.float object.

The legitimate question is *how* different should they really be in each 
specific case.

-Travis





Re: [Numpy-discussion] finding eigenvectors etc

2008-02-21 Thread Zachary Pincus

Hi all,


How are you using the values? How significant are the differences?



I am using these eigenvectors to do PCA on a set of images (of faces).
I sort the eigenvectors in descending order of their eigenvalues, and
this is multiplied with the original data (a matrix of some images) to
obtain a facespace.


I've dealt with similar issues a lot -- performing PCA on data where  
the dimensionality of the data is much greater than the number of  
data points. (Like images.)


In this case, the maximum number of non-trivial eigenvectors of the  
covariance matrix of the data is min(dimension_of_data,  
number_of_data_points), so one always runs into the zero-eigenvalue  
problem; the matrix is thus always ill-conditioned, but that's not a  
problem in these cases.


Nevertheless, if you've got (say) 100 images that are each 100x100
pixels, to do PCA in the naive way you need to make a 10000x10000
covariance matrix and then decompose it into 10000 eigenvectors and
values just to get out the 100 non-trivial ones. That's a lot of
computation wasted calculating noise! Fortunately, there are better
ways. One is to perform the SVD on the 100x10000 data matrix. Let the
centered (mean-subtracted) data matrix be D; then the SVD provides
matrices U, S, and V'. IIRC, the eigenvectors of D'D (the covariance
matrix of interest) are then packed along the first dimension of V',
and the eigenvalues are the squares of the values in S.


But! There's an even faster way (from my testing). The trick is that  
instead of calculating the 10000x10000 outer covariance matrix D'D,  
or doing the SVD on D, one can calculate the 100x100 inner  
covariance matrix DD' and perform the eigen-decomposition thereon and  
then trivially transform those eigenvalues and vectors to the ones of  
the D'D matrix. This computation is often substantially faster than  
the SVD.


Here's how it works:
Let D, our re-centered data matrix, be of shape (n, k) -- that is, n  
data points in k dimensions.
We know that D has a singular value decomposition D = USV' (no need  
to calculate the SVD though; just enough to know it exists).

From this, we can rewrite the covariance matrices:
D'D = VS'SV'
DD' = USS'U'

Now, from the SVD, we know that S'S and SS' are diagonal matrices,  
and V and U (and V' and U') form orthogonal bases. One way to write  
the eigen-decomposition of a matrix is A = QLQ', where Q is  
orthogonal and L is diagonal. Since the eigen-decomposition is unique  
(up to a permutation of the columns of Q and the corresponding  
diagonal entries of L), we know that V must  
therefore contain the eigenvectors of D'D in its columns, and U must  
contain the eigenvectors of DD' in its columns. This is the origin of  
the SVD recipe for PCA that I gave above.


Further, let S_hat, of shape (k, n), be the elementwise reciprocal of  
S' (i.e. SS_hat = I of shape (n, n) and S_hatS = I of shape (k, k),  
where I is the identity matrix).

Then, we can solve for U or V in terms of the other:
V = D'US_hat'
U = DVS_hat

So, to get the eigenvectors and eigenvalues of D'D, we just calculate  
DD' and then apply the symmetric eigen-decomposition (symmetric  
version is faster, and DD' is symmetric) to get eigenvectors U and  
eigenvalues L. We know that L=SS', so S_hat = 1/sqrt(L) (where the  
sqrt is taken elementwise, of course). So, the eigenvectors we're  
looking for are:

V = D'US_hat
Then, the principal components (eigenvectors) are in the columns of V  
(packed along the second dimension of V).
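
In code, the recipe works out to something like this minimal sketch
(pca_inner and the cutoff for trivial eigenvalues are my own choices):

import numpy

def pca_inner(data):
    # PCA via the (n, n) inner covariance matrix DD', for n data points
    # in k >> n dimensions.  Returns the non-trivial eigenvalues and the
    # corresponding unit eigenvectors (columns of V) of the (k, k) outer
    # covariance matrix D'D, without ever forming D'D itself.
    D = data - data.mean(axis=0)                     # centered, shape (n, k)
    evals, U = numpy.linalg.eigh(numpy.dot(D, D.T))  # DD' is symmetric
    order = numpy.argsort(evals)[::-1]               # decreasing eigenvalue
    evals, U = evals[order], U[:, order]
    keep = evals > 1e-10                             # drop the trivial ones
    evals, U = evals[keep], U[:, keep]
    V = numpy.dot(D.T, U) / numpy.sqrt(evals)        # V = D'US_hat
    return evals, V

The result can be checked against numpy.linalg.svd: the eigenvalues
should match the squared singular values of D, and the columns of V the
rows of V'.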


Fortunately, I've packaged this all up into a python module for PCA  
that takes care of this all. It's attached.


Zach Pincus

Postdoctoral Fellow, Lab of Dr. Frank Slack
Molecular, Cellular and Developmental Biology
Yale University

# Copyright 2007 Zachary Pincus
# 
# This is free software; you can redistribute it and/or modify
# it under the terms of the Python License version 2.4 as published by the
# Python Software Foundation.


import numpy

def pca(data, algorithm='eig'):
  """pca(data) -> mean, pcs, norm_pcs, variances, positions, norm_positions

  Perform Principal Components Analysis on a set of n data points in k
  dimensions. The data array must be of shape (n, k).

  This function returns the mean data point, the principal components (of
  shape (p, k), where p is the number of principal components: p=min(n,k)),
  the normalized principal components, where each component is normalized by
  the data's standard deviation along that component (shape (p, k)), the
  variance each component represents (shape (p,)), the position of each data
  point along each component (shape (n, p)), and the position of each data
  point along each normalized component (shape (n, p)).

  The optional algorithm parameter can be either 'svd' to perform PCA with
  the singular value decomposition, or 'eig' to use a symmetric eigenvalue
  decomposition. Empirically, eig is faster on the datasets I have tested.
  """
  data = numpy.asarray(data)
  mean = data.mean(axis=0)
  centered = data - mean

Re: [Numpy-discussion] Matching 0-d arrays and NumPy scalars

2008-02-21 Thread Damian Eads
While we are on the subject of indexing... I use xranges all over the 
place because I tend to loop over big data sets. Thus I try to avoid 
allocating large chunks of memory unnecessarily with range. While 
I try to be careful not to let xranges propagate to the ndarray's [] 
operator, there have been a few times when I've made a mistake. Is there 
any reason why adding support for xrange indexing would be a bad thing 
to do? All one needs to do is convert the xrange to a slice object in 
__getitem__. I've written some simple code to do this conversion in 
Python (note that in C, one can access the start, end, and step of an 
xrange object very easily.)

from types import XRangeType

def xrange_to_slice(ind):
    """
    Converts an xrange object to a slice object.
    """
    if type(ind) != XRangeType:
        raise TypeError("Index must be an xrange object!")
    # Grab a string representation of the xrange object, which takes
    # any of the forms: xrange(a), xrange(a,b), xrange(a,b,s).
    # Break it apart into a, b, and s.
    sind = str(ind)
    xr_params = [int(s) for s in
                 sind[(sind.find('(') + 1):sind.find(')')].split(',')]
    return slice(*xr_params)
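
For example (with A a 1-d array), the conversion restores the usual
slice semantics:

A = numpy.random.rand(10)
A[xrange_to_slice(xrange(2, 8, 2))]    # same result as A[2:8:2]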



On another note, I think it would be great if we added support for a 
find function, which takes a boolean array A, and returns the indices 
corresponding to True, but over A's flat view. In many cases, indexing 
with a boolean array is all one needs, making find unnecessary. However, 
I've encountered cases where computing the boolean array was 
computationally burdensome, the boolean arrays were large, and the 
result was needed many times throughout the broader computation. For 
many of my problems, storing away the flat index array uses a lot less 
memory than storing the boolean index arrays.

I frequently define a function like

def find(A):
  return numpy.where(A.flat)[0]

Certainly, we'd need a find with more error checking, and one that 
handles the case when a list of booleans is passed (or a list of lists). 
Conceivably, one might try to index a non-flat array with the result of 
find. To deal with this, find could return a place holder object that 
the index operator checks for. Just an idea.
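
A version with a bit of the checking described above might look like
this (a sketch only; numpy.flatnonzero does the core of the work):

import numpy

def find(A):
    # Return the flat indices of the elements of A that are True.
    A = numpy.asarray(A)         # accepts lists and lists of lists too
    if A.dtype != bool:
        raise TypeError("find() expects a boolean array")
    return numpy.flatnonzero(A)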

--

I also think it'd be really useful to have a function that's like arange 
in that it supports floats/doubles, and also like xrange in that 
elements are only generated on demand.

It could be implemented as a generator as shown below.

def axrange(start, stop=None, step=1.0):
    if stop is None:
        stop = start
        start = 0.0
    (start, stop, step) = (numpy.float64(start), numpy.float64(stop),
                           numpy.float64(step))
    for i in xrange(int(numpy.ceil((stop - start) / step))):
        yield numpy.float64(start + step * i)

Or, as a class,

class axrangeiter:

    def __init__(self, rng):
        "An iterator over an axrange object."
        self.rng = rng
        self.i = 0

    def next(self):
        "Returns the next float in the sequence."
        if self.i >= len(self.rng):
            raise StopIteration()
        self.i += 1
        return self.rng[self.i - 1]

class axrange:

    def __init__(self, *args):
        """
        axrange(stop)
        axrange(start, stop[, step])

        An axrange object is an iterable numerical sequence between
        start and stop. As with arange, there are
        n=ceil((stop-start)/step) elements in the sequence. Elements
        are generated on demand, which can be more memory efficient.
        """
        if len(args) == 1:
            self.start = numpy.float64(0.0)
            self.stop = numpy.float64(args[0])
            self.step = numpy.float64(1.0)
        elif len(args) == 2:
            self.start = numpy.float64(args[0])
            self.stop = numpy.float64(args[1])
            self.step = numpy.float64(1.0)
        elif len(args) == 3:
            self.start = numpy.float64(args[0])
            self.stop = numpy.float64(args[1])
            self.step = numpy.float64(args[2])
        else:
            raise TypeError("axrange requires between 1 and 3 arguments.")
        self.len = max(int(numpy.ceil((self.stop - self.start) / self.step)), 0)

    def __len__(self):
        return self.len

    def __getitem__(self, i):
        return numpy.float64(self.start + self.step * i)

    def __iter__(self):
        return axrangeiter(self)

    def __repr__(self):
        if self.start == 0.0 and self.step == 1.0:
            return "axrange(%s)" % str(self.stop)
        elif self.step == 1.0:
            return "axrange(%s,%s)" % (str(self.start), str(self.stop))
        else:
            return "axrange(%s,%s,%s)" % (str(self.start), str(self.stop),
                                          str(self.step))
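
In use (values shown as plain floats for brevity):

>>> list(axrange(1.0, 2.0, 0.25))
[1.0, 1.25, 1.5, 1.75]
>>> len(axrange(5))
5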

Travis E. Oliphant wrote:
 Hi everybody,
 
 In writing some generic code, I've encountered situations where it would 
 reduce code complexity to 

Re: [Numpy-discussion] Matching 0-d arrays and NumPy scalars

2008-02-21 Thread Travis E. Oliphant
Damian Eads wrote:
 While we are on the subject of indexing... I use xranges all over the 
 place because I tend to loop over big data sets. Thus I try to avoid 
 allocating large chunks of memory unnecessarily with range. While 
 I try to be careful not to let xranges propagate to the ndarray's [] 
 operator, there have been a few times when I've made a mistake. Is there 
 any reason why adding support for xrange indexing would be a bad thing 
 to do? All one needs to do is convert the xrange to a slice object in 
 __getitem__. I've written some simple code to do this conversion in 
 Python (note that in C, one can access the start, end, and step of an 
 xrange object very easily.)
   
I think something like this could be supported.   Basically, 
interpreting an xrange object as a slice object would be my presumed 
behavior.

-Travis O.



Re: [Numpy-discussion] Matching 0-d arrays and NumPy scalars

2008-02-21 Thread Konrad Hinsen
On Feb 21, 2008, at 16:03, Travis E. Oliphant wrote:

 However, I think my proposal for limited indexing capabilities  
 should be
 considered separately from coercion behavior of NumPy scalars.  NumPy
 scalars are intentionally different from Python scalars, and I see  
 this
 difference growing due to where Python itself is going.  For example,
 the int/long unification is going to change the ability for  
 numpy.int to
 inherit from int.

True, but this is almost an implementation detail.

What I see as more fundamental is the behaviour of Python container  
objects (lists, sets, etc.). If you add an object to a container and  
then access it as an element of the container, you get the original  
object (or something that behaves like the original object) without  
any trace of the container itself. I don't see why arrays should  
behave differently from all the other Python container objects -  
certainly not because it would be rather easy to implement.
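
A quick illustration of this behaviour with a plain Python list:

>>> value = 2.5
>>> container = [value]
>>> container[0] is value
True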

NumPy has been inspired a lot by array languages like APL or Matlab.  
In those languages, everything is an array, and plain numbers that  
would be scalars elsewhere are considered 0-d arrays. Python is not  
an array language but an OO language with the more general concepts  
of containers, sequences, iterators, etc. Arrays are just one kind of  
container object among many others, so they should respect the common  
behaviours of containers.

Konrad.



Re: [Numpy-discussion] Matching 0-d arrays and NumPy scalars

2008-02-21 Thread Alan G Isaac
On Thu, 21 Feb 2008, Konrad Hinsen apparently wrote:

 What I see as more fundamental is the behaviour of Python container 
 objects (lists, sets, etc.). If you add an object to a container and 
 then access it as an element of the container, you get the original 
 object (or something that behaves like the original object) without 
 any trace of the container itself. 

I am not a CS type, but your statement seems related to
a matrix behavior that I find bothersome and unnatural::

 >>> M = N.mat('1 2;3 4')
 >>> M[0]
 matrix([[1, 2]])
 >>> M[0][0]
 matrix([[1, 2]])

I do not think anyone has really defended this behavior,
*but* the reply to me when I suggested that a matrix 
contains arrays and we should see that in its behavior
was that, no, a matrix is a container of matrices so this is 
what you get.

So a possible problem with your phrasing of the argument
(from a non-CS, user point of view)
is that it fails to address what is actually contained
(as opposed to what you might wish were contained).

Apologies if this proves OT.

Cheers,
Alan Isaac





Re: [Numpy-discussion] Matching 0-d arrays and NumPy scalars

2008-02-21 Thread Konrad Hinsen
On Feb 21, 2008, at 18:08, Alan G Isaac wrote:

 I do not think anyone has really defended this behavior,
 *but* the reply to me when I suggested that a matrix
 contains arrays and we should see that in its behavior
 was that, no, a matrix is a container of matrices so this is
 what you get.

I can't say much about matrices in NumPy as I never used them, nor  
tried to understand them. The example you give looks weird to me.

 So a possible problem with your phrasing of the argument
 (from a non-CS, user point of view)
 is that it fails to address what is actually contained
 (as opposed to what you might wish were contained).

Most Python container objects contain arbitrary objects. Arrays are  
an exception (the exception being justified by the enormous memory  
and performance gains) in that all their elements are necessarily of  
identical type. A float64 array is thus a container of float64 values.

BTW, I am not a CS type either, my background is in physics. I see  
myself on the user side as well.

Konrad.





Re: [Numpy-discussion] Matching 0-d arrays and NumPy scalars

2008-02-21 Thread Eric Firing
Travis E. Oliphant wrote:
 Hi everybody,
 
 In writing some generic code, I've encountered situations where it would 
 reduce code complexity to allow NumPy scalars to be indexed in the 
 same number of limited ways, that 0-d arrays support.
 
 For example, 0-d arrays can be indexed with
 
 * Boolean masks
 * Ellipses x[...]  and x[..., newaxis]
 * Empty tuple x[()]
 
 I think that numpy scalars should also be indexable in these particular 
 cases as well (read-only of course,  i.e. no setting of the value would 
 be possible).
 
 This is an easy change to implement, and I don't think it would cause 
 any backward compatibility issues.
 
 Any opinions from the list?
 
 
 Best regards,
 
 -Travis O.

Travis,

You have been getting mostly objections so far; maybe it would help if 
you gave a simple specific example of how your proposal would simplify code.

Eric


Re: [Numpy-discussion] Matching 0-d arrays and NumPy scalars

2008-02-21 Thread Alan G Isaac
On Thu, 21 Feb 2008, Konrad Hinsen apparently wrote:
 A float64 array is thus a container of float64 values. 

Well ... ok::

 >>> x = N.array([1,2],dtype='float')
 >>> x0 = x[0]
 >>> type(x0)
 <type 'numpy.float64'>


So a float64 value is whatever a numpy.float64 is,
and that is part of what is under discussion.
So it seems to me.

If so, then expected behavior and use cases seem relevant.

Alan

PS I agree that the posted matrix behavior is weird.
For this and other reasons I think it hurts the matrix 
object, and I have requested that it change ...





Re: [Numpy-discussion] Matching 0-d arrays and NumPy scalars

2008-02-21 Thread Charles R Harris
On Thu, Feb 21, 2008 at 12:30 PM, Travis E. Oliphant [EMAIL PROTECTED]
wrote:


  Travis,
 
  You have been getting mostly objections so far;
 I wouldn't characterize it that way, but yes 2 people have pushed back a
 bit, although one not directly speaking to the proposed behavior.


I need to think about it a lot more, but my initial reaction is also
negative. On general principle, I think scalars should be different from
arrays. Perhaps you could give some concrete examples of why you want the
new behavior? Perhaps there will be other approaches that would achieve the
same end.

Chuck


Re: [Numpy-discussion] Matching 0-d arrays and NumPy scalars

2008-02-21 Thread Stefan van der Walt
On Thu, Feb 21, 2008 at 12:08:32PM -0500, Alan G Isaac wrote:
 On Thu, 21 Feb 2008, Konrad Hinsen apparently wrote:
 
  What I see as more fundamental is the behaviour of Python container 
  objects (lists, sets, etc.). If you add an object to a container and 
  then access it as an element of the container, you get the original 
  object (or something that behaves like the original object) without 
  any trace of the container itself. 
 
 I am not a CS type, but your statement seems related to
 a matrix behavior that I find bothersome and unnatural::
 
 >>> M = N.mat('1 2;3 4')
 >>> M[0]
 matrix([[1, 2]])
 >>> M[0][0]
 matrix([[1, 2]])

This is exactly what I would expect for matrices: M[0] is the first
row of the matrix.  Note that you don't see this behaviour for
ndarrays, since those don't insist on having a minimum of
two dimensions.

In [2]: x = np.arange(12).reshape((3,4))

In [3]: x
Out[3]: 
array([[ 0,  1,  2,  3],
   [ 4,  5,  6,  7],
   [ 8,  9, 10, 11]])

In [4]: x[0][0]
Out[4]: 0

In [5]: x[0]
Out[5]: array([0, 1, 2, 3])

Regards
Stefan


Re: [Numpy-discussion] Matching 0-d arrays and NumPy scalars

2008-02-21 Thread Stefan van der Walt
Hi Travis,

On Wed, Feb 20, 2008 at 10:14:07PM -0600, Travis E. Oliphant wrote:
 In writing some generic code, I've encountered situations where it would 
 reduce code complexity to allow NumPy scalars to be indexed in the 
 same number of limited ways, that 0-d arrays support.


 For example, 0-d arrays can be indexed with
 
 * Boolean masks

I've tried to use this before, but an IndexError (0-d arrays can't be
indexed) is raised.

 * Ellipses x[...]  and x[..., newaxis]

This, especially, seems like it could be very useful.

 This is an easy change to implement, and I don't think it would cause 
 any backward compatibility issues.
 
 Any opinions from the list?

This is maybe a fairly esoteric use case, but one I can imagine coming
across.  I'm in favour of implementing the change.

Could I ask that we also consider implementing len() for 0-d arrays?
numpy.asarray returns those as-is, and I would like to be able to
handle them just as I do any other 1-dimensional array.  I don't know
if a length of 1 would be valid, given a shape of (), but there must
be some consistent way of handling them.
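
For concreteness, the current behaviour (traceback abbreviated):

>>> x = numpy.asarray(3.0)
>>> x.shape
()
>>> len(x)
TypeError: len() of unsized object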

Regards
Stefan


Re: [Numpy-discussion] Matching 0-d arrays and NumPy scalars

2008-02-21 Thread Anne Archibald
On 21/02/2008, Stefan van der Walt [EMAIL PROTECTED] wrote:

  Could I ask that we also consider implementing len() for 0-d arrays?
  numpy.asarray returns those as-is, and I would like to be able to
  handle them just as I do any other 1-dimensional array.  I don't know
  if a length of 1 would be valid, given a shape of (), but there must
  be some consistent way of handling them.

Well, if the length of an array is the product of all its sizes, the
product of no things is customarily defined to be one... whether that
is actually a useful value is another question.

Anne


Re: [Numpy-discussion] matrix wart

2008-02-21 Thread Alan G Isaac
 On Thu, Feb 21, 2008 at 12:08:32PM -0500, Alan G Isaac wrote:
 a matrix behavior that I find bothersome and unnatural::

 >>> M = N.mat('1 2;3 4')
 >>> M[0]
 matrix([[1, 2]])
 >>> M[0][0]
 matrix([[1, 2]])


On Fri, 22 Feb 2008, Stefan van der Walt apparently wrote:
 This is exactly what I would expect for matrices: M[0] is 
 the first row of the matrix.

Define what "first row" means!
There is no standard definition that says this means the
**submatrix** that can be created from the first row.
Someone once pointed out on this list that one might 
consider a matrix to be a container of 1d vectors.  For NumPy, 
however, it is natural that it be a container of 1d arrays.
(See the discussion for the distinction.)

Imagine if a 2d array behaved this way.  Ugh!
Note that it too is 2d; you could have the same 
expectation based on its 2d-ness.  Why don't you?

You expect this matrix behavior only from experience with 
it, which is why I expect it too, while hating it. It is 
not what new users will expect, and, as Konrad noted, it is 
very odd behavior to treat a matrix as a container of 
matrices. You only come to expect such behavior by learning 
it through use, which is undesirable.

Nobody has objected to returning matrices when getitem is 
fed multiple arguments: these are naturally interpreted as 
requests for submatrices.  M[0][0] and M[:1,:1] are very 
different kinds of requests: the first should return the 0,0
element but does not, while M[0,0] does!  Bizarre!
How to guess??  If you teach, do your students expect this 
behavior?  Mine don't!

This is a wart.

The example really speaks for itself. Since Konrad is an 
extremely experienced user/developer, his reaction should 
speak volumes.

Cheers,
Alan Isaac





[Numpy-discussion] change memmap.sync function

2008-02-21 Thread Christopher Burns
Would anyone oppose deprecating the memmap.sync function and replacing
it with memmap.flush?  This would match python's mmap module, and I
think be more intuitive.
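
For comparison, a small sketch of the two spellings ('scratch.dat' is
just a placeholder file name):

import numpy

m = numpy.memmap('scratch.dat', dtype='float64', mode='w+', shape=(100,))
m[:10] = 1.0
m.sync()    # today; under the proposal this would be m.flush(),
            # matching mmap.mmap.flush in the standard library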

-- 
Christopher Burns, Software Engineer
Computational Infrastructure for Research Labs
10 Giannini Hall, UC Berkeley
phone: 510.643.4014
http://cirl.berkeley.edu/


Re: [Numpy-discussion] Matching 0-d arrays and NumPy scalars

2008-02-21 Thread Eric Firing
Travis E. Oliphant wrote:
 Travis,

 You have been getting mostly objections so far;
 I wouldn't characterize it that way, but yes 2 people have pushed back a 
 bit, although one not directly speaking to the proposed behavior.  
 
 The issue is that [] notation does more than just select from a 
 container for NumPy arrays.   In particular, it is used to reshape an 
 array to more dimensions:  [..., newaxis]
 
 A common pattern is to reduce over a dimension and then re-shape the 
 result so that it can be combined with the un-reduced object.  
 Broadcasting makes this work if the dimension being reduced along is the 
 first dimension.  But, broadcasting is not enough if you want the 
 reduction dimension to be arbitrary:
 
 Thus,
 
 y = add.reduce(x, axis=-1)  produces an N-1 array if x is 2-d and a 
 numpy scalar if x is 1-d.

Why does it produce a scalar instead of a 0-d array?  Wouldn't the 
latter take care of your use case, and be consistent with the action of 
reduce in removing one dimension?

I'm not opposed to your suggested change--just trying to understand it. 
I'm certainly sympathetic to your use case, below. I dimly recall 
extensive and confusing (to me) discussions of numpy scalars versus 0-d 
arrays during your heroic push to make numpy gel, and I suspect the 
answer is somewhere back in those discussions.

Eric

 
 Suppose y needs to be subtracted from x. 
 
 If x is 2-d, then
 
   x - y[...,newaxis]
 
 is the needed code. But, if x is 1-d,  then
 
   x - y[..., newaxis]
 
 returns an error  and a check must be done to handle the case 
 separately.  If y[..., newaxis]  worked and produced a 1-d array when y 
 was a numpy scalar, this could be avoided.
 
 
 -Travis O.
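
A concrete sketch of the pattern being described (demean is a
hypothetical helper):

import numpy

def demean(x):
    # Subtract the mean along the last axis, keeping shapes compatible.
    y = numpy.add.reduce(x, axis=-1) / x.shape[-1]
    if isinstance(y, numpy.ndarray):
        return x - y[..., numpy.newaxis]
    # y is a numpy scalar: the special case the proposal would remove.
    return x - y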
 
 



Re: [Numpy-discussion] Matching 0-d arrays and NumPy scalars

2008-02-21 Thread Konrad Hinsen
On 21.02.2008, at 18:40, Alan G Isaac wrote:

 >>> x = N.array([1,2],dtype='float')
 >>> x0 = x[0]
 >>> type(x0)
 <type 'numpy.float64'>


 So a float64 value is whatever a numpy.float64 is,
 and that is part of what is under discussion.

numpy.float64 is a very recent invention. During the first decade of  
numerical arrays in Python (Numeric), type(x0) was the standard Python  
float type. And even today, what you put into an array (via the array  
constructor or by assignment) is Python scalar objects, mostly int,  
float, and complex.

The reason for defining special types for the scalar elements of  
arrays was efficiency considerations. Python has only a single float  
type, there is no distinction between single and double precision.  
Extracting an array element would thus always yield a double  
precision float, and adding it to a single-precision array would  
yield a double precision result, meaning that it was extremely  
difficult to maintain single-precision storage across array  
arithmetic. For huge arrays, that was a serious problem.

However, the intention was always to have numpy's scalar objects  
behave as similarly as possible to Python scalars. Ideally,  
application code should not see a difference at all. This was largely  
successful, with the notable exception of the coercion problem that I  
mentioned a few mails ago.

Konrad.



Re: [Numpy-discussion] matrix wart

2008-02-21 Thread Konrad Hinsen
On 22.02.2008, at 01:10, Alan G Isaac wrote:

 Someone once pointed out on this list that one might
 consider a matrix to be a container of 1d vectors.  For NumPy,
 however, it is natural that it be a container of 1d arrays.
 (See the discussion for the distinction.)

If I were to design a Pythonic implementation of the mathematical  
concept of a matrix, I'd implement three classes: Matrix,  
ColumnVector, and RowVector. It would work like this:

m = Matrix([[1, 2], [3, 4]])

m[0, :]  ->  RowVector([1, 2])
m[:, 0]  ->  ColumnVector([1, 3])

m[0, 0]  ->  1  # scalar

m.shape  ->  (2, 2)
m[0].shape  ->  (2,)
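
In rough Python, such a design might look like this (hypothetical
classes, just to make the dispatch concrete):

import numpy

class RowVector:
    def __init__(self, data):
        self.data = numpy.asarray(data)
        self.shape = self.data.shape

class ColumnVector:
    def __init__(self, data):
        self.data = numpy.asarray(data)
        self.shape = self.data.shape

class Matrix:
    def __init__(self, data):
        self.data = numpy.asarray(data)
        self.shape = self.data.shape

    def __getitem__(self, index):
        result = self.data[index]
        if result.ndim == 0:
            return result.item()     # m[0, 0] -> a plain scalar
        if result.ndim == 1:
            if (isinstance(index, tuple) and len(index) == 2
                    and isinstance(index[1], int)):
                return ColumnVector(result)   # m[:, 0] -> a column
            return RowVector(result)          # m[0] or m[0, :] -> a row
        return Matrix(result)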

However, the matrix implementation in Numeric was inspired by Matlab,  
where everything is a matrix. But as I said before, Python is not  
Matlab.

Konrad.