Re: [Numpy-discussion] Matching 0-d arrays and NumPy scalars
In MATLAB, scalars are 1x1 arrays, and thus they can be indexed. There have been situations in my use of NumPy when I would have liked to index scalars to make my code more general. It's not a very pressing issue for me, but it is an interesting one.

Whenever I index an array with a sequence or slice, I'm guaranteed to get another array out. This consistency is nice.

In [1]: A=numpy.random.rand(10)

In [2]: A[range(0,1)]
Out[2]: array([ 0.88109759])

In [3]: A[slice(0,1)]
Out[3]: array([ 0.88109759])

In [3]: A[[0]]
Out[3]: array([ 0.88109759])

However, when I index an array with an integer, I can get either a sequence or a scalar out.

In [4]: c1=A[0]
Out[4]: 0.88109759

In [5]: B=numpy.random.rand(5,5)

In [5]: c2=B[0]
Out[5]: array([ 0.81589633, 0.9762584 , 0.7231, 0.12700816, 0.40653243])

Although c1 and c2 were derived by integer-indexing two different arrays of doubles, one is a sequence and the other is a scalar. This lack of consistency might be confusing to some people, and I'd imagine it occasionally results in programming errors.

Damian

Travis E. Oliphant wrote:
> Hi everybody,
>
> In writing some generic code, I've encountered situations where it would reduce code complexity to allow NumPy scalars to be indexed in the same number of limited ways, that 0-d arrays support.
>
> For example, 0-d arrays can be indexed with
>
> * Boolean masks
> * Ellipses x[...] and x[..., newaxis]
> * Empty tuple x[()]
>
> I think that numpy scalars should also be indexable in these particular cases as well (read-only of course, i.e. no setting of the value would be possible).
>
> This is an easy change to implement, and I don't think it would cause any backward compatibility issues.
>
> Any opinions from the list?
>
> Best regards,
>
> -Travis O.

___ Numpy-discussion mailing list Numpy-discussion@scipy.org http://projects.scipy.org/mailman/listinfo/numpy-discussion
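The asymmetry described in this message can be reproduced with a short sketch. It assumes a current NumPy under Python 3; the variable names are illustrative and the arrays are filled with arange rather than random values so the result is repeatable.

```python
import numpy as np

A = np.arange(10, dtype=float)                 # 1-d array
B = np.arange(25, dtype=float).reshape(5, 5)   # 2-d array

# Indexing with a slice or a sequence always returns an array.
by_slice = A[0:1]
by_list = A[[0]]

# Indexing with an integer removes one dimension, so the result is
# a NumPy scalar for a 1-d array and a 1-d array for a 2-d array.
c1 = A[0]
c2 = B[0]

print(type(by_slice), type(by_list), type(c1), type(c2))
```

Whether c1 should be indexable like a 0-d array is exactly the question the thread goes on to debate.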
Re: [Numpy-discussion] Matching 0-d arrays and NumPy scalars
Konrad Hinsen wrote:
> On 21.02.2008, at 08:41, Francesc Altet wrote:
>
>> Well, it seems like a non-intrusive modification, but I like the scalars to remain un-indexable, mainly because it would be useful to raise an error when you are trying to index them. In fact, I thought that when you want a kind of scalar but indexable, you should use a 0-d array.
>
> I agree. In fact, I'd rather see NumPy scalars move towards Python scalars rather than towards NumPy arrays in behaviour.

A good balance should be sought. I agree that improvements are needed, especially because much behavior is still just a side-effect of how things were implemented rather than specifically intentional.

> In particular, their nasty habit of coercing everything they are combined with into arrays is still my #1 source of compatibility problems with porting code from Numeric to NumPy. I end up converting NumPy scalars to Python scalars explicitly in lots of places.

This bit, for example, comes from the fact that most of the math on scalars still uses ufuncs for their implementation. The numpy scalars could definitely use some improvements.

However, I think my proposal for limited indexing capabilities should be considered separately from coercion behavior of NumPy scalars. NumPy scalars are intentionally different from Python scalars, and I see this difference growing due to where Python itself is going. For example, the int/long unification is going to change the ability for numpy.int to inherit from int.

I could also foresee the Python float being an instance of a Decimal object or some other infinite precision float at some point, which would prevent inheritance for the numpy.float object.

The legitimate question is *how* different should they really be in each specific case.

-Travis
Re: [Numpy-discussion] finding eigenvectors etc
Hi all,

>> How are you using the values? How significant are the differences?
>
> i am using these eigenvectors to do PCA on a set of images(of faces).I sort the eigenvectors in descending order of their eigenvalues and this is multiplied with the (orig data of some images viz a matrix)to obtain a facespace.

I've dealt with similar issues a lot -- performing PCA on data where the dimensionality of the data is much greater than the number of data points. (Like images.)

In this case, the maximum number of non-trivial eigenvectors of the covariance matrix of the data is min(dimension_of_data, number_of_data_points), so one always runs into the zero-eigenvalue problem; the matrix is thus always ill-conditioned, but that's not a problem in these cases.

Nevertheless, if you've got (say) 100 images that are each 100x100 pixels, to do PCA in the naive way you need to make a 10000x10000 covariance matrix and then decompose it into 10000 eigenvectors and values just to get out the 100 non-trivial ones. That's a lot of computation wasted calculating noise! Fortunately, there are better ways.

One is to perform the SVD on the 100x10000 data matrix. Let the centered (mean-subtracted) data matrix be D, then the SVD provides matrices U, S, and V'. IIRC, the eigenvectors of D'D (the covariance matrix of interest) are then packed along the first dimension of V', and the eigenvalues are the square of the values in S.

But! There's an even faster way (from my testing). The trick is that instead of calculating the 10000x10000 outer covariance matrix D'D, or doing the SVD on D, one can calculate the 100x100 inner covariance matrix DD' and perform the eigen-decomposition thereon and then trivially transform those eigenvalues and vectors to the ones of the D'D matrix. This computation is often substantially faster than the SVD.

Here's how it works: Let D, our re-centered data matrix, be of shape (n, k) -- that is, n data points in k dimensions.
We know that D has a singular value decomposition D = USV' (no need to calculate the SVD though; it is enough to know that it exists). From this, we can rewrite the covariance matrices:

D'D = VS'SV'
DD' = USS'U'

Now, from the SVD, we know that S'S and SS' are diagonal matrices, and V and U (and V' and U') form orthogonal bases. One way to write the eigen-decomposition of a matrix is A = QLQ', where Q is orthogonal and L is diagonal. Since the eigen-decomposition is unique (up to a permutation of the columns of Q and L), we know that V must therefore contain the eigenvectors of D'D in its columns, and U must contain the eigenvectors of DD' in its columns. This is the origin of the SVD recipe that I gave above.

Further, let S_hat, of shape (n, k), be the elementwise reciprocal of the non-zero entries of S (i.e. SS_hat' = I of shape (n, n) and S'S_hat = I of shape (k, k) on the non-trivial components, where I is the identity matrix). Then, we can solve for U or V in terms of the other:

V = D'US_hat
U = DVS_hat'

So, to get the eigenvectors and eigenvalues of D'D, we just calculate DD' and then apply the symmetric eigen-decomposition (symmetric version is faster, and DD' is symmetric) to get eigenvectors U and eigenvalues L. We know that L=SS', so S_hat = 1/sqrt(L) (where the sqrt is taken elementwise, of course). So, the eigenvectors we're looking for are:

V = D'US_hat

Then, the principal components (eigenvectors) are in the columns of V (packed along the second dimension of V).

Fortunately, I've packaged this all up into a python module for PCA that takes care of this all. It's attached.

Zach Pincus

Postdoctoral Fellow, Lab of Dr. Frank Slack
Molecular, Cellular and Developmental Biology
Yale University

# Copyright 2007 Zachary Pincus
#
# This is free software; you can redistribute it and/or modify
# it under the terms of the Python License version 2.4 as published by the
# Python Software Foundation.
import numpy

def pca(data, algorithm='eig'):
    """pca(data) -> mean, pcs, norm_pcs, variances, positions, norm_positions

    Perform Principal Components Analysis on a set of n data points in k
    dimensions. The data array must be of shape (n, k).

    This function returns the mean data point, the principal components (of
    shape (p, k), where p is the number of principal components: p=min(n,k)),
    the normalized principal components, where each component is normalized by
    the data's standard deviation along that component (shape (p, k)), the
    variance each component represents (shape (p,)), the position of each data
    point along each component (shape (n, p)), and the position of each data
    point along each normalized component (shape (n, p)).

    The optional algorithm parameter can be either 'svd' to perform PCA with
    the singular value decomposition, or 'eig' to use a symmetric eigenvalue
    decomposition. Empirically, eig is faster on the datasets I have tested.
    """
    data = numpy.asarray(data)
    mean = data.mean(axis = 0)
    centered = data -
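The inner-covariance trick described in this message can be sketched in a few lines. This is not the attached module, which is cut off above; pca_inner, the tolerance of 1e-12 and the random test data are all assumptions made for illustration, and the sketch targets a current NumPy under Python 3.

```python
import numpy as np

def pca_inner(data):
    """Hypothetical helper: PCA via the small (n, n) matrix DD'.

    Returns the non-trivial eigenvalues of D'D (descending) and the
    matching eigenvectors in the columns of V.
    """
    data = np.asarray(data, dtype=float)
    centered = data - data.mean(axis=0)        # D, shape (n, k)
    inner = centered @ centered.T              # DD', shape (n, n)
    evals, U = np.linalg.eigh(inner)           # symmetric solver, ascending
    order = np.argsort(evals)[::-1]            # reorder to descending
    evals, U = evals[order], U[:, order]
    # Centering removes one degree of freedom, so at least one
    # eigenvalue is numerically zero; drop those components.
    keep = evals > evals[0] * 1e-12            # assumed tolerance
    evals, U = evals[keep], U[:, keep]
    V = centered.T @ U / np.sqrt(evals)        # V = D' U S_hat
    return evals, V

rng = np.random.default_rng(0)
data = rng.standard_normal((5, 50))            # n=5 points, k=50 dimensions
evals, V = pca_inner(data)
print(V.shape)
```

Only a 5x5 matrix is decomposed here instead of a 50x50 one; the division by sqrt(evals) plays the role of S_hat in the derivation.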
Re: [Numpy-discussion] Matching 0-d arrays and NumPy scalars
While we are on the subject of indexing... I use xranges all over the place because I tend to loop over big data sets. Thus I try to avoid allocating large chunks of memory unnecessarily with range. While I try to be careful not to let xranges propagate to the ndarray's [] operator, there have been a few times when I've made a mistake. Is there any reason why adding support for xrange indexing would be a bad thing to do? All one needs to do is convert the xrange to a slice object in __getitem__. I've written some simple code to do this conversion in Python (note that in C, one can access the start, end, and step of an xrange object very easily.)

from types import XRangeType

def xrange_to_slice(ind):
    """
    Converts an xrange object to a slice object.
    """
    retval = slice(None, None, None)

    if type(ind) == XRangeType:
        # Grab a string representation of the xrange object, which takes
        # any of the forms: xrange(a), xrange(a,b), xrange(a,b,s).
        # Break it apart into a, b, and s.
        sind = str(ind)
        xr_params = [int(s) for s in
                     sind[(sind.find('(')+1):sind.find(')')].split(',')]
        retval = apply(slice, xr_params)
    else:
        raise TypeError("Index must be an xrange object!")
    #endif

    return retval

On another note, I think it would be great if we added support for a find function, which takes a boolean array A, and returns the indices corresponding to True, but over A's flat view. In many cases, indexing with a boolean array is all one needs, making find unnecessary. However, I've encountered cases where computing the boolean array was computationally burdensome, the boolean arrays were large, and the result was needed many times throughout the broader computation. For many of my problems, storing away the flat index array uses a lot less memory than storing the boolean index arrays.

I frequently define a function like

def find(A):
    return numpy.where(A.flat)[0]

Certainly, we'd need a find with more error checking, and one that handles the case when a list of booleans is passed (or a list of lists).
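The flat-index idea can be tried with functions that exist in current NumPy. The sketch below is not the proposed find; it uses numpy.flatnonzero for the flat indices and numpy.unravel_index to map them back to per-axis indices, and the array contents are made up for illustration.

```python
import numpy as np

A = np.array([[0, 3, 0],
              [4, 0, 5]])
mask = A > 0                       # boolean array, same shape as A

# Flat indices of the True entries; this small integer array can be
# stored in place of the full boolean mask.
idx = np.flatnonzero(mask)

# Index through the flat view ...
values = A.flat[idx]

# ... or convert back to per-axis indices for the non-flat array.
rows, cols = np.unravel_index(idx, A.shape)
same_values = A[rows, cols]

print(idx, values, same_values)
```

Going through A.flat or unravel_index sidesteps the problem of indexing a non-flat array directly with flat indices.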
Conceivably, one might try to index a non-flat array with the result of find. To deal with this, find could return a place holder object that the index operator checks for. Just an idea.

--

I also think it'd be really useful to have a function that's like arange in that it supports floats/doubles, and also like xrange in that elements are only generated on demand.

It could be implemented as a generator as shown below.

def axrange(start, stop=None, step=1.0):
    if stop is None:
        stop = start
        start = 0.0
    #endif

    (start, stop, step) = (numpy.float64(start), numpy.float64(stop),
                           numpy.float64(step))

    # xrange needs an integer count, so convert the result of ceil.
    for i in xrange(0, int(numpy.ceil((stop-start)/step))):
        yield numpy.float64(start + step * i)
    #endfor

Or, as a class,

class axrangeiter:

    def __init__(self, rng):
        "An iterator over an axrange object."
        self.rng = rng
        self.i = 0

    def next(self):
        "Returns the next float in the sequence."
        if self.i >= len(self.rng):
            raise StopIteration()
        self.i += 1
        return self.rng[self.i-1]

class axrange:

    def __init__(self, *args):
        """
        axrange(stop)
        axrange(start, stop, [step])

        An axrange object is an iterable numerical sequence between
        start and stop. Similar to arange, there are
        n=ceil((stop-start)/step) elements in the sequence. Elements
        are generated on demand, which can be more memory efficient.
        """
        if len(args) == 1:
            self.start = numpy.float64(0.0)
            self.stop = numpy.float64(args[0])
            self.step = numpy.float64(1.0)
        elif len(args) == 2:
            self.start = numpy.float64(args[0])
            self.stop = numpy.float64(args[1])
            self.step = numpy.float64(1.0)
        elif len(args) == 3:
            self.start = numpy.float64(args[0])
            self.stop = numpy.float64(args[1])
            self.step = numpy.float64(args[2])
        else:
            raise TypeError("axrange requires 1 to 3 arguments.")
        #endif
        self.len = max(int(numpy.ceil((self.stop-self.start)/self.step)),0)

    def __len__(self):
        return self.len

    def __getitem__(self, i):
        return numpy.float64(self.start + self.step * i)

    def __iter__(self):
        return axrangeiter(self)

    def __repr__(self):
        if self.start == 0.0 and self.step == 1.0:
            return "axrange(%s)" % str(self.stop)
        elif self.step == 1.0:
            return "axrange(%s,%s)" % (str(self.start), str(self.stop))
        else:
            return "axrange(%s,%s,%s)" % (str(self.start), str(self.stop), str(self.step))
        #endif

Travis E. Oliphant wrote:
> Hi everybody,
>
> In writing some generic code, I've encountered situations where it would reduce code complexity to
Re: [Numpy-discussion] Matching 0-d arrays and NumPy scalars
Damian Eads wrote:
> While we are on the subject of indexing... I use xranges all over the place because I tend to loop over big data sets. Thus I try to avoid allocating large chunks of memory unnecessarily with range. While I try to be careful not to let xranges propagate to the ndarray's [] operator, there have been a few times when I've made a mistake. Is there any reason why adding support for xrange indexing would be a bad thing to do? All one needs to do is convert the xrange to a slice object in __getitem__. I've written some simple code to do this conversion in Python (note that in C, one can access the start, end, and step of an xrange object very easily.)

I think something like this could be supported. Basically, interpreting an xrange object as a slice object would be my presumed behavior.

-Travis O.
Re: [Numpy-discussion] Matching 0-d arrays and NumPy scalars
On Feb 21, 2008, at 16:03, Travis E. Oliphant wrote:
> However, I think my proposal for limited indexing capabilities should be considered separately from coercion behavior of NumPy scalars. NumPy scalars are intentionally different from Python scalars, and I see this difference growing due to where Python itself is going. For example, the int/long unification is going to change the ability for numpy.int to inherit from int.

True, but this is almost an implementation detail. What I see as more fundamental is the behaviour of Python container objects (lists, sets, etc.). If you add an object to a container and then access it as an element of the container, you get the original object (or something that behaves like the original object) without any trace of the container itself. I don't see why arrays should behave differently from all the other Python container objects - certainly not because it would be rather easy to implement.

NumPy has been inspired a lot by array languages like APL or Matlab. In those languages, everything is an array, and plain numbers that would be scalars elsewhere are considered 0-d arrays. Python is not an array language but an OO language with the more general concepts of containers, sequences, iterators, etc. Arrays are just one kind of container object among many others, so they should respect the common behaviours of containers.

Konrad.
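The container argument can be made concrete with a small comparison. This is only an illustration under a current NumPy; the variable names are made up.

```python
import numpy as np

value = 1.5

# A list hands back the very object that was put into it.
items = [value]
same_object = items[0] is value

# An array stores the raw number; indexing builds a NumPy scalar that
# "behaves like the original object" in most respects.
arr = np.array([value])
element = arr[0]
element_type = type(element)
behaves_like_float = isinstance(element, float)

print(same_object, element_type, behaves_like_float)
```

numpy.float64 inherits from the Python float, which is why the isinstance check succeeds even though the element is not the original object.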
Re: [Numpy-discussion] Matching 0-d arrays and NumPy scalars
On Thu, 21 Feb 2008, Konrad Hinsen apparently wrote:
> What I see as more fundamental is the behaviour of Python container objects (lists, sets, etc.). If you add an object to a container and then access it as an element of the container, you get the original object (or something that behaves like the original object) without any trace of the container itself.

I am not a CS type, but your statement seems related to a matrix behavior that I find bothersome and unnatural::

    >>> M = N.mat('1 2;3 4')
    >>> M[0]
    matrix([[1, 2]])
    >>> M[0][0]
    matrix([[1, 2]])

I do not think anyone has really defended this behavior, *but* the reply to me when I suggested that a matrix contains arrays and we should see that in its behavior was that, no, a matrix is a container of matrices so this is what you get.

So a possible problem with your phrasing of the argument (from a non-CS, user point of view) is that it fails to address what is actually contained (as opposed to what you might wish were contained).

Apologies if this proves OT.

Cheers,
Alan Isaac
Re: [Numpy-discussion] Matching 0-d arrays and NumPy scalars
On Feb 21, 2008, at 18:08, Alan G Isaac wrote:
> I do not think anyone has really defended this behavior, *but* the reply to me when I suggested that a matrix contains arrays and we should see that in its behavior was that, no, a matrix is a container of matrices so this is what you get.

I can't say much about matrices in NumPy as I never used them, nor tried to understand them. The example you give looks weird to me.

> So a possible problem with your phrasing of the argument (from a non-CS, user point of view) is that it fails to address what is actually contained (as opposed to what you might wish were contained).

Most Python container objects contain arbitrary objects. Arrays are an exception (the exception being justified by the enormous memory and performance gains) in that all their elements are necessarily of identical type. A float64 array is thus a container of float64 values.

BTW, I am not a CS type either, my background is in physics. I see myself on the user side as well.

Konrad.
Re: [Numpy-discussion] Matching 0-d arrays and NumPy scalars
Travis E. Oliphant wrote:
> Hi everybody,
>
> In writing some generic code, I've encountered situations where it would reduce code complexity to allow NumPy scalars to be indexed in the same number of limited ways, that 0-d arrays support.
>
> For example, 0-d arrays can be indexed with
>
> * Boolean masks
> * Ellipses x[...] and x[..., newaxis]
> * Empty tuple x[()]
>
> I think that numpy scalars should also be indexable in these particular cases as well (read-only of course, i.e. no setting of the value would be possible).
>
> This is an easy change to implement, and I don't think it would cause any backward compatibility issues.
>
> Any opinions from the list?
>
> Best regards,
>
> -Travis O.

Travis,

You have been getting mostly objections so far; maybe it would help if you gave a simple specific example of how your proposal would simplify code.

Eric
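For reference, two of the indexing forms listed in the quoted proposal behave as follows on a 0-d array. This sketch assumes a current NumPy and deliberately shows only the 0-d array side; it makes no claim about how NumPy scalars behaved at the time of the thread.

```python
import numpy as np

x = np.array(5.0)              # 0-d array, shape ()

as_scalar = x[()]              # empty tuple: extracts a NumPy scalar
as_0d = x[...]                 # ellipsis: still a 0-d array
as_1d = x[..., np.newaxis]     # ellipsis plus newaxis: shape (1,)

print(type(as_scalar), as_0d.shape, as_1d.shape)
```

The proposal amounts to letting the same three expressions work when x is a NumPy scalar rather than a 0-d array.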
Re: [Numpy-discussion] Matching 0-d arrays and NumPy scalars
On Thu, 21 Feb 2008, Konrad Hinsen apparently wrote:
> A float64 array is thus a container of float64 values.

Well ... ok::

    >>> x = N.array([1,2],dtype='float')
    >>> x0 = x[0]
    >>> type(x0)
    <type 'numpy.float64'>

So a float64 value is whatever a numpy.float64 is, and that is part of what is under discussion. So it seems to me. If so, then expected behavior and use cases seem relevant.

Alan

PS I agree that the posted matrix behavior is weird. For this and other reasons I think it hurts the matrix object, and I have requested that it change ...
Re: [Numpy-discussion] Matching 0-d arrays and NumPy scalars
On Thu, Feb 21, 2008 at 12:30 PM, Travis E. Oliphant wrote:
>> Travis, You have been getting mostly objections so far;
>
> I wouldn't characterize it that way, but yes 2 people have pushed back a bit, although one not directly speaking to the proposed behavior.

I need to think about it a lot more, but my initial reaction is also negative. On general principle, I think scalars should be different from arrays. Perhaps you could give some concrete examples of why you want the new behavior? Perhaps there will be other approaches that would achieve the same end.

Chuck
Re: [Numpy-discussion] Matching 0-d arrays and NumPy scalars
On Thu, Feb 21, 2008 at 12:08:32PM -0500, Alan G Isaac wrote:
> On Thu, 21 Feb 2008, Konrad Hinsen apparently wrote:
>> What I see as more fundamental is the behaviour of Python container objects (lists, sets, etc.). If you add an object to a container and then access it as an element of the container, you get the original object (or something that behaves like the original object) without any trace of the container itself.
>
> I am not a CS type, but your statement seems related to a matrix behavior that I find bothersome and unnatural::
>
>     >>> M = N.mat('1 2;3 4')
>     >>> M[0]
>     matrix([[1, 2]])
>     >>> M[0][0]
>     matrix([[1, 2]])

This is exactly what I would expect for matrices: M[0] is the first row of the matrix. Note that you don't see this behaviour for ndarrays, since those don't insist on having a minimum of 2 dimensions.

In [2]: x = np.arange(12).reshape((3,4))

In [3]: x
Out[3]:
array([[ 0,  1,  2,  3],
       [ 4,  5,  6,  7],
       [ 8,  9, 10, 11]])

In [4]: x[0][0]
Out[4]: 0

In [5]: x[0]
Out[5]: array([0, 1, 2, 3])

Regards
Stefan
Re: [Numpy-discussion] Matching 0-d arrays and NumPy scalars
Hi Travis,

On Wed, Feb 20, 2008 at 10:14:07PM -0600, Travis E. Oliphant wrote:
> In writing some generic code, I've encountered situations where it would reduce code complexity to allow NumPy scalars to be indexed in the same number of limited ways, that 0-d arrays support.
>
> For example, 0-d arrays can be indexed with
>
> * Boolean masks

I've tried to use this before, but an IndexError (0-d arrays can't be indexed) is raised.

> * Ellipses x[...] and x[..., newaxis]

This, especially, seems like it could be very useful.

> This is an easy change to implement, and I don't think it would cause any backward compatibility issues.
>
> Any opinions from the list?

This is maybe a fairly esoteric use case, but one I can imagine coming across. I'm in favour of implementing the change.

Could I ask that we also consider implementing len() for 0-d arrays? numpy.asarray returns those as-is, and I would like to be able to handle them just as I do any other 1-dimensional array. I don't know if a length of 1 would be valid, given a shape of (), but there must be some consistent way of handling them.

Regards
Stefan
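The len() issue raised here can be reproduced as follows, along with one common workaround. The sketch assumes a current NumPy; numpy.atleast_1d is an existing function, but using it is a suggestion made here, not something proposed in the thread.

```python
import numpy as np

zero_d = np.asarray(5.0)           # asarray returns a 0-d array for a scalar

# len() is undefined for a 0-d array and raises TypeError.
try:
    len(zero_d)
    has_len = True
except TypeError:
    has_len = False

# Promoting to at least one dimension gives something with a length,
# so the value can be handled like any other 1-d array.
promoted = np.atleast_1d(zero_d)

print(has_len, promoted.shape, len(promoted))
```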
Re: [Numpy-discussion] Matching 0-d arrays and NumPy scalars
On 21/02/2008, Stefan van der Walt wrote:
> Could I ask that we also consider implementing len() for 0-d arrays? numpy.asarray returns those as-is, and I would like to be able to handle them just as I do any other 1-dimensional array. I don't know if a length of 1 would be valid, given a shape of (), but there must be some consistent way of handling them.

Well, if the length of an array is the product of all its sizes, the product of no things is customarily defined to be one... whether that is actually a useful value is another question.

Anne
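The empty-product convention mentioned here is easy to check. The sketch assumes Python 3.8 or later (for math.prod) and a current NumPy; note that it inspects the size attribute, which counts elements, and says nothing about what len() should return.

```python
import math
import numpy as np

zero_d = np.asarray(5.0)

shape = zero_d.shape               # the empty tuple for a 0-d array
n_from_shape = math.prod(shape)    # product of no factors
n_from_size = zero_d.size          # NumPy's own element count

print(shape, n_from_shape, n_from_size)
```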
Re: [Numpy-discussion] matrix wart
On Thu, Feb 21, 2008 at 12:08:32PM -0500, Alan G Isaac wrote:
>> a matrix behavior that I find bothersome and unnatural::
>>
>>     >>> M = N.mat('1 2;3 4')
>>     >>> M[0]
>>     matrix([[1, 2]])
>>     >>> M[0][0]
>>     matrix([[1, 2]])

On Fri, 22 Feb 2008, Stefan van der Walt apparently wrote:
> This is exactly what I would expect for matrices: M[0] is the first row of the matrix.

Define what "first row" means! There is no standard definition that says this means the **submatrix** that can be created from the first row. Someone once pointed out on this list that one might consider a matrix to be a container of 1d vectors. For NumPy, however, it is natural that it be a container of 1d arrays. (See the discussion for the distinction.)

Imagine if a 2d array behaved this way. Ugh! Note that it too is 2d; you could have the same expectation based on its 2d-ness. Why don't you?

You expect this matrix behavior only from experience with it, which is why I expect it too, while hating it. It is not what new users will expect and also not desirable. As Konrad noted, it is very odd behavior to treat a matrix as a container of matrices. You can only expect this behavior by learning to expect it (by use), which is undesirable.

Nobody has objected to returning matrices when getitem is fed multiple arguments: these are naturally interpreted as requests for submatrices. M[0][0] and M[:1,:1] are very different kinds of requests: the first should return the 0,0 element but does not, while M[0,0] does! Bizarre! How to guess?? If you teach, do your students expect this behavior? Mine don't!

This is a wart. The example really speaks for itself. Since Konrad is an extremely experienced user/developer, his reaction should speak volumes.

Cheers,
Alan Isaac
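The three kinds of request contrasted in this message can be put side by side. The sketch assumes a NumPy version that still ships numpy.matrix; that class may emit a PendingDeprecationWarning, which is silenced here so the example runs cleanly.

```python
import warnings
import numpy as np

with warnings.catch_warnings():
    warnings.simplefilter("ignore")    # np.matrix may warn about deprecation
    M = np.matrix("1 2; 3 4")
    row = M[0]                         # 1x2 matrix
    row_again = M[0][0]                # indexing the row again: same 1x2 matrix
    element = M[0, 0]                  # two indices: the scalar element
    sub = M[:1, :1]                    # two slices: a 1x1 submatrix

# The same data as a plain ndarray behaves like nested containers.
x = np.asarray(M)
first = x[0][0]

print(row.shape, row_again.shape, element, sub.shape, first)
```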
[Numpy-discussion] change memmap.sync function
Would anyone oppose deprecating the memmap.sync function and replacing it with memmap.flush? This would match python's mmap module, and I think be more intuitive.

--
Christopher Burns, Software Engineer
Computational Infrastructure for Research Labs
10 Giannini Hall, UC Berkeley
phone: 510.643.4014
http://cirl.berkeley.edu/
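As a point of reference, this is how flush is used on a memory-mapped array in a current NumPy. The file name, dtype and shape are arbitrary choices for the sketch, and a temporary directory is used so nothing is left behind.

```python
import os
import tempfile
import numpy as np

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "data.bin")

    # Create a writable memory map backed by a new file.
    mm = np.memmap(path, dtype=np.float64, mode="w+", shape=(4,))
    mm[:] = [1.0, 2.0, 3.0, 4.0]
    mm.flush()                     # write pending changes to disk

    # Read the file back independently of the map.
    on_disk = np.fromfile(path, dtype=np.float64)

    del mm                         # release the map before the directory is removed

print(on_disk)
```

The method name mirrors mmap.mmap.flush in the standard library, which is the consistency argument made in the message.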
Re: [Numpy-discussion] Matching 0-d arrays and NumPy scalars
Travis E. Oliphant wrote:
>> Travis, You have been getting mostly objections so far;
>
> I wouldn't characterize it that way, but yes 2 people have pushed back a bit, although one not directly speaking to the proposed behavior.
>
> The issue is that [] notation does more than just select from a container for NumPy arrays. In particular, it is used to reshape an array to more dimensions: [..., newaxis]
>
> A common pattern is to reduce over a dimension and then re-shape the result so that it can be combined with the un-reduced object. Broadcasting makes this work if the dimension being reduced along is the first dimension. But, broadcasting is not enough if you want the reduction dimension to be arbitrary:
>
> Thus,
>
> y = add.reduce(x, axis=-1)
>
> produces an N-1 array if x is 2-d and a numpy scalar if x is 1-d.

Why does it produce a scalar instead of a 0-d array? Wouldn't the latter take care of your use case, and be consistent with the action of reduce in removing one dimension?

I'm not opposed to your suggested change--just trying to understand it. I'm certainly sympathetic to your use case, below. I dimly recall extensive and confusing (to me) discussions of numpy scalars versus 0-d arrays during your heroic push to make numpy gel, and I suspect the answer is somewhere back in those discussions.

Eric

> Suppose y needs to be subtracted from x.
>
> If x is 2-d, then
>
> x - y[...,newaxis]
>
> is the needed code. But, if x is 1-d, then
>
> x - y[..., newaxis]
>
> returns an error and a check must be done to handle the case separately. If y[..., newaxis] worked and produced a 1-d array when y was a numpy scalar, this could be avoided.
>
> -Travis O.
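The use case quoted in this message can be written so that it works for both 1-d and 2-d input without a special case. In this sketch, subtract_reduction is a hypothetical name, and wrapping the reduction result in numpy.asarray is an assumption made here to guarantee that the [..., newaxis] index is applied to an array in either case.

```python
import numpy as np

def subtract_reduction(x):
    """Hypothetical helper: subtract the sum over the last axis from x."""
    # asarray turns a NumPy scalar result (1-d input) into a 0-d array,
    # so the same indexing expression works for every dimensionality.
    y = np.asarray(np.add.reduce(x, axis=-1))
    return x - y[..., np.newaxis]

x1 = np.array([1.0, 2.0, 3.0])
x2 = np.array([[1.0, 2.0, 3.0],
               [4.0, 5.0, 6.0]])

print(subtract_reduction(x1))
print(subtract_reduction(x2))
```

Current NumPy also accepts keepdims=True in np.add.reduce, which keeps the reduced axis with length one and avoids the re-indexing step altogether.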
Re: [Numpy-discussion] Matching 0-d arrays and NumPy scalars
On 21.02.2008, at 18:40, Alan G Isaac wrote:
>     >>> x = N.array([1,2],dtype='float')
>     >>> x0 = x[0]
>     >>> type(x0)
>     <type 'numpy.float64'>
>
> So a float64 value is whatever a numpy.float64 is, and that is part of what is under discussion.

numpy.float64 is a very recent invention. During the first decade of numerical arrays in Python (Numeric), type(x0) was the standard Python float type. And even today, what you put into an array (via the array constructor or by assignment) is Python scalar objects, mostly int, float, and complex.

The reason for defining special types for the scalar elements of arrays was efficiency considerations. Python has only a single float type, there is no distinction between single and double precision. Extracting an array element would thus always yield a double precision float, and adding it to a single-precision array would yield a double precision result, meaning that it was extremely difficult to maintain single-precision storage across array arithmetic. For huge arrays, that was a serious problem.

However, the intention was always to have numpy's scalar objects behave as similarly as possible to Python scalars. Ideally, application code should not see a difference at all. This was largely successful, with the notable exception of the coercion problem that I mentioned a few mails ago.

Konrad.
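The precision argument can be seen directly in a current NumPy. This sketch only shows today's behaviour with typed scalars; it does not reproduce the old Numeric behaviour described in the message.

```python
import numpy as np

a = np.ones(3, dtype=np.float32)   # single-precision storage

element = a[0]                     # a numpy.float32, not a Python float
result = a + element               # arithmetic stays in single precision

print(type(element), result.dtype)
```

Because the extracted element carries its own precision, combining it with the array does not force an upcast to double.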
Re: [Numpy-discussion] matrix wart
On 22.02.2008, at 01:10, Alan G Isaac wrote:
> Someone once pointed out on this list that one might consider a matrix to be a container of 1d vectors. For NumPy, however, it is natural that it be a container of 1d arrays. (See the discussion for the distinction.)

If I were to design a Pythonic implementation of the mathematical concept of a matrix, I'd implement three classes: Matrix, ColumnVector, and RowVector. It would work like this:

m = Matrix([[1, 2], [3, 4]])

m[0, :] --> RowVector([1, 2])
m[:, 0] --> ColumnVector([1, 3])
m[0,0] --> 1 # scalar
m.shape --> (2, 2)
m[0].shape --> (2,)

However, the matrix implementation in Numeric was inspired by Matlab, where everything is a matrix. But as I said before, Python is not Matlab.

Konrad.