Re: [Numpy-discussion] expected a single-segment buffer object

2008-07-10 Thread Anne Archibald
2008/7/9 Robert Kern [EMAIL PROTECTED]:

 Because that's just what a buffer= argument *is*. It is not a place
 for presenting the starting pointer to exotically-strided memory. Use
 __array_interface__s to describe the full range of representable
 memory. See below.

Aha! Is this stuff documented somewhere?

 I was about a week ahead of you. See numpy/lib/stride_tricks.py in the trunk.

Nice! Unfortunately it can't quite do what I want... for the linear
algebra I need something that can broadcast all but certain axes. For
example, take an array of matrices and an array of vectors. The
array axes need broadcasting, but you can't broadcast on all axes
without (incorrectly) turning the vector into a matrix. I've written a
(messy) implementation, but the corner cases are giving me headaches.
I'll let you know when I have something that works.

Thanks,
Anne
___
Numpy-discussion mailing list
Numpy-discussion@scipy.org
http://projects.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] expected a single-segment buffer object

2008-07-10 Thread Charles R Harris
On Thu, Jul 10, 2008 at 9:33 AM, Anne Archibald [EMAIL PROTECTED]
wrote:

 2008/7/9 Robert Kern [EMAIL PROTECTED]:

  Because that's just what a buffer= argument *is*. It is not a place
  for presenting the starting pointer to exotically-strided memory. Use
  __array_interface__s to describe the full range of representable
  memory. See below.

 Aha! Is this stuff documented somewhere?

  I was about a week ahead of you. See numpy/lib/stride_tricks.py in the
 trunk.

 Nice! Unfortunately it can't quite do what I want... for the linear
 algebra I need something that can broadcast all but certain axes. For
 example, take an array of matrices and an array of vectors. The
 array axes need broadcasting, but you can't broadcast on all axes
 without (incorrectly) turning the vector into a matrix. I've written a
 (messy) implementation, but the corner cases are giving me headaches.
 I'll let you know when I have something that works.


I think something like a matrix/vector dtype would be another way to go for
stacks of matrices and vectors. It would have to be a user defined type to
fit into the current type hierarchy for ufuncs, but I think the base
machinery is there with the generic inner loops.

Chuck
___
Numpy-discussion mailing list
Numpy-discussion@scipy.org
http://projects.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] expected a single-segment buffer object

2008-07-10 Thread Stéfan van der Walt
2008/7/10 Robert Kern [EMAIL PROTECTED]:
 I was about a week ahead of you. See numpy/lib/stride_tricks.py in the trunk.

Robert, this is fantastic!  I think people are going to enjoy your
talk at SciPy'08.  If you want, we could also tutor this in the
advanced NumPy session.

Cheers
Stéfan
___
Numpy-discussion mailing list
Numpy-discussion@scipy.org
http://projects.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] expected a single-segment buffer object

2008-07-10 Thread Anne Archibald
2008/7/10 Charles R Harris [EMAIL PROTECTED]:

 On Thu, Jul 10, 2008 at 9:33 AM, Anne Archibald [EMAIL PROTECTED]
 wrote:

 2008/7/9 Robert Kern [EMAIL PROTECTED]:

  Because that's just what a buffer= argument *is*. It is not a place
  for presenting the starting pointer to exotically-strided memory. Use
  __array_interface__s to describe the full range of representable
  memory. See below.

 Aha! Is this stuff documented somewhere?

  I was about a week ahead of you. See numpy/lib/stride_tricks.py in the
  trunk.

 Nice! Unfortunately it can't quite do what I want... for the linear
 algebra I need something that can broadcast all but certain axes. For
 example, take an array of matrices and an array of vectors. The
 array axes need broadcasting, but you can't broadcast on all axes
 without (incorrectly) turning the vector into a matrix. I've written a
 (messy) implementation, but the corner cases are giving me headaches.
 I'll let you know when I have something that works.

 I think something like a matrix/vector dtype would be another way to go for
 stacks of matrices and vectors. It would have to be a user defined type to
 fit into the current type hierarchy for ufuncs, but I think the base
 machinery is there with the generic inner loops.

There's definitely room for discussion about how such a linear algebra
system should ultimately work. In the short term, though, I think it's
valuable to create a prototype system, even if much of the looping has
to be in python, just to figure out how it should look to the user.

For example, I'm not sure whether a matrix/vector dtype would make
easy things like indexing operations - fancy indexing spanning both
array and matrix axes, for example. dtypes can also be a little
impenetrable to use, so even if this is how elementwise linear algebra
is ultimately implemented, we may want to provide a different user
frontend.

My idea for a first sketch was simply to provide (for example)
np.elementwise_linalg.dot_mm, that interprets its arguments both as
arrays of matrices and yields an array of matrices. A second sketch
might be a subclass MatrixArray, which could serve much as Matrix does
now, to tell various functions which axes to do linear algebra over
and which axes to iterate over.

There's also the question of how to make these efficient; one could of
course write a C wrapper for each linear algebra function that simply
iterates, but your suggestion of using the ufunc machinery with a
matrix dtype might be better. Or, if cython acquires sensible
polymorphism over numpy dtypes, a cython wrapper might be the way to
go. But I think establishing what it should look like to users is a
good first step, and I think that's best done with sample
implementations.

Anne
___
Numpy-discussion mailing list
Numpy-discussion@scipy.org
http://projects.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] expected a single-segment buffer object

2008-07-10 Thread Charles R Harris
On Thu, Jul 10, 2008 at 2:25 PM, Anne Archibald [EMAIL PROTECTED]
wrote:

 2008/7/10 Charles R Harris [EMAIL PROTECTED]:

  On Thu, Jul 10, 2008 at 9:33 AM, Anne Archibald 
 [EMAIL PROTECTED]
  wrote:
 
  2008/7/9 Robert Kern [EMAIL PROTECTED]:
 
   Because that's just what a buffer= argument *is*. It is not a place
   for presenting the starting pointer to exotically-strided memory. Use
   __array_interface__s to describe the full range of representable
   memory. See below.
 
  Aha! Is this stuff documented somewhere?
 
   I was about a week ahead of you. See numpy/lib/stride_tricks.py in the
   trunk.
 
  Nice! Unfortunately it can't quite do what I want... for the linear
  algebra I need something that can broadcast all but certain axes. For
  example, take an array of matrices and an array of vectors. The
  array axes need broadcasting, but you can't broadcast on all axes
  without (incorrectly) turning the vector into a matrix. I've written a
  (messy) implementation, but the corner cases are giving me headaches.
  I'll let you know when I have something that works.
 
  I think something like a matrix/vector dtype would be another way to go
 for
  stacks of matrices and vectors. It would have to be a user defined type
 to
  fit into the current type hierarchy for ufuncs, but I think the base
  machinery is there with the generic inner loops.

 There's definitely room for discussion about how such a linear algebra
 system should ultimately work. In the short term, though, I think it's
 valuable to create a prototype system, even if much of the looping has
 to be in python, just to figure out how it should look to the user.

 For example, I'm not sure whether a matrix/vector dtype would make
 easy things like indexing operations - fancy indexing spanning both
 array and matrix axes,


True enough. And I would expect the ufunc approach to require the sub
matrices to be contiguous. In any case, ufuncs aren't ready for this yet, in
fact they aren't ready for string types or other numeric types, so there is
a lot of work to be done just at that level.

for example. dtypes can also be a little
 impenetrable to use, so even if this is how elementwise linear algebra
 is ultimately implemented, we may want to provide a different user
 frontend.

 My idea for a first sketch was simply to provide (for example)
 np.elementwise_linalg.dot_mm, that interprets its arguments both as
 arrays of matrices and yields an array of matrices. A second sketch
 might be a subclass MatrixArray, which could serve much as Matrix does
 now, to tell various functions which axes to do linear algebra over
 and which axes to iterate over.


Definitely needs doing. It's hard to see how things work out in practice
without some experimentation.


 There's also the question of how to make these efficient; one could of
 course write a C wrapper for each linear algebra function that simply
 iterates, but your suggestion of using the ufunc machinery with a
 matrix dtype might be better. Or, if cython acquires sensible
 polymorphism over numpy dtypes, a cython wrapper might be the way to
 go. But I think establishing what it should look like to users is a
 good first step, and I think that's best done with sample
 implementations.


Amen.

Chuck
___
Numpy-discussion mailing list
Numpy-discussion@scipy.org
http://projects.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] expected a single-segment buffer object

2008-07-10 Thread Robert Kern
On Thu, Jul 10, 2008 at 10:33, Anne Archibald [EMAIL PROTECTED] wrote:
 2008/7/9 Robert Kern [EMAIL PROTECTED]:

 Because that's just what a buffer= argument *is*. It is not a place
 for presenting the starting pointer to exotically-strided memory. Use
 __array_interface__s to describe the full range of representable
 memory. See below.

 Aha! Is this stuff documented somewhere?

_The Guide to Numpy_, section 3.1.4.

-- 
Robert Kern

I have come to believe that the whole world is an enigma, a harmless
enigma that is made terrible by our own mad attempt to interpret it as
though it had an underlying truth.
 -- Umberto Eco
___
Numpy-discussion mailing list
Numpy-discussion@scipy.org
http://projects.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] expected a single-segment buffer object

2008-07-09 Thread Robert Kern
On Wed, Jul 9, 2008 at 18:55, Anne Archibald [EMAIL PROTECTED] wrote:
 Hi,

 When trying to construct an ndarray, I sometimes run into the
 more-or-less mystifying error expected a single-segment buffer
 object:

 Out[54]: (0, 16, 8)
 In [55]: A=np.zeros(2); A=A[np.newaxis,...];
 np.ndarray(strides=A.strides,shape=A.shape,buffer=A,dtype=A.dtype)
 ---
 type 'exceptions.TypeError' Traceback (most recent call last)

 /home/peridot/ipython console in module()

 type 'exceptions.TypeError': expected a single-segment buffer object

 In [56]: A.strides
 Out[56]: (0, 8)

 That is, when I try to construct an ndarray based on an array with a
 zero stride, I get this mystifying error. Zero-strided arrays appear
 naturally when one uses newaxis, but they are valuable in their own
 right (for example for broadcasting purposes). So it's a bit awkward
 to have this error appearing when one tries to feed them to
 ndarray.__new__ as a buffer. I can, I think, work around it by
 removing all axes with stride zero:

 def bufferize(A):
   idx = []
   for v in A.strides:
   if v==0:
   idx.append(0)
   else:
   idx.append(slice(None,None,None))
   return A[tuple(idx)]

 Is there any reason for this restriction?

Yes, the buffer interface, at least the subset that ndarray()
consumes, requires that all of the data be contiguous in memory.
array_as_buffer() checks for that using PyArray_ISONE_SEGMENT(), which
looks like this:

#define PyArray_ISONESEGMENT(m) (PyArray_NDIM(m) == 0 ||  \
 PyArray_CHKFLAGS(m, NPY_CONTIGUOUS) ||   \
 PyArray_CHKFLAGS(m, NPY_FORTRAN))

Trying to get a buffer object from anything that is neither C- or
Fortran-contiguous will fail. E.g.

In [1]: from numpy import *

In [2]: A = arange(10)

In [3]: B = A[::2]

In [4]: ndarray(strides=B.strides, shape=B.shape, buffer=B, dtype=B.dtype)
---
TypeError Traceback (most recent call last)

/Users/rkern/today/ipython console in module()

TypeError: expected a single-segment buffer object


What is the use case, here? One rarely has to use the ndarray
constructor by itself. For example, the result you seem to want from
the call you make above can be done just fine with .view().

In [8]: C = B.view(ndarray)

In [9]: C
Out[9]: array([0, 2, 4, 6, 8])

In [10]: B
Out[10]: array([0, 2, 4, 6, 8])

In [11]: C is B
Out[11]: False

In [12]: B[0] = 10

In [13]: C
Out[13]: array([10,  2,  4,  6,  8])

-- 
Robert Kern

I have come to believe that the whole world is an enigma, a harmless
enigma that is made terrible by our own mad attempt to interpret it as
though it had an underlying truth.
 -- Umberto Eco
___
Numpy-discussion mailing list
Numpy-discussion@scipy.org
http://projects.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] expected a single-segment buffer object

2008-07-09 Thread Anne Archibald
2008/7/9 Robert Kern [EMAIL PROTECTED]:

 Yes, the buffer interface, at least the subset that ndarray()
 consumes, requires that all of the data be contiguous in memory.
 array_as_buffer() checks for that using PyArray_ISONE_SEGMENT(), which
 looks like this:

 #define PyArray_ISONESEGMENT(m) (PyArray_NDIM(m) == 0 ||  
 \
 PyArray_CHKFLAGS(m, NPY_CONTIGUOUS) ||   \
 PyArray_CHKFLAGS(m, NPY_FORTRAN))

 Trying to get a buffer object from anything that is neither C- or
 Fortran-contiguous will fail. E.g.

 In [1]: from numpy import *

 In [2]: A = arange(10)

 In [3]: B = A[::2]

 In [4]: ndarray(strides=B.strides, shape=B.shape, buffer=B, dtype=B.dtype)
 ---
 TypeError Traceback (most recent call last)

 /Users/rkern/today/ipython console in module()

 TypeError: expected a single-segment buffer object

Is this really necessary? What does making this restriction gain? It
certainly means that many arrays whose storage is a contiguous block
of memory can still not be used (just permute the axes of a 3d array,
say; it may even be possible for an array to be in C contiguous order
but for the flag not to be set), but how is one to construct exotic
slices of an array that is strided in memory? (The real part of a
complex array, say.)

I suppose one could follow the linked list of .bases up to the
original ndarray, which should normally be C- or Fortran-contiguous,
then work out the offset, but even this may not always work: what if
the original array was constructed with non-C-contiguous strides from
some preexisting buffer?

If the concern is that this allows users to shoot themselves in the
foot, it's worth noting that even with the current setup you can
easily fabricate strides and shapes that go outside the allocated part
of memory.

 What is the use case, here? One rarely has to use the ndarray
 constructor by itself. For example, the result you seem to want from
 the call you make above can be done just fine with .view().

I was presenting a simple example. I was actually trying to use
zero-strided arrays to implement broadcasting.  The code was rather
long, but essentially what it was meant to do was generate a view of
an array in which an axis of length one had been replaced by an axis
of length m with stride zero. (The point of all this was to create a
class like vectorize that was suitable for use on, for example,
np.linalg.inv().) But I also ran into this problem while writing
segmentaxis.py, the code to produce a matrix of sliding windows.
(See http://www.scipy.org/Cookbook/SegmentAxis .) There I caught the
exception and copied the array (unnecessarily) if this came up.

Anne
___
Numpy-discussion mailing list
Numpy-discussion@scipy.org
http://projects.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] expected a single-segment buffer object

2008-07-09 Thread Robert Kern
On Wed, Jul 9, 2008 at 21:29, Anne Archibald [EMAIL PROTECTED] wrote:
 2008/7/9 Robert Kern [EMAIL PROTECTED]:

 Yes, the buffer interface, at least the subset that ndarray()
 consumes, requires that all of the data be contiguous in memory.
 array_as_buffer() checks for that using PyArray_ISONE_SEGMENT(), which
 looks like this:

 #define PyArray_ISONESEGMENT(m) (PyArray_NDIM(m) == 0 || 
  \
 PyArray_CHKFLAGS(m, NPY_CONTIGUOUS) ||   
 \
 PyArray_CHKFLAGS(m, NPY_FORTRAN))

 Trying to get a buffer object from anything that is neither C- or
 Fortran-contiguous will fail. E.g.

 In [1]: from numpy import *

 In [2]: A = arange(10)

 In [3]: B = A[::2]

 In [4]: ndarray(strides=B.strides, shape=B.shape, buffer=B, dtype=B.dtype)
 ---
 TypeError Traceback (most recent call last)

 /Users/rkern/today/ipython console in module()

 TypeError: expected a single-segment buffer object

 Is this really necessary? What does making this restriction gain? It
 certainly means that many arrays whose storage is a contiguous block
 of memory can still not be used (just permute the axes of a 3d array,
 say; it may even be possible for an array to be in C contiguous order
 but for the flag not to be set), but how is one to construct exotic
 slices of an array that is strided in memory? (The real part of a
 complex array, say.)

Because that's just what a buffer= argument *is*. It is not a place
for presenting the starting pointer to exotically-strided memory. Use
__array_interface__s to describe the full range of representable
memory. See below.

 I suppose one could follow the linked list of .bases up to the
 original ndarray, which should normally be C- or Fortran-contiguous,
 then work out the offset, but even this may not always work: what if
 the original array was constructed with non-C-contiguous strides from
 some preexisting buffer?

 If the concern is that this allows users to shoot themselves in the
 foot, it's worth noting that even with the current setup you can
 easily fabricate strides and shapes that go outside the allocated part
 of memory.

 What is the use case, here? One rarely has to use the ndarray
 constructor by itself. For example, the result you seem to want from
 the call you make above can be done just fine with .view().

 I was presenting a simple example. I was actually trying to use
 zero-strided arrays to implement broadcasting.  The code was rather
 long, but essentially what it was meant to do was generate a view of
 an array in which an axis of length one had been replaced by an axis
 of length m with stride zero. (The point of all this was to create a
 class like vectorize that was suitable for use on, for example,
 np.linalg.inv().) But I also ran into this problem while writing
 segmentaxis.py, the code to produce a matrix of sliding windows.
 (See http://www.scipy.org/Cookbook/SegmentAxis .) There I caught the
 exception and copied the array (unnecessarily) if this came up.

I was about a week ahead of you. See numpy/lib/stride_tricks.py in the trunk.

-- 
Robert Kern

I have come to believe that the whole world is an enigma, a harmless
enigma that is made terrible by our own mad attempt to interpret it as
though it had an underlying truth.
 -- Umberto Eco
___
Numpy-discussion mailing list
Numpy-discussion@scipy.org
http://projects.scipy.org/mailman/listinfo/numpy-discussion