Re: [Numpy-discussion] Fancy-indexing reorders output in corner cases?

2012-05-16 Thread Nathaniel Smith
On Tue, May 15, 2012 at 5:03 AM, Travis Oliphant tra...@continuum.io wrote:
 So, the behavior is actually quite predictable, it's just that in some common 
 cases it doesn't do what you would expect --- especially if you think that 
 [0,1] is the same as :2.   When I wrote this code to begin with I should 
 have raised an error and then worked in the cases that make sense.    This is 
 a good example of making the mistake of thinking that it's better to provide 
 something very general rather than just raise an error when an obvious and 
 clear solution is not available.

 There is the possibility that we could now raise an error in NumPy when this 
 situation is encountered because I strongly doubt anyone is actually relying 
 on the current behavior.    I would like to do this, actually, as soon as 
 possible.  Comments?

+1 from me.

It'd probably be a good idea to check for and deprecate any similarly
bizarre cases of mixed boolean indexing/slicing, mixed boolean/integer
indexing, etc.

- N
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] Fancy-indexing reorders output in corner cases?

2012-05-15 Thread Olivier Delalleau
2012/5/15 Travis Oliphant tra...@continuum.io


 There is the possibility that we could now raise an error in NumPy when
 this situation is encountered because I strongly doubt anyone is actually
 relying on the current behavior.  I would like to do this, actually, as
 soon as possible.  Comments?


+1 to raise an error instead of an unintuitive behavior.

-=- Olivier


Re: [Numpy-discussion] Fancy-indexing reorders output in corner cases?

2012-05-15 Thread Eric Firing
On 05/14/2012 06:03 PM, Travis Oliphant wrote:
 What happens, though, when you have

 a[:, in1, :, in2]?

 in1 and in2 are broadcast together to create a two-dimensional
 sub-space that must fit somewhere.  Where should it go?  Should
 it replace in1 or in2?  I.e., should the output be

 (10,3,4,8) or (10,8,3,4)?

 To resolve this ambiguity, the code sends the (3,4) sub-space to
 the front of the dimensions and returns (3,4,10,8).  In
 retrospect, the code should raise an error, as I doubt anyone
 actually relies on this behavior.  Then we could have done the
 right thing for situations like in1 being an integer, which actually
 makes some sense and should not have been confused with the general
 case.

 There is the possibility that we could now raise an error in NumPy
 when this situation is encountered because I strongly doubt anyone is
 actually relying on the current behavior.  I would like to do this,
 actually, as soon as possible.  Comments?

Travis,

Good idea, especially if you can then make the integer case work as one 
might reasonably expect.  Keeping the present too-fancy capabilities can 
only cause continuing confusion.

Eric


[Numpy-discussion] Fancy-indexing reorders output in corner cases?

2012-05-14 Thread Zachary Pincus
Hello all,

The below seems to be a bug, but perhaps it's unavoidably part of the indexing 
mechanism?

It's easiest to show via example... note that using [0,1] to pull two columns 
out of the array gives the same shape as using :2 in the simple case, but 
when there's additional slicing happening, the shapes get transposed or 
something.

In [2]: numpy.version.version # latest git version
Out[2]: '1.7.0.dev-3d4'

In [3]: d = numpy.empty((10, 9, 8, 7))

In [4]: d[:,:,:,[0,1]].shape
Out[4]: (10, 9, 8, 2)

In [5]: d[:,:,:,:2].shape
Out[5]: (10, 9, 8, 2)

In [6]: d[:,0,:,[0,1]].shape
Out[6]: (2, 10, 8)

In [7]: d[:,0,:,:2].shape
Out[7]: (10, 8, 2)

In [8]: d[0,:,:,[0,1]].shape
Out[8]: (2, 9, 8)

In [9]: d[0,:,:,:2].shape
Out[9]: (9, 8, 2)

Oddly, this error can appear/disappear depending on the position of the other 
axis sliced:

In [14]: d = numpy.empty((10, 9, 8))

In [15]: d[:,:,[0,1]].shape
Out[15]: (10, 9, 2)

In [16]: d[:,0,[0,1]].shape
Out[16]: (10, 2)

In [17]: d[0,:,[0,1]].shape
Out[17]: (2, 9)
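
The second session can be condensed into a short script. This is only a sketch: it assumes a NumPy that behaves as shown in the session above, and the two workarounds at the end (indexing in two steps, or numpy.take on the last axis) are suggestions that were not part of the original report.

```python
import numpy as np

d = np.zeros((10, 9, 8))

# Integer and list are adjacent: the fancy-indexed axis stays in place.
adjacent = d[:, 0, [0, 1]]

# Integer and list are separated by a slice: the fancy-indexed axis
# is moved to the front of the result.
separated = d[0, :, [0, 1]]

# Possible workarounds (assumptions, not from the thread): apply the
# integer first, or use np.take along the last axis.
two_step = d[0][:, [0, 1]]
taken = np.take(d[0], [0, 1], axis=-1)

print(adjacent.shape, separated.shape, two_step.shape, taken.shape)
```

Both workarounds give the same shape as d[0,:,:2], which is what the slice-based spelling would lead one to expect.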

This cannot be the expected behavior, right?
Zach


Re: [Numpy-discussion] Fancy-indexing reorders output in corner cases?

2012-05-14 Thread Stéfan van der Walt
Hi Zach

On Mon, May 14, 2012 at 4:33 PM, Zachary Pincus zachary.pin...@yale.edu wrote:
 The below seems to be a bug, but perhaps it's unavoidably part of the 
 indexing mechanism?

 It's easiest to show via example... note that using [0,1] to pull two 
 columns out of the array gives the same shape as using :2 in the simple 
 case, but when there's additional slicing happening, the shapes get 
 transposed or something.

When fancy indexing and slicing is mixed, the resulting shape is
essentially unpredictable.  The correct way to do it is to only use
fancy indexing, i.e. generate the indices of the sliced dimension as
well.
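
One way to read this advice, as a sketch only: replace each slice with an explicit index array so that every axis is fancy-indexed and the arrays broadcast to the wanted shape. The variable names below (rows, mids, cols) are made up for the example, and numpy.ix_ is shown as an alternative way to build such arrays.

```python
import numpy as np

d = np.arange(10 * 9 * 8 * 7).reshape(10, 9, 8, 7)

# Index arrays shaped so that they broadcast to (10, 8, 2).
rows = np.arange(10)[:, None, None]      # stands in for the first ':'
mids = np.arange(8)[None, :, None]       # stands in for the second ':'
cols = np.array([0, 1])[None, None, :]   # the original [0, 1]

# All four indices are now fancy indices, so no reordering happens.
out = d[rows, 0, mids, cols]

# np.ix_ builds the same kind of open mesh; the length-1 axis that
# stands in for the integer 0 is dropped afterwards.
out_ix = d[np.ix_(np.arange(10), [0], np.arange(8), [0, 1])][:, 0]

print(out.shape, out_ix.shape)
```

Either form yields the same values and layout as d[:,0,:,:2].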

Stéfan


Re: [Numpy-discussion] Fancy-indexing reorders output in corner cases?

2012-05-14 Thread Zachary Pincus
 On Mon, May 14, 2012 at 4:33 PM, Zachary Pincus zachary.pin...@yale.edu 
 wrote:
 The below seems to be a bug, but perhaps it's unavoidably part of the 
 indexing mechanism?
 
 It's easiest to show via example... note that using [0,1] to pull two 
 columns out of the array gives the same shape as using :2 in the simple 
 case, but when there's additional slicing happening, the shapes get 
 transposed or something.
 
 When fancy indexing and slicing is mixed, the resulting shape is
 essentially unpredictable.

Aah, right -- this does come up on the list not infrequently, doesn't it. I'd 
always thought it was more exotic usages that raised these issues. Good to know.

  The correct way to do it is to only use
 fancy indexing, i.e. generate the indices of the sliced dimension as
 well.
 

Excellent -- thanks!


Re: [Numpy-discussion] Fancy-indexing reorders output in corner cases?

2012-05-14 Thread Travis Oliphant

On May 14, 2012, at 7:07 PM, Stéfan van der Walt wrote:

 Hi Zach
 
 On Mon, May 14, 2012 at 4:33 PM, Zachary Pincus zachary.pin...@yale.edu 
 wrote:
 The below seems to be a bug, but perhaps it's unavoidably part of the 
 indexing mechanism?
 
 It's easiest to show via example... note that using [0,1] to pull two 
 columns out of the array gives the same shape as using :2 in the simple 
 case, but when there's additional slicing happening, the shapes get 
 transposed or something.
 
 When fancy indexing and slicing is mixed, the resulting shape is
 essentially unpredictable.  The correct way to do it is to only use
 fancy indexing, i.e. generate the indices of the sliced dimension as
 well.

This is not quite accurate.  It is not unpredictable.  It is very predictable,
but a bit (too) complicated in the most general case.  The problem occurs when
you intermingle fancy indexing with slice notation (and for this purpose
integer selection is considered fancy indexing).  In simple cases you can
think of [0,1] as equivalent to :2, but it is not, because fancy indexing
pairs up its index arrays element by element (zip-style) rather than taking
their cross product.
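
A small sketch of the zip-versus-cross-product distinction, using a made-up 3x4 array (the names a, zipped, block and crossed are only for illustration):

```python
import numpy as np

a = np.arange(12).reshape(3, 4)

# Two index lists are paired element-wise ("zip"):
# this picks a[0, 0] and a[1, 1], giving a 1-D result.
zipped = a[[0, 1], [0, 1]]

# Two slices select every combination ("cross product"):
# this gives the 2x2 block in the top-left corner.
block = a[:2, :2]

# np.ix_ reshapes the lists so that they broadcast to the cross product.
crossed = a[np.ix_([0, 1], [0, 1])]

print(zipped.shape, block.shape, crossed.shape)
```

Only when a single list is combined with slices do the two readings happen to coincide, which is why [0,1] and :2 look interchangeable in the simple case.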

The problem in general is how to make sense of something like

a[:, :, in1, in2]   

If you keep fancy indexing to one side of the slice notation only, then you get
what you expect.  The shape of the output will be the first two dimensions of
a followed by the broadcast shape of in1 and in2 (where integers are
interpreted as fancy-index arrays).

So, let's say a is (10,9,8,7), in1 is (3,4), and in2 is (4,).

The shape of the output will be (10,9,3,4), filled with essentially
result[:,:,i,j] = a[:,:,in1[i,j], in2[j]]
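
The shape and the fill rule can be checked with a short script. The contents of in1 and in2 below are arbitrary; they are only chosen to be valid indices for axes of length 8 and 7.

```python
import numpy as np

a = np.arange(10 * 9 * 8 * 7).reshape(10, 9, 8, 7)
in1 = np.arange(12).reshape(3, 4) % 8   # shape (3, 4), indexes axis 2
in2 = np.arange(4)                      # shape (4,),  indexes axis 3

result = a[:, :, in1, in2]
print(result.shape)

# Spot-check the fill rule for one (i, j) pair.
i, j = 2, 1
same = np.array_equal(result[:, :, i, j], a[:, :, in1[i, j], in2[j]])
print(same)
```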

What happens, though, when you have

a[:, in1, :, in2]?

in1 and in2 are broadcast together to create a two-dimensional sub-space
that must fit somewhere.  Where should it go?  Should it replace in1 or in2?
I.e., should the output be

(10,3,4,8) or (10,8,3,4)?

To resolve this ambiguity, the code sends the (3,4) sub-space to the front of
the dimensions and returns (3,4,10,8).  In retrospect, the code should
raise an error, as I doubt anyone actually relies on this behavior.  Then we
could have done the right thing for situations like in1 being an integer,
which actually makes some sense and should not have been confused with the
general case.
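
A sketch of the separated case, assuming a NumPy that implements the behavior described here rather than raising an error. The index values are arbitrary but valid for axes of length 9 and 7.

```python
import numpy as np

a = np.arange(10 * 9 * 8 * 7).reshape(10, 9, 8, 7)
in1 = np.arange(12).reshape(3, 4) % 9   # shape (3, 4), indexes axis 1
in2 = np.arange(4)                      # shape (4,),  indexes axis 3

# The fancy indices are separated by a slice, so the broadcast
# (3, 4) sub-space is placed at the front of the result.
result = a[:, in1, :, in2]
print(result.shape)

# Each (i, j) entry of the sub-space holds a (10, 8) slab of a.
i, j = 1, 3
slab_matches = np.array_equal(result[i, j], a[:, in1[i, j], :, in2[j]])

# The integer case follows the same rule, which is what surprises people.
integer_case = a[:, 0, :, [0, 1]]
print(integer_case.shape)
```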

In this particular case you might also think that we could say the result
should be (10,3,8,4), but there is no guarantee that the number of dimensions
appended by the fancy-indexing objects will be the same as the
number of dimensions replaced.  Again, this is how fancy indexing combines
with other fancy-indexing objects.
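
To illustrate the mismatch in dimension counts, here is a sketch with two made-up index pairs: one where two indexed axes collapse into a single output dimension, and one where they expand into three. In neither case is there a one-for-one slot to put the new dimensions back into.

```python
import numpy as np

a = np.zeros((10, 9, 8, 7))

# Two axes are indexed, but the broadcast index shape is (2,):
# two dimensions are replaced by one.
fewer = a[:, [0, 1], :, [0, 1]]

# Two axes are indexed, but the broadcast index shape is (2, 3, 4):
# two dimensions are replaced by three.
in3 = np.zeros((2, 3, 4), dtype=int)
in2 = np.arange(4)
more = a[:, in3, :, in2]

print(fewer.shape, more.shape)
```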

So, the behavior is actually quite predictable, it's just that in some common
cases it doesn't do what you would expect --- especially if you think that
[0,1] is the same as :2.  When I wrote this code to begin with, I should have
raised an error and then worked in the cases that make sense.  This is a good
example of making the mistake of thinking that it's better to provide something
very general rather than just raise an error when an obvious and clear solution
is not available.

There is the possibility that we could now raise an error in NumPy when this
situation is encountered, because I strongly doubt anyone is actually relying on
the current behavior.  I would like to do this, actually, as soon as
possible.  Comments?

-Travis




 