Hello,

I'm trying to understand how array broadcasting can be used for indexing. In 
the following, I use the term 'row' to refer to the first dimension of a 2D 
array, and 'column' to the second, just because that's how numpy prints them 
out.

If I consider the following example:

>>> a = np.random.random((4,5))
>>> b = np.random.random((5,))
>>> a + b
array([[ 1.45499556,  0.60633959,  0.48236157,  1.55357393,  1.4339261 ],
       [ 1.28614593,  1.11265001,  0.63308615,  1.28904227,  1.34070499],
       [ 1.26988279,  0.84683018,  0.98959466,  0.76388223,  0.79273084],
       [ 1.27859505,  0.9721984 ,  1.02725009,  1.38852061,  1.56065028]])

I understand how this works, because it behaves as described in

http://docs.scipy.org/doc/numpy/reference/ufuncs.html#broadcasting

So b is treated as having shape (1,5), and because its first dimension is 1, 
it is stretched along that dimension so the operation is applied to all rows.
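
To make that implicit step visible, here is a minimal sketch that spells out 
the broadcast by hand. It assumes a NumPy version that provides 
np.broadcast_to; the name b_expanded is only for illustration.

```python
import numpy as np

a = np.random.random((4, 5))
b = np.random.random((5,))

# View b as shape (1, 5), then stretch it along the first axis so it
# matches a's shape (4, 5). No data is copied; this is a read-only view.
b_expanded = np.broadcast_to(b[np.newaxis, :], a.shape)

# The implicit broadcast in a + b gives the same result as the explicit one.
result = a + b
same = np.array_equal(result, a + b_expanded)
print(b_expanded.shape, same)
```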

Now I am trying to apply this to array indexing. So for example, I want to set 
specific columns, indicated by a boolean array, to zero, but the following 
fails:

>>> c = np.array([1,0,1,0,1], dtype=bool)
>>> a[c] = 0
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
IndexError: index (4) out of range (0<=index<3) in dimension 0

However, if I reduce the length of c to 4, then it works, but it sets rows, 
not columns, to zero:

>>> c = np.array([1,0,1,0], dtype=bool)
>>> a[c] = 0
>>> a
array([[ 0.        ,  0.        ,  0.        ,  0.        ,  0.        ],
       [ 0.41526315,  0.7425491 ,  0.39872546,  0.56141914,  0.69795153],
       [ 0.        ,  0.        ,  0.        ,  0.        ,  0.        ],
       [ 0.40771227,  0.60209749,  0.7928894 ,  0.66089748,  0.91789682]])
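
As a sketch of what this result suggests, a 1D boolean index appears to be 
matched against the first dimension only, as if the remaining dimensions had 
been given as full slices. The array contents below are made up for 
illustration.

```python
import numpy as np

a = np.arange(20, dtype=float).reshape(4, 5)
c = np.array([1, 0, 1, 0], dtype=bool)

# A lone boolean index selects along the first dimension (rows) ...
rows_by_mask = a[c]
# ... which is the same as writing the trailing slice explicitly ...
rows_explicit = a[c, :]
# ... or indexing with the integer positions where c is True.
rows_by_int = a[np.flatnonzero(c)]

print(rows_by_mask.shape)
```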

But I would have thought that the indexing array would have been broadcast in 
the same way as for a sum, i.e. c would be broadcast to have dimensions (1,5) 
and then would have been able to set certain columns in all rows to zero. 

Why does broadcasting for indexing seem to work differently than for 
operations such as addition or multiplication? For background, I'm trying to 
write a routine that performs a set of operations on an n-d array, where n is 
not known in advance, using a 1D array. The broadcasting rules let me do most 
of these operations without knowing the dimensionality of the n-d array, but 
indexing seems to follow a different convention, which is a real issue.
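
To make the goal concrete, here is a rough sketch of the behaviour being 
asked for, under the assumption that the 1D mask always refers to the last 
dimension of the n-d array. The helper name zero_last_axis is hypothetical. 
It relies on an Ellipsis index to stand in for all leading dimensions, and 
compares the result against np.where, which does follow the usual 
broadcasting rules.

```python
import numpy as np

def zero_last_axis(arr, mask):
    """Hypothetical helper: zero entries along the last axis where mask is True.

    Modifies arr in place and returns it. Assumes mask is 1D with the same
    length as arr's last dimension.
    """
    mask = np.asarray(mask, dtype=bool)
    if mask.shape != (arr.shape[-1],):
        raise ValueError("mask must be 1D and match the last dimension of arr")
    # The Ellipsis expands to as many full slices as needed, so the same
    # line works whatever the number of dimensions of arr.
    arr[..., mask] = 0
    return arr

c = np.array([1, 0, 1, 0, 1], dtype=bool)

a2 = zero_last_axis(np.ones((4, 5)), c)
a3 = zero_last_axis(np.ones((2, 3, 5)), c)

# np.where broadcasts the mask like an arithmetic operand and returns a copy.
a3_where = np.where(c, 0, np.ones((2, 3, 5)))
```

The in-place version changes the input array, while the np.where version 
leaves it untouched, so which one fits depends on the routine.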

Thanks in advance for any advice,

Thomas
_______________________________________________
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion
