On Thu, 2020-08-20 at 16:50 -0500, Sebastian Berg wrote:
> On Thu, 2020-08-20 at 12:21 -0600, Aaron Meurer wrote:
> > You're right. I was confusing the broadcasting logic for boolean
> > arrays.
> >
> > However, I did find this example
> >
> > >>> np.arange(10).reshape((2, 5))[np.array([[0, 0, 0, 0, 0]], dtype=np.int64), False]
> > Traceback (most recent call last):
> >   File "<stdin>", line 1, in <module>
> > IndexError: shape mismatch: indexing arrays could not be broadcast
> > together with shapes (1,5) (0,)
> >
> > That certainly seems to imply there is some broadcasting being
> > done.
>
> Yes, it broadcasts the array after converting it with `nonzero`, i.e.
> it's much the same as:
>
>     indices = [[0, 0, 0, 0, 0]], *np.nonzero(False)
>     indices = np.broadcast_arrays(*indices)
>
> will give the same result (see also `np.ix_` which converts booleans
> as well for this reason, to give you outer indexing).
> I was half way through a mock-up/pseudo code, but thought you likely
> wasn't sure it was ending up clear. It sounds like things are probably
> falling into place for you (if they are not, let me know what might
> help you):
Sorry, editing error up there. In short, I hope those steps make sense
to you. Note that the broadcasting is basically part of a later
"integer only" indexing step, and the `nonzero` part is pre-processing.

>
> 1. Convert all boolean indices into a series of integer indices using
>    `np.nonzero(index)`
>
> 2. For True/False scalars, that doesn't work with `np.nonzero()`:
>
>    `nonzero` gave us an index array (which is good, we obviously want
>    one), but we need to index into `boolean_index.ndim == 0`
>    dimensions!
>    So that won't work; the approach using `nonzero` cannot generalize
>    here, although boolean indices generalize perfectly.
>
>    The solution to the dilemma is simple: If we have to index one
>    dimension, but should be indexing zero, then we simply add that
>    dimension to the original array (or at least pretend there was
>    an additional dimension).
>
> 3. Do normal indexing with the result, *including broadcasting*;
>    we forget it was converted.
>
> The other way to solve it would be to always reshape the original
> array to combine all axes being indexed by a single boolean index
> into one axis and then index it using `np.flatnonzero`. (But that
> would get a different result if you try to broadcast!)
>
>
> In any case, I am not sure I would bother with making sense of this,
> except for sport!
> It's pretty much nonsense and I think the time understanding it is
> probably better spent deprecating it. The only reason I did not
> deprecate it before is that I tried to be minimal in the changes
> when I rewrote advanced indexing (and generalized boolean scalars
> correctly) long ago. That was likely the right start/choice at the
> time, since there were much bigger fish to catch, but I do not think
> anything is holding us back now.
>
> Cheers,
>
> Sebastian
>
> >
> > Aaron Meurer
> >
> > On Wed, Aug 19, 2020 at 6:55 PM Sebastian Berg
> > <sebast...@sipsolutions.net> wrote:
> > > On Wed, 2020-08-19 at 18:07 -0600, Aaron Meurer wrote:
> > > > > > 3.
> > > > > >    If you have multiple advanced indices you get annoying
> > > > > >    broadcasting of all of these. That is *always* confusing
> > > > > >    for boolean indices.
> > > > > >    0-D should not be too special there...
> > > >
> > > > OK, now that I am learning more about advanced indexing, this
> > > > statement is confusing to me. It seems that scalar boolean
> > > > indices do not broadcast. For example:
> > >
> > > Well, broadcasting means you broadcast the *nonzero result*,
> > > unless I am very confused... There is a reason I dismissed it.
> > > We could (and arguably should) just deprecate it. And I have
> > > doubts anyone would even notice.
> > >
> > > > >>> np.arange(2)[False, np.array([True, False])]
> > > > array([], dtype=int64)
> > > > >>> np.arange(2)[tuple(np.broadcast_arrays(False, np.array([True, False])))]
> > > > Traceback (most recent call last):
> > > >   File "<stdin>", line 1, in <module>
> > > > IndexError: too many indices for array: array is 1-dimensional,
> > > > but 2 were indexed
> > > >
> > > > And indeed, the docs even say, as you noted, "the nonzero
> > > > equivalence for Boolean arrays does not hold for zero
> > > > dimensional boolean arrays," which I guess also applies to the
> > > > broadcasting.
> > >
> > > I actually think that probably also holds. Nonzero just behaves
> > > weirdly for 0-D arrays (because it returns a tuple).
> > > But since broadcasting the nonzero result is so weird, and since
> > > 0-D booleans require some additional logic and don't generalize
> > > 100% (code wise), I won't rule out there are differences.
> > >
> > > > From what I can tell, the logic is that all integer and boolean
> > > > arrays
> > >
> > > Did you try that? Because as I said above, IIRC broadcasting the
> > > boolean array without first calling `nonzero` isn't really what's
> > > going on.
> > > And I don't know how it could be what's going on, since adding
> > > dimensions to a boolean index would have many more implications?
> > >
> > > - Sebastian
> > >
> > >
> > > > (and scalar ints) are broadcast together, *except* for boolean
> > > > scalars. Then the first boolean scalar is replaced with and(all
> > > > boolean scalars) and the rest are removed from the index. Then
> > > > that index adds a length 1 axis if it is True and 0 if it is
> > > > False.
> > > >
> > > > So they don't broadcast, but rather "fake broadcast". I still
> > > > contend that it would be much more useful if True were a
> > > > synonym for newaxis and False worked like newaxis but instead
> > > > added a length 0 axis.
> > > > Alternately, True and False scalars should behave exactly like
> > > > all other boolean arrays with no exceptions (i.e., work like
> > > > np.nonzero(), broadcast, etc.). This would be less useful, but
> > > > more consistent.
> > > >
> > > > Aaron Meurer
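
For anyone following along, the three steps quoted above (convert
booleans with `nonzero`, pretend there is an extra dimension for 0-D
booleans, then do ordinary integer indexing with broadcasting) could be
sketched roughly like this. It is only an illustration under
assumptions, not NumPy's actual implementation: the helper name
`emulate_boolean_index` is made up, it accepts only integer and boolean
entries (no slices, `np.newaxis` or `Ellipsis`), and it uses
`entry[np.newaxis]` in place of calling `np.nonzero` on a 0-D value.

```python
import numpy as np


def emulate_boolean_index(arr, index):
    """Hypothetical helper: rewrite boolean entries as integer index arrays.

    `index` is a tuple whose entries are integers, integer arrays,
    boolean arrays, or True/False scalars.
    """
    arr = np.asarray(arr)
    new_index = []
    axis = 0  # axis of the (possibly expanded) array the next entry indexes
    for entry in index:
        entry = np.asarray(entry)
        if entry.dtype == bool and entry.ndim == 0:
            # Step 2: a 0-D boolean should index zero dimensions, so add a
            # length-1 axis to the array and index that with a 1-D boolean.
            arr = np.expand_dims(arr, axis)
            new_index.extend(np.nonzero(entry[np.newaxis]))
            axis += 1
        elif entry.dtype == bool:
            # Step 1: an N-D boolean array becomes N integer index arrays.
            new_index.extend(np.nonzero(entry))
            axis += entry.ndim
        else:
            # Integer entries are passed through unchanged.
            new_index.append(entry)
            axis += 1
    # Step 3: plain integer indexing, which broadcasts all index arrays.
    return arr[tuple(new_index)]


# The example from earlier in the thread: an empty result of shape (0,).
result = emulate_boolean_index(np.arange(2), (False, np.array([True, False])))
print(result.shape)

# The (1, 5) integer index cannot broadcast against the empty (0,) index.
try:
    emulate_boolean_index(np.arange(10).reshape((2, 5)),
                          (np.array([[0, 0, 0, 0, 0]]), False))
except IndexError as exc:
    print("IndexError:", exc)
```

With this sketch, `False` turns into an empty integer index on an added
length-1 axis, which is where the `(0,)` shape in the broadcast error
comes from.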
_______________________________________________ NumPy-Discussion mailing list NumPy-Discussion@python.org https://mail.python.org/mailman/listinfo/numpy-discussion