For ndindex (https://quansight.github.io/ndindex/), the biggest issue
with the API is that to use an ndindex object to actually index an
array, you have to use a[idx.raw] instead of a[idx]. This is because
for NumPy arrays, you cannot allow custom objects to be indices. The
exception is objects that define __index__, but this only works for
integer indices. If __index__ returns anything other than an integer,
you get an IndexError. This is annoying because it's easy to forget to
do this when working with the ndindex API, and the error message from
NumPy isn't informative about what went wrong unless you know to
expect it.

I'd like to propose an API that would allow custom objects to define
how they should be converted to a standard NumPy index, similar to
__index__ but that supports all index types. I think there are two
options here:

- Allow __index__ to return any index type, not just integers. This is
the simplest because it reuses an existing API, and __index__ is the
best possible name for this API. However, I'm not sure, but this may
actually conflict with the text of PEP 357
(https://www.python.org/dev/peps/pep-0357/). Also, some other APIs use
__index__ to check if something is an indexable integer, which
wouldn't accept generic index. For example, elements of a slice can be
any object that defines __index__.

- Add a new __numpy_index__ API that works like

def __numpy_index__(self):
    return <tuple, integer, slice, newaxis, ellipsis, or integer or
boolean array>

In NumPy, __getitem__ and __setitem__ on ndarray would first check if
the input index type is one of the known types as it currently does,
then it would try __index__, and if neither of those fails, it would
call __numpy_index__(index) and use that.

Note: there is a more general way that NumPy arrays could allow
__getitem__ to be defined on custom objects, which I am NOT proposing.
Instead of an API that returns one of the current predefined index
types (tuple, integer, slice, newaxis, ellipsis, or integer or boolean
array), there could instead be an API that takes the array as input
and returns another array (or view) as an output. This would allow an
object to define itself as an index in arbitrary ways, even if such an
index would not actually be possible via traditional indexing. There
are definitely some interesting ideas that could be done with this,
but this idea would be much more complicated, and isn't something that
I need. Unless the community feels that a more general API like this
would be preferred, I would suggest deferring something like it to a
later discussion.

What would be the best way to go about getting something like this
implemented? Is it simple enough that we can just work out the details
here and on a pull request, or should I write a NEP?

Aaron Meurer
_______________________________________________
NumPy-Discussion mailing list
NumPy-Discussion@python.org
https://mail.python.org/mailman/listinfo/numpy-discussion

Reply via email to