kohr-h opened a new issue #13166: [Python] __getitem__ copies too often
URL: https://github.com/apache/incubator-mxnet/issues/13166
 
 
   ## Description
   
   While working on #13134 I noticed that `NDArray.__getitem__` does not really 
do a good job of determining whether or not a view instead of a copy can be 
returned.
   
   For instance, wrapping any indexing object into a 1-element tuple should 
have no effect at all (at least if one wants to be consistent with NumPy), but 
the wrapping currently leads to a copy.
   Or, indexing a 3D array as `array[0]` should be equivalent to `array[0, :, 
:]`, but the latter copies while the former does not.
   
   Here are some examples:
   ```py
   def getitem_returns_view(x, idx):
       tmp = x.copy()
       tmp[:] = 0
       y = tmp[idx]
       y[:] = 1
       return all(yi == xi for yi, xi in zip(y.reshape((-1)), 
tmp[idx].reshape((-1,))))
   
   x = nd.zeros((3, 4, 5))
   
   ## Integers
   
   # Okay
   getitem_returns_view(x, 0)  # True
   getitem_returns_view(x.asnumpy(), 0)  # True
   
   # NOT okay
   getitem_returns_view(x, (0,))  # False
   getitem_returns_view(x.asnumpy(), (0,))  # True
   
   # Same with 1 instead of 0
   
   ## Slices
   
   # Okay
   getitem_returns_view(x, slice(None))  # True
   getitem_returns_view(x.asnumpy(), slice(None))  # True
   
   # NOT okay
   getitem_returns_view(x, (slice(None),))  # True
   getitem_returns_view(x.asnumpy(), (slice(None),))  # False
   
   # NOT okay
   getitem_returns_view(x, (0, slice(None)))  # True
   getitem_returns_view(x.asnumpy(), (0, slice(None)))  # False
   
   # NOT okay
   getitem_returns_view(x, (0, slice(None), slice(None)))  # True
   getitem_returns_view(x.asnumpy(), (0, slice(None), slice(None)))  # False
   
   # NOT at all okay
   getitem_returns_view(x, (slice(None), slice(None), slice(None)))  # False
   getitem_returns_view(x.asnumpy(), (slice(None), slice(None), slice(None)))  
# False
   ```
   
   In particular the last case `array[:, :, :]`, which is explicitly 
*everything*, should definitely not copy.
   
   ## Suggestion
   
   Instead of the current way of special-casing integers, lists and slice 
objects, I think it would be more robust to always sanitize the indexing 
object, in particular:
   
   - If it's not a tuple, wrap it in a 1-tuple.
   - Expand it to length `array.ndim` by adding `slice(None)` to the right.
   
   Then the subsequent logic can look at all the entries in the tuple and 
determine whether the result will be contiguous (is it correct that the backend 
does not yet support strided views?). That's not entirely trivial but certainly 
doable.
   
   Of course, things get a little bit more involved if `None` (new axis) and 
`Ellipsis` (`...`) are allowed as well. But the code in 
https://github.com/apache/incubator-mxnet/pull/13143 already does the above 
sanitization for that case, it just doesn't always use it to avoid copies.

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
[email protected]


With regards,
Apache Git Services

Reply via email to