kohr-h opened a new issue #13166: [Python] __getitem__ copies too often URL: https://github.com/apache/incubator-mxnet/issues/13166 ## Description While working on #13134 I noticed that `NDArray.__getitem__` does not really do a good job of determining whether or not a view instead of a copy can be returned. For instance, wrapping any indexing object into a 1-element tuple should have no effect at all (at least if one wants to be consistent with NumPy), but the wrapping currently leads to a copy. Or, indexing a 3D array as `array[0]` should be equivalent to `array[0, :, :]`, but the latter copies while the former does not. Here are some examples: ```py def getitem_returns_view(x, idx): tmp = x.copy() tmp[:] = 0 y = tmp[idx] y[:] = 1 return all(yi == xi for yi, xi in zip(y.reshape((-1)), tmp[idx].reshape((-1,)))) x = nd.zeros((3, 4, 5)) ## Integers # Okay getitem_returns_view(x, 0) # True getitem_returns_view(x.asnumpy(), 0) # True # NOT okay getitem_returns_view(x, (0,)) # False getitem_returns_view(x.asnumpy(), (0,)) # True # Same with 1 instead of 0 ## Slices # Okay getitem_returns_view(x, slice(None)) # True getitem_returns_view(x.asnumpy(), slice(None)) # True # NOT okay getitem_returns_view(x, (slice(None),)) # True getitem_returns_view(x.asnumpy(), (slice(None),)) # False # NOT okay getitem_returns_view(x, (0, slice(None))) # True getitem_returns_view(x.asnumpy(), (0, slice(None))) # False # NOT okay getitem_returns_view(x, (0, slice(None), slice(None))) # True getitem_returns_view(x.asnumpy(), (0, slice(None), slice(None))) # False # NOT at all okay getitem_returns_view(x, (slice(None), slice(None), slice(None))) # False getitem_returns_view(x.asnumpy(), (slice(None), slice(None), slice(None))) # False ``` In particular the last case `array[:, :, :]`, which is explicitly *everything*, should definitely not copy. ## Suggestion Instead of the current way of special-casing integers, lists and slice objects, I think it would be more robust to always sanitize the indexing object, in particular: - If it's not a tuple, wrap it in a 1-tuple. - Expand it to length `array.ndim` by adding `slice(None)` to the right. Then the subsequent logic can look at all the entries in the tuple and determine whether the result will be contiguous (is it correct that the backend does not yet support strided views?). That's not entirely trivial but certainly doable. Of course, things get a little bit more involved if `None` (new axis) and `Ellipsis` (`...`) are allowed as well. But the code in https://github.com/apache/incubator-mxnet/pull/13143 already does the above sanitization for that case, it just doesn't always use it to avoid copies.
---------------------------------------------------------------- This is an automated message from the Apache Git Service. To respond to the message, please log on GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: [email protected] With regards, Apache Git Services
