reminisce opened a new pull request #14661: [numpy] Support zero-dim and zero-size tensors in MXNet
URL: https://github.com/apache/incubator-mxnet/pull/14661

## Description ##
This PR provides the infrastructure for supporting zero-dim and zero-size tensors, the first outcome of the initiative to introduce a NumPy-compatible coding experience into MXNet (see this [RFC](https://github.com/apache/incubator-mxnet/issues/14253)).

Many thanks to the folks who contributed design, implementation, and code review to this feature. The credit goes to all of them. Sorry if I missed anyone below. It would have been impossible to get so many things right, scattered all over the codebase, with just a couple of hands.
- C++ backend and Python: @junrushao1994 @eric-haibin-lin @szha @zheng-da @wkcn @mli
- MKLDNN: @pengzhao-intel @TaoLv @lihaofd @ZhennanQin
- Scala: @yzhliu
- R: @hetong007
- Perl: @sergeykolychev

## FAQ
### What are zero-dim and zero-size tensors?
- Zero-dim tensors are scalars with shape equal to `()`.
- A zero-size tensor is one whose shape has at least one dimension of size `0`. For example, `np.ones((1, 0, 3))` generates a zero-size tensor with shape `(1, 0, 3)`.

### Why are they important?
Mathematically speaking, their presence keeps tensor operations complete and consistent. For example, given `x = mx.nd.array([1, 2, 3])`, `x[0]` should return `1` instead of `[1]`, and `x[0:0]` should return `[]` with shape `(0,)`. Zero-size tensors can also simplify code logic by serving as placeholders for accumulations or aggregations.

**I find [this thread](https://discuss.mxnet.io/t/rank-0-arrays-in-mxnet-aka-pi-is-wrong/108) provides very good insights on the importance of zero-dim tensors.**

### How are they implemented?
In the backend (C++), we use `ndim = -1` to represent unknown shapes and `dim_size = -1` for unknown dim sizes. Before, they were represented by `0`.

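To make the change of convention concrete, here is a minimal pure-Python sketch of how a legacy shape could be mapped to the new representation. It is only an illustration of the convention described above, not the backend's actual C++ code; the names `UNKNOWN`, `legacy_to_np_shape`, and `shape_size` are hypothetical, and `None` stands in for a shape whose `ndim` is unknown.

```python
from typing import Optional, Tuple

# Hypothetical marker: in the new convention, -1 means "unknown".
UNKNOWN = -1


def legacy_to_np_shape(shape: Tuple[int, ...]) -> Optional[Tuple[int, ...]]:
    """Map a legacy shape (0 = unknown) to the new convention (-1 = unknown).

    A legacy empty shape `()` meant "completely unknown", so it maps to None.
    """
    if len(shape) == 0:
        return None
    return tuple(UNKNOWN if dim == 0 else dim for dim in shape)


def shape_size(shape: Optional[Tuple[int, ...]]) -> Optional[int]:
    """Return the element count of a shape in the new convention.

    Returns None when the shape (or any of its dims) is unknown.
    """
    if shape is None or any(dim == UNKNOWN for dim in shape):
        return None
    size = 1
    for dim in shape:
        size *= dim
    return size
```

Under the new convention, `()` is a fully known scalar shape holding one element, and `(1, 0, 3)` is a fully known shape holding zero elements. Neither can be expressed when `0` doubles as the "unknown" marker.
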
### Is backward compatibility guaranteed in this PR?
Yes, backward compatibility is guaranteed by default. That means `0` still represents an unknown `ndim` or dim size in every frontend language binding. It is just converted to `-1` in the shape inference logic of the backend.

### How to enable zero-dim or zero-size tensors?
Since we are committed to keeping backward compatibility, we provide APIs that let users decide whether to opt in to this NumPy compatibility. Users can call `mx.set_np_comp(True)` to opt in and `mx.set_np_comp(False)` to opt out. Or, more safely, use the `with` statement. To turn it on:
```python
import mxnet as mx

with mx.enable_np_comp():
    # A scalar tensor's shape is `()`, whose `ndim` is `0`.
    scalar = mx.nd.ones(shape=())
    assert scalar.shape == ()

    # In NumPy compatible mode, 0 in a shape means that dimension contains zero elements.
    data = mx.sym.var("data", shape=(0, 2, 3))
    ret = mx.sym.sin(data)
    arg_shapes, out_shapes, _ = ret.infer_shape()
    assert arg_shapes[0] == (0, 2, 3)
    assert out_shapes[0] == (0, 2, 3)

    # -1 means unknown shape dimension size in the new NumPy-compatible shape definition
    data = mx.sym.var("data", shape=(-1, 2, 3))
    ret = mx.sym.sin(data)
    arg_shapes, out_shapes, _ = ret.infer_shape_partial()
    assert arg_shapes[0] == (-1, 2, 3)
    assert out_shapes[0] == (-1, 2, 3)

    # When a shape is completely unknown in NumPy-compatible mode, it is
    # represented as `None` in Python.
    data = mx.sym.var("data")
    ret = mx.sym.sin(data)
    arg_shapes, out_shapes, _ = ret.infer_shape_partial()
    assert arg_shapes[0] is None
    assert out_shapes[0] is None
```
or to turn it off:
```python
import mxnet as mx

with mx.disable_np_comp():
    # 0 means unknown shape dimension size in the legacy shape definition.
    data = mx.sym.var("data", shape=(0, 2, 3))
    ret = mx.sym.sin(data)
    arg_shapes, out_shapes, _ = ret.infer_shape_partial()
    assert arg_shapes[0] == (0, 2, 3)
    assert out_shapes[0] == (0, 2, 3)

    # When a shape is completely unknown in the legacy mode (default), its ndim is
    # equal to 0 and it is represented as `()` in Python.
    data = mx.sym.var("data")
    ret = mx.sym.sin(data)
    arg_shapes, out_shapes, _ = ret.infer_shape_partial()
    assert arg_shapes[0] == ()
    assert out_shapes[0] == ()
```

### Does this mean that every existing operator should support zero-dim or zero-size tensors now?
Please note that the existing operators were implemented when these two types of tensors were not supported in MXNet. Their implementations may rely on strong assumptions and hence lead to errors when NumPy compatibility is turned on. This PR only provides the infrastructure for supporting zero-dim and zero-size tensors in the backend. It does not guarantee that every existing operator handles zero-dim/size tensors as correctly as NumPy does. As discussed in the [RFC](https://github.com/apache/incubator-mxnet/issues/14253), we are going to implement NumPy operators under `mxnet.numpy` that fully support these two types of tensors.
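
As an illustration of the kind of assumption an existing operator might carry, the pure-Python sketch below contrasts a legacy-style "is this shape unknown?" check with a NumPy-compatible one. Both helpers are hypothetical and are not taken from the MXNet codebase; they only show why a check written under the old convention can misclassify the new tensor types.

```python
from typing import Optional, Tuple


def num_elements(shape: Tuple[int, ...]) -> int:
    """Product of all dim sizes; the empty product is 1."""
    size = 1
    for dim in shape:
        size *= dim
    return size


def legacy_shape_is_unknown(shape: Tuple[int, ...]) -> bool:
    """Hypothetical legacy-style check: 0 anywhere (or ndim == 0) means unknown."""
    return len(shape) == 0 or num_elements(shape) == 0


def np_shape_is_known(shape: Optional[Tuple[int, ...]]) -> bool:
    """Hypothetical NumPy-compatible check: only None or a -1 dim means unknown."""
    return shape is not None and all(dim >= 0 for dim in shape)


# A scalar shape and a zero-size shape are both fully known in NumPy terms,
# yet the legacy-style check would treat them as "not inferred yet".
for shape in [(), (1, 0, 3)]:
    assert legacy_shape_is_unknown(shape)
    assert np_shape_is_known(shape)
```

An operator that guards its computation with a check like the legacy one would reject, or skip shape inference for, exactly the tensors this PR makes representable.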
