reminisce opened a new pull request #14661: [numpy] Support zero-dim and 
zero-size tensors in MXNet
URL: https://github.com/apache/incubator-mxnet/pull/14661
 
 
   ## Description ##
   This PR provides the infrastructure for supporting zero-dim and zero-size 
tensors in MXNet, as the first outcome of the initiative to bring a NumPy-compatible 
coding experience to MXNet (see this 
[RFC](https://github.com/apache/incubator-mxnet/issues/14253)). Many thanks to 
the folks who contributed design, implementation, and code review to this 
feature; the credit goes to all of them (apologies if I missed anyone below). 
It would have been impossible to get so many changes, scattered all over the 
codebase, right with just a couple of hands.
   - C++ backend and Python: @junrushao1994 @eric-haibin-lin @szha @zheng-da 
@wkcn @mli 
   - MKLDNN: @pengzhao-intel @TaoLv @lihaofd @ZhennanQin
   - Scala: @yzhliu 
   - R: @hetong007 
   - Perl: @sergeykolychev 
   
   ## FAQ
   ### What are zero-dim and zero-size tensors?
   - Zero-dim tensors are scalars with shape equal to `()`.
   - A zero-size tensor is one whose shape has at least one dimension of size 
`0`. For example, `np.ones((1, 0, 3))` generates a zero-size tensor with shape 
`(1, 0, 3)`.
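
   These definitions follow NumPy's semantics, which can be illustrated directly with plain NumPy (a NumPy sketch of the conventions, not the MXNet API):
   ```python
import numpy as np

# A zero-dim (scalar) tensor: shape is (), ndim is 0, yet it holds one element.
scalar = np.array(3.14)
assert scalar.shape == ()
assert scalar.ndim == 0
assert scalar.size == 1

# A zero-size tensor: at least one dimension is 0, so it holds no elements.
empty = np.ones((1, 0, 3))
assert empty.shape == (1, 0, 3)
assert empty.size == 0
   ```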
   
   ### Why are they important?
   Mathematically speaking, their presence preserves the completeness and 
consistency of tensor operations. For example, given `x = 
mx.nd.array([1, 2, 3])`, `x[0]` should return `1` instead of `[1]`, and 
`x[0:0]` should return `[]` with shape `(0,)`. Zero-size tensors are also 
convenient for simplifying code logic, e.g. as placeholders for accumulations 
or aggregations.
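
   The behavior described above can be checked against NumPy itself (shown here with NumPy, since the PR brings these same semantics to MXNet):
   ```python
import numpy as np

x = np.array([1, 2, 3])

# Integer indexing returns a zero-dim scalar, not a 1-element array.
assert x[0] == 1
assert x[0].shape == ()

# An empty slice returns a zero-size tensor of shape (0,).
assert x[0:0].shape == (0,)

# A zero-size tensor is a natural starting placeholder for accumulation.
acc = np.empty((0,))
for chunk in ([1.0, 2.0], [3.0]):
    acc = np.concatenate([acc, chunk])
assert acc.tolist() == [1.0, 2.0, 3.0]
   ```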
   
   **[This 
thread](https://discuss.mxnet.io/t/rank-0-arrays-in-mxnet-aka-pi-is-wrong/108) 
provides very good insights into the importance of zero-dim tensors.**
   
   ### How are they implemented?
   In the backend (C++), we use `ndim = -1` to represent unknown shapes and 
`dim_size = -1` for unknown dimension sizes. Previously, both were represented 
by `0`.
   
   ### Is backward compatibility guaranteed in this PR?
   Yes, backward compatibility is guaranteed by default. That means `0` 
still represents an unknown `ndim` or dimension size in all frontend language 
bindings; it is simply converted to `-1` in the backend's shape inference logic.
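
   A minimal sketch of what such a legacy-to-new conversion could look like (a hypothetical helper for illustration, not the actual backend code):
   ```python
def legacy_to_np_shape(shape):
    """Convert a legacy frontend shape tuple to the new convention.

    Hypothetical illustration: in the legacy convention, () means a fully
    unknown shape and 0 means an unknown dimension size; the new convention
    uses ndim = -1 and dim_size = -1 for those, respectively.
    """
    if shape == ():          # legacy "completely unknown" shape
        return None          # stands in for ndim = -1 in the backend
    return tuple(-1 if d == 0 else d for d in shape)

assert legacy_to_np_shape(()) is None
assert legacy_to_np_shape((0, 2, 3)) == (-1, 2, 3)
assert legacy_to_np_shape((4, 5)) == (4, 5)
   ```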
   
   ### How to enable zero-dim or zero-size tensors?
   Since we are committed to keeping backward compatibility, we provide APIs 
that let users decide whether to opt in to this NumPy compatibility. Users can 
call `mx.set_np_comp(True)` to opt in and `mx.set_np_comp(False)` to opt out, 
or, more safely, use the `with` statement. To turn it on:
   ```python
    with mx.enable_np_comp():
        # A scalar tensor's shape is `()`, whose `ndim` is `0`.
        scalar = mx.nd.ones(shape=())
        assert scalar.shape == ()
    
        # In NumPy-compatible mode, 0 in a shape means that dimension contains zero elements.
        data = mx.sym.var("data", shape=(0, 2, 3))
        ret = mx.sym.sin(data)
        arg_shapes, out_shapes, _ = ret.infer_shape()
        assert arg_shapes[0] == (0, 2, 3)
        assert out_shapes[0] == (0, 2, 3)
    
        # -1 means unknown shape dimension size in the new NumPy-compatible shape definition.
        data = mx.sym.var("data", shape=(-1, 2, 3))
        ret = mx.sym.sin(data)
        arg_shapes, out_shapes, _ = ret.infer_shape_partial()
        assert arg_shapes[0] == (-1, 2, 3)
        assert out_shapes[0] == (-1, 2, 3)
    
        # When a shape is completely unknown in NumPy-compatible mode, it is
        # represented as `None` in Python.
        data = mx.sym.var("data")
        ret = mx.sym.sin(data)
        arg_shapes, out_shapes, _ = ret.infer_shape_partial()
        assert arg_shapes[0] is None
        assert out_shapes[0] is None
   ```
   or to disable this:
   ```python
    with mx.disable_np_comp():
        # 0 means unknown shape dimension size in the legacy shape definition.
        data = mx.sym.var("data", shape=(0, 2, 3))
        ret = mx.sym.sin(data)
        arg_shapes, out_shapes, _ = ret.infer_shape_partial()
        assert arg_shapes[0] == (0, 2, 3)
        assert out_shapes[0] == (0, 2, 3)
    
        # When a shape is completely unknown in the legacy mode (default),
        # its ndim is equal to 0 and it is represented as `()` in Python.
        data = mx.sym.var("data")
        ret = mx.sym.sin(data)
        arg_shapes, out_shapes, _ = ret.infer_shape_partial()
        assert arg_shapes[0] == ()
        assert out_shapes[0] == ()
   ```
   ### Does this mean that every existing operator now supports zero-dim or 
zero-size tensors?
   Please note that the existing operators were implemented when these two 
types of tensors were not supported in MXNet. Strong assumptions may have been 
made in their implementations, which can lead to errors when NumPy 
compatibility is turned on. This PR only provides the infrastructure for 
supporting zero-dim and zero-size tensors in the backend; it does not guarantee 
that every existing operator handles zero-dim/size tensors exactly as NumPy 
does. As discussed in the 
[RFC](https://github.com/apache/incubator-mxnet/issues/14253), we are going to 
implement NumPy operators under `mxnet.numpy` that will fully support these 
two types of tensors.
