Nick Guletskii created MXNET-1382:
-------------------------------------

             Summary: IndexArray operator
                 Key: MXNET-1382
                 URL: https://issues.apache.org/jira/browse/MXNET-1382
             Project: Apache MXNet
          Issue Type: New Feature
          Components: Apache MXNet Backend
            Reporter: Nick Guletskii


h1. Description

IndexArray is an operator that returns an array containing the index of each element of the input array.


 For an input array with shape (d_1, d_2, ..., d_n), _index_array_ returns a 
(d_1, d_2, ..., d_n, n) array _idx_, where idx[i_1, i_2, ..., i_n, :] = [i_1, 
i_2, ..., i_n].
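
The definition above can be illustrated with a small NumPy sketch. This is only a stand-in for the proposed operator, not its implementation: the helper name {{index_array_reference}} is hypothetical, and the int64 output type is an assumption.

```python
import numpy as np


def index_array_reference(data):
    """NumPy stand-in for the proposed semantics (hypothetical helper)."""
    # np.indices returns shape (n, d_1, ..., d_n); move the leading axis to
    # the end so the result has shape (d_1, ..., d_n, n).
    return np.moveaxis(np.indices(data.shape), 0, -1).astype(np.int64)


x = np.zeros((2, 3))
idx = index_array_reference(x)
print(idx.shape)  # (2, 3, 2)
print(idx[1, 2])  # [1 2]
```

Each innermost vector simply repeats the position it is stored at, so only the shape of the input matters, not its values.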

Additionally, when the parameter _axes_ is specified, _idx_ will be a
(d_1, d_2, ..., d_n, m) array, where _m_ is the length of _axes_, and the
following equality will hold: idx[i_1, i_2, ..., i_n, j] = i_\{axes[j]}.
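
As a rough illustration of the _axes_ variant, the sketch below selects components of the full index vector with NumPy. The helper name {{index_array_with_axes}} is hypothetical, and treating negative axes as counting from the last axis is an assumption based on the usage examples in this issue.

```python
import numpy as np


def index_array_with_axes(data, axes=None):
    """NumPy stand-in for index_array with an optional axes argument."""
    # Full index array of shape (d_1, ..., d_n, n).
    full = np.moveaxis(np.indices(data.shape), 0, -1)
    if axes is None:
        return full
    # Pick (and possibly repeat) components: idx[..., j] = i_{axes[j]}.
    # Negative entries are assumed to count from the last axis.
    return full[..., list(axes)]


x = np.zeros((2, 3, 4))
print(index_array_with_axes(x, axes=(0, 2)).shape)            # (2, 3, 4, 2)
print(index_array_with_axes(x, axes=(0, 2))[1, 2, 3])         # [1 3]
print(index_array_with_axes(x, axes=(-2, -1, -2, -1))[1, 2, 3])  # [2 3 2 3]
```

Repeating an axis in _axes_ is allowed in this sketch, which is what the prior box example relies on.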
h1. Motivation

This operator can be used to generate meshgrids for tensors without knowing
their exact shapes during construction. For instance, it can be used to build
a makeshift prior box generator for anchor-based computer vision models:
{code:python}
# N x H x W x C; no shape information when using the Symbol API.
feature_map = F.ones((8, 128, 128, 256))
prior_box_stride = 16
box_size = [8, 8]

# Drop the channel axis to get an N x H x W template.
template = F.squeeze(
    F.slice_axis(feature_map, begin=0, end=1, axis=-1), axis=-1)
# Indices along the last two axes, repeated twice: N x H x W x 4
box_centres = F.contrib.index_array(
    template, axes=(-2, -1, -2, -1)).astype("float32")
# Scale the indices by the stride: N x H x W x 4
box_centres = F.broadcast_mul(
    box_centres, F.array([prior_box_stride]).reshape((1, 1, 1, 1)))
# Half-size offsets (-, -, +, +): 1 x 1 x 1 x 4
corner_offsets = F.array(box_size).reshape((1, 1, 1, 2))
corner_offsets = F.concat(-corner_offsets / 2, corner_offsets / 2, dim=-1)
box_corners = F.broadcast_plus(box_centres, corner_offsets)
{code}
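
To make the expected numbers concrete, here is a NumPy sketch of the same computation for a single H x W feature map. It is an illustration under assumptions, not the MXNet code path: the helper name {{prior_boxes}} is hypothetical, and the batch axis is left out for brevity.

```python
import numpy as np


def prior_boxes(height, width, stride, box_size):
    """NumPy sketch of the box computation for one H x W feature map."""
    # Index array over (H, W): shape H x W x 2 with entries (h, w).
    idx = np.moveaxis(np.indices((height, width)), 0, -1)
    # Repeat the (h, w) pair, mirroring axes=(-2, -1, -2, -1): H x W x 4.
    centres = np.concatenate([idx, idx], axis=-1).astype(np.float32) * stride
    half = np.asarray(box_size, dtype=np.float32) / 2
    offsets = np.concatenate([-half, half])  # (-, -, +, +)
    return centres + offsets  # offsets broadcast over H x W


boxes = prior_boxes(height=2, width=3, stride=16, box_size=[8, 8])
print(boxes.shape)  # (2, 3, 4)
print(boxes[1, 2])  # [12. 28. 20. 36.]
```

The cell at row 1, column 2 has its centre at (16, 32), so an 8 x 8 box spans 12 to 20 along the first axis and 28 to 36 along the second.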
Also, this operator can be applied to implement positional encodings for 
sequence processing, e.g.:
{code:python}
# T x N x C; no shape information when using the Symbol API.
sequence_embeddings = F.ones((65, 8, 256))
# T x N x C -> T x N x (C/2) x 2
template = sequence_embeddings.reshape((0, 0, -1, 2))
# Indices along axes 0 and 2 of the template: T x N x (C/2) x 2 x 2
pos, i = F.split(
    F.contrib.index_array(template, axes=(0, 2)).astype("float32"),
    axis=-1,
    num_outputs=2,
    squeeze_axis=True
)  # T x N x (C/2) x 2 and T x N x (C/2) x 2
base = F.ones((1, 1, 1, 1)) * 10000
# Size of the last axis (C), read from the shape at run time.
dmodel = F.slice_axis(
    F.shape_array(sequence_embeddings), begin=-1, end=None, axis=0)
dmodel = dmodel.reshape((1, 1, 1, 1)).astype("float32")
# pos / base^(2i / dmodel): T x N x (C/2) x 2
tmp = F.broadcast_div(
    pos, F.broadcast_power(base, F.broadcast_div(2 * i, dmodel)))
# T x N x (C/2) and T x N x (C/2)
sin_input, cos_input = F.split(
    tmp, axis=-1, num_outputs=2, squeeze_axis=True)
# Pair up sin and cos, then merge the last two axes: T x N x C
positional_encoding = F.stack(
    F.sin(sin_input), F.cos(cos_input), axis=-1).reshape((0, 0, -3))
{code}
 


