Nick Guletskii created MXNET-1382:
-------------------------------------
Summary: IndexArray operator
Key: MXNET-1382
URL: https://issues.apache.org/jira/browse/MXNET-1382
Project: Apache MXNet
Issue Type: New Feature
Components: Apache MXNet Backend
Reporter: Nick Guletskii
h1. Description
IndexArray (_index_array_) is an operator that returns an array containing the
index of every element of the input array.
For an input array with shape (d_1, d_2, ..., d_n), _index_array_ returns a
(d_1, d_2, ..., d_n, n) array _idx_, where idx[i_1, i_2, ..., i_n, :] = [i_1,
i_2, ..., i_n].
When the parameter _axes_ is specified, _idx_ is instead a
(d_1, d_2, ..., d_n, m) array, where _m_ is the length of _axes_, and the
following equality holds: idx[i_1, i_2, ..., i_n, j] = i_\{axes[j]}.
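The definition above can be sketched in plain Python as a reference for the intended semantics. This is not the proposed backend implementation: the helper name index_array_reference is hypothetical, it takes a shape rather than an array, and it assumes zero-based axes where negative values count from the last axis, as the examples in this issue do. The result is a dict keyed by index tuple so that idx[(i_1, ..., i_n)] plays the role of idx[i_1, ..., i_n, :].

```python
from itertools import product


def index_array_reference(shape, axes=None):
    """Map every index tuple of `shape` to the row idx[i_1, ..., i_n, :].

    Hypothetical helper for illustration only; it mirrors the definition
    in the description, not any actual MXNet code.
    """
    n = len(shape)
    if axes is None:
        # Default: report the index along every axis, in order.
        axes = tuple(range(n))
    # Assumption: negative axes count from the end (e.g. -1 is the last axis).
    axes = tuple(a % n for a in axes)
    return {
        pos: [pos[a] for a in axes]
        for pos in product(*(range(d) for d in shape))
    }


idx = index_array_reference((2, 3))
print(idx[(1, 2)])

idx_axes = index_array_reference((2, 3, 4), axes=(-2, -1, -2, -1))
print(idx_axes[(1, 2, 3)])
```

With axes=(-2, -1, -2, -1), each row repeats the indices along the last two axes twice, which is the layout the prior box example relies on.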
h1. Motivation
This operator can be used to generate meshgrids for tensors without knowing
their exact shapes during construction. For instance, it can be used to build
a makeshift prior box generator for anchor-based computer vision models:
{code:java}
# N x H x W x C, no shape information when using the Symbol API.
feature_map = F.ones((8, 128, 128, 256))
prior_box_stride = 16
box_size = [8, 8]
# N x H x W
template = F.squeeze(
    F.slice_axis(feature_map, begin=0, end=1, axis=-1), axis=-1)
# N x H x W x 4
box_centres = F.contrib.index_array(
    template, axes=(-2, -1, -2, -1)).astype("float32")
# N x H x W x 4
box_centres = F.broadcast_mul(
    box_centres, F.array([prior_box_stride]).reshape((1, 1, 1, 1)))
corner_offsets = F.array(box_size).reshape((1, 1, 1, 2))
corner_offsets = F.concat(-corner_offsets/2, corner_offsets/2, dim=-1)
box_corners = F.broadcast_plus(box_centres, corner_offsets){code}
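To make the contents of box_corners concrete, the arithmetic for a single feature map cell can be sketched in plain Python. The helper name prior_box is hypothetical; the stride and box size defaults are taken from the snippet above, and the output order follows from axes=(-2, -1, -2, -1) combined with the concatenated offsets.

```python
def prior_box(h, w, stride=16, box_size=(8, 8)):
    """Return the four corner coordinates for cell (h, w).

    Hypothetical helper for illustration; it reproduces the per-cell
    arithmetic of the operator-based snippet, not an MXNet API.
    """
    # index_array with axes=(-2, -1, -2, -1) yields (h, w, h, w) per cell,
    # which is then scaled by the stride.
    centre = [h * stride, w * stride, h * stride, w * stride]
    # concat(-size / 2, size / 2) yields (-sh/2, -sw/2, +sh/2, +sw/2).
    offsets = [-box_size[0] / 2, -box_size[1] / 2,
               box_size[0] / 2, box_size[1] / 2]
    return [c + o for c, o in zip(centre, offsets)]


print(prior_box(2, 3))
```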
Also, this operator can be applied to implement positional encodings for
sequence processing, e.g.:
{code:java}
# T x N x C, no shape information when using the Symbol API.
sequence_embeddings = F.ones((65, 8, 256))
# T x N x C -> T x N x (C/2) x 2
template = sequence_embeddings.reshape((0, 0, -1, 2))
pos, i = F.split(
    # T x N x (C/2) x 2 x 2
    F.contrib.index_array(template, axes=(0, 2)).astype("float32"),
    axis=-1,
    num_outputs=2,
    squeeze_axis=True
)  # T x N x (C/2) x 2 and T x N x (C/2) x 2
base = F.ones((1, 1, 1, 1)) * 10000
dmodel = F.slice_axis(
    F.shape_array(sequence_embeddings), begin=-1, end=None, axis=0)
dmodel = dmodel.reshape((1, 1, 1, 1)).astype("float32")
# T x N x (C/2) x 2
tmp = F.broadcast_div(
    pos, F.broadcast_power(base, F.broadcast_div(2 * i, dmodel)))
# T x N x (C/2) and T x N x (C/2)
sin_input, cos_input = F.split(tmp, axis=-1, num_outputs=2, squeeze_axis=True)
# T x N x C
positional_encoding = F.stack(
    F.sin(sin_input), F.cos(cos_input), axis=-1).reshape((0, 0, -3)){code}
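For comparison, the per-position arithmetic that the snippet above evaluates can be sketched in plain Python. The helper name positional_encoding_row is hypothetical; the base of 10000 and the interleaved sin/cos layout are read off the snippet, and d_model is assumed to be even, as the reshape to (C/2) x 2 requires.

```python
import math


def positional_encoding_row(pos, d_model, base=10000.0):
    """Return the d_model encoding values for one sequence position.

    Hypothetical helper for illustration; channel 2*i holds the sine and
    channel 2*i + 1 holds the cosine of the same angle.
    """
    row = []
    for i in range(d_model // 2):
        # Same expression as broadcast_div(pos, base ** (2 * i / dmodel)).
        angle = pos / base ** (2 * i / d_model)
        row.extend([math.sin(angle), math.cos(angle)])
    return row


print(positional_encoding_row(0, 4))
```

Evaluating this for every position t in range(T) gives the T x C slice that each batch element of positional_encoding should contain.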