[
https://issues.apache.org/jira/browse/MXNET-376?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16708069#comment-16708069
]
Chaitanya Prakash Bapat commented on MXNET-376:
-----------------------------------------------
[~anirudhacharya]
It looks like the above code doesn't give the same result as TensorFlow for the same input.
{code:python}
import numpy as np
data = np.random.rand(2, 3, 4)
print(data)
# [[[0.18651888 0.29437149 0.45589573 0.55928469]
#   [0.1951702  0.50387834 0.38563502 0.49604304]
#   [0.12411261 0.67440557 0.50830552 0.67146303]]
#  [[0.52552375 0.06349533 0.78590277 0.36544202]
#   [0.6055249  0.317296   0.42477562 0.34462548]
#   [0.55422445 0.68032915 0.93749125 0.89412014]]]
{code}
The TensorFlow output is:
{code:python}
print(sess.run(tf.contrib.seq2seq.hardmax(tf.Variable(data))))
# [[[0. 0. 0. 1.]
#   [0. 1. 0. 0.]
#   [0. 1. 0. 0.]]
#  [[0. 0. 1. 0.]
#   [1. 0. 0. 0.]
#   [0. 0. 1. 0.]]]
{code}
However, your code gives:
{code:python}
import mxnet as mx
xn = mx.nd.array(data)
xn_r = mx.nd.reshape(xn, shape=(2, 12))
xn_e = mx.nd.eye(xn_r.shape[1], dtype=data.dtype)[mx.nd.argmax(xn_r, axis=1)]
hardmax_output = mx.nd.reshape(xn_e, shape=xn.shape)
print(hardmax_output)
# [[[0. 0. 0. 0.]
#   [0. 1. 0. 0.]
#   [0. 0. 0. 0.]]
#  [[0. 1. 0. 0.]
#   [0. 0. 0. 0.]
#   [0. 0. 0. 0.]]]
# <NDArray 2x3x4 @cpu(0)>
{code}
That is, because the array is reshaped to (2, 12), it picks a single maximum per outer slice (the first dimension, 2 in [2, 3, 4]). TensorFlow, on the other hand, takes the argmax along the last axis, producing a one-hot vector for every row (as expected for sequence-to-sequence / NLP applications).
What do you reckon?
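For reference, here is a minimal NumPy sketch of the TensorFlow-style behavior (one-hot of the argmax along the last axis). The name {{hardmax_lastaxis}} is just illustrative; this is not the implementation from the PR:
{code:python}
import numpy as np

def hardmax_lastaxis(x):
    # TensorFlow-style hardmax: one-hot encode the argmax along the last axis,
    # so every row along that axis gets exactly one 1.
    onehot = np.zeros_like(x)
    idx = np.argmax(x, axis=-1)
    np.put_along_axis(onehot, np.expand_dims(idx, axis=-1), 1.0, axis=-1)
    return onehot

data = np.random.rand(2, 3, 4)
out = hardmax_lastaxis(data)
# out has shape (2, 3, 4) and each of the 2*3 rows sums to 1
{code}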
In that case, I have implemented this Hardmax operator (similar to TensorFlow's) here:
https://github.com/apache/incubator-mxnet/pull/13083
> Hardmax
> -------
>
> Key: MXNET-376
> URL: https://issues.apache.org/jira/browse/MXNET-376
> Project: Apache MXNet
> Issue Type: Sub-task
> Reporter: Hao Jin
> Priority: Major
>
--
This message was sent by Atlassian JIRA
(v7.6.3#76005)