Shouldn't the masked softmax loss output be 
`[2.30126, 2.30126, 0]`? 
The first sequence has 4 valid elements, each with a loss of 2.30126, and the 
second sequence has 2 valid elements, each with a loss of 2.30126. 
Dividing each sum by its valid length gives 2.30126*4/4 = 2.30126 and 
2.30126*2/2 = 2.30126.
Instead, the sum seems to be divided by 4, the full sequence length including 
padding, which makes no sense to me!
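
To make the two normalizations concrete, here is a minimal NumPy sketch. It is not the library's actual implementation. It assumes a batch of 3 sequences padded to 4 steps with valid lengths `[4, 2, 0]` and a per-token loss of 2.30126 everywhere, as in the numbers above; `masked_losses` is a hypothetical helper name.

```python
import numpy as np

def masked_losses(per_token_loss, valid_len):
    """Return (mean over padded length, mean over valid length) per sequence.

    Hypothetical helper for illustration only.
    """
    per_token_loss = np.asarray(per_token_loss, dtype=float)
    valid_len = np.asarray(valid_len)
    num_steps = per_token_loss.shape[1]
    # mask[i, t] is 1.0 where t < valid_len[i], and 0.0 on padding positions
    mask = (np.arange(num_steps)[None, :] < valid_len[:, None]).astype(float)
    masked_sum = (per_token_loss * mask).sum(axis=1)
    # Option A: divide by the full padded length (padding counts in the denominator)
    padded_mean = masked_sum / num_steps
    # Option B: divide by the valid length; guard against valid_len == 0
    valid_mean = masked_sum / np.maximum(valid_len, 1)
    return padded_mean, valid_mean

# Assumed inputs: every position has loss 2.30126, valid lengths are [4, 2, 0]
per_token = np.full((3, 4), 2.30126)
padded_mean, valid_mean = masked_losses(per_token, [4, 2, 0])
print("divided by padded length:", padded_mean)
print("divided by valid length: ", valid_mean)
```

Under these assumptions, only option B gives the same value for the first two sequences; option A halves the second one because its two padding positions still count in the denominator.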





