Shouldn't the masked softmax loss output be [2.30126, 2.30126, 0]? The first sequence has 4 valid elements, each with a loss of 2.30126, and the second sequence has 2 valid elements, each with a loss of 2.30126. Dividing each sum by its valid length gives 2.30126 * 4 / 4 = 2.30126 and 2.30126 * 2 / 2 = 2.30126. It seems to me that the sum is instead divided by 4, the full length including padding, which makes no sense to me!
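
To make the arithmetic concrete, here is a minimal plain-Python sketch of the two normalizations being compared. It does not call the library. The per-token loss of 2.30126 is taken from the numbers above, and the padded length of 4, the valid lengths [4, 2, 0], and all function names are assumptions made for illustration.

```python
# Per-token loss quoted in the post, treated here as a given constant.
PER_TOKEN_LOSS = 2.30126
MAX_LEN = 4              # padded sequence length (assumed)
VALID_LENS = [4, 2, 0]   # valid lengths of the three sequences (assumed)


def masked_losses(valid_len, max_len=MAX_LEN, per_token=PER_TOKEN_LOSS):
    """Per-token losses with the padded positions zeroed out."""
    return [per_token if t < valid_len else 0.0 for t in range(max_len)]


def mean_over_padded_length(valid_len):
    """Sum of masked losses divided by the full padded length."""
    return sum(masked_losses(valid_len)) / MAX_LEN


def mean_over_valid_length(valid_len):
    """Sum of masked losses divided by the number of valid tokens."""
    if valid_len == 0:
        return 0.0  # nothing to average; avoid dividing by zero
    return sum(masked_losses(valid_len)) / valid_len


by_padded = [mean_over_padded_length(v) for v in VALID_LENS]
by_valid = [mean_over_valid_length(v) for v in VALID_LENS]

print("divide by padded length:", by_padded)  # roughly [2.30126, 1.15063, 0.0]
print("divide by valid length: ", by_valid)   # roughly [2.30126, 2.30126, 0.0]
```

Under these assumptions the two schemes differ only for the second sequence: dividing by the padded length halves its loss, while dividing by the valid length leaves it equal to the per-token value.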
