Dear Moses Community,

This seems to be a prickly topic to discuss, but my experiments on a
different kind of data set than WMT or WAT (Workshop on Asian Translation)
have not been able to achieve the stellar scores that recent advances in MT
have been reporting.

Using the state-of-the-art encoder-attention-decoder framework, just by
running things like lamtram or tensorflow, I'm unable to beat Moses' scores
on sentences that appear in both the train and test data.

Imagine a translator using MT who has somehow translated the sentence
before and just wants the exact translation back. A TM would solve the
problem, and Moses surely could emulate the TM, but NMT tends to go overly
creative and produces something else. Although it is consistent in giving
the same output for the same sentence, it's just unable to regurgitate the
translation that was seen in the training data. In that regard, Moses does
it pretty well.

For sentences that are not in the training data but are in the test set,
NMT does about the same as, or sometimes better than, Moses.

So the question is: has anyone encountered similar problems? Is the
solution simply to do an exact-match lookup in the training set before
translating? Or a system/output chooser to rerank outputs?

Are there any other ways to resolve such a problem? What could have
happened such that NMT is not "remembering"? (Maybe it needs some
memberberries.)

Any tips/hints/discussion on this is much appreciated.

Regards,
Nat
_______________________________________________
Moses-support mailing list
[email protected]
http://mailman.mit.edu/mailman/listinfo/moses-support