Dear Moses Community, In recent papers, many BLEU scores have been reported for ensembles of neural machine translation systems. I would like to ask whether anyone knows how these ensembles are created.
Is it some sort of averaged pooling layer at the end? Is it some sort of voting across multiple systems at every decoding time step? Any pointers to papers describing this magical ensemble would be great =) Most papers just say "we ensemble, we beat Moses." Are there cases where a single model beats Moses on a normal translation task without ensembling? Regards, Nat
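(For what it's worth, the second guess is close to common practice: several independently trained models decode in lockstep, and at each time step their predicted next-token distributions are averaged before the beam search picks candidates. A minimal sketch, with a toy hypothetical `ensemble_step` function and made-up probabilities, not code from any real toolkit:)

```python
# Hypothetical sketch of decode-time ensembling: average the per-step
# next-token probability distributions of several models over a shared
# vocabulary. All names and numbers here are illustrative assumptions.
def ensemble_step(distributions):
    """Average per-model distributions (each a list of P(token))."""
    n_models = len(distributions)
    vocab_size = len(distributions[0])
    return [sum(d[i] for d in distributions) / n_models
            for i in range(vocab_size)]

# Two toy models disagree about the best next token:
p_model_a = [0.7, 0.2, 0.1]
p_model_b = [0.1, 0.6, 0.3]
avg = ensemble_step([p_model_a, p_model_b])  # [0.4, 0.4, 0.2]
# Beam search would then select hypotheses from this averaged
# distribution, repeating the averaging at every decoding step.
```

(Some systems average log-probabilities instead of probabilities; both variants appear in the literature.)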
_______________________________________________
Moses-support mailing list
[email protected]
http://mailman.mit.edu/mailman/listinfo/moses-support
