Hi list,

I have added two phrase table types to Moses, internally called MultiModel and MultiModelCounts.
These table types construct a virtual phrase table online from a vector of component models.

MultiModel currently supports only linear interpolation of the probabilities in the component models, but new combination algorithms can be added. In principle, the combination is equivalent to the linear interpolation performed by the tmcombine scripts (already included in Moses). However, because the combination/weighting is performed during decoding, a single phrase table can be adapted to multiple domains simply by providing different weights for each domain. Weights can be set for each sentence through an API hook to mosesserver.

MultiModelCounts follows the same principle of constructing a virtual phrase table online, but does so on the basis of sufficient statistics, i.e. phrase and phrase pair frequencies. In its unweighted form, it corresponds to a model trained on the concatenation of all training data (except for differences in word alignment, pruning and rounding); in addition, the frequencies of each component model can be weighted online.

Possible uses of the two phrase table types include quick domain adaptation in a multi-domain environment, and the quick addition/removal of models (without the need to retrain or run offline scripts). Applications in interactive machine translation are also conceivable (e.g. using a component model vector that is a mix of static background models and incrementally updateable ones).

The architecture and a possible application are described in an upcoming ACL paper:

Sennrich, Rico; Schwenk, Holger; Aransa, Walid (2013). A Multi-Domain Translation Model Framework for Statistical Machine Translation. Proceedings of ACL 2013.

Documentation is available at http://www.statmt.org/moses/?n=Moses.AdvancedFeatures#ntoc50

Rico
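P.S. To make the two combination schemes concrete, here is a minimal Python sketch. It is not the Moses implementation: the function names (interpolate_linear, combine_counts), the dict-based table layout and the toy numbers are all assumptions made for illustration. It also simplifies MultiModelCounts by deriving the source phrase frequency from the phrase pair counts, and it covers only one direction, p(target|source).

```python
from collections import defaultdict


def interpolate_linear(models, weights):
    """MultiModel-style combination (illustrative only).

    models:  list of dicts mapping (source, target) -> p(target|source)
    weights: one non-negative weight per component model
    A phrase pair missing from a component contributes probability 0.
    """
    if len(models) != len(weights):
        raise ValueError("need exactly one weight per component model")
    total = sum(weights)
    if total <= 0:
        raise ValueError("weights must sum to a positive value")
    combined = defaultdict(float)
    for model, weight in zip(models, weights):
        lam = weight / total  # normalise so the weights sum to 1
        for pair, prob in model.items():
            combined[pair] += lam * prob
    return dict(combined)


def combine_counts(pair_counts, weights):
    """MultiModelCounts-style combination (illustrative only).

    pair_counts: list of dicts mapping (source, target) -> frequency
    weights:     one non-negative weight per component model
    Assumption: the source frequency is the sum of the pair frequencies
    for that source phrase; a real table may store it separately.
    """
    if len(pair_counts) != len(weights):
        raise ValueError("need exactly one weight per component model")
    joint = defaultdict(float)     # weighted count of (source, target)
    marginal = defaultdict(float)  # weighted count of source
    for counts, weight in zip(pair_counts, weights):
        for (source, target), freq in counts.items():
            joint[(source, target)] += weight * freq
            marginal[source] += weight * freq
    return {
        pair: freq / marginal[pair[0]]
        for pair, freq in joint.items()
        if marginal[pair[0]] > 0
    }


if __name__ == "__main__":
    # Hypothetical toy components: a small in-domain and a large general model.
    in_domain = {("haus", "house"): 8, ("haus", "home"): 2}
    general = {("haus", "house"): 50, ("haus", "building"): 50}

    # Weights of 1 behave like training on the concatenated data.
    print(combine_counts([in_domain, general], [1, 1]))
    # Up-weighting the in-domain counts shifts mass towards its translations.
    print(combine_counts([in_domain, general], [10, 1]))
```

Because the weights are only an argument of the combination function, the same component tables can serve several domains, which is the point of doing the combination at decoding time rather than offline.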
