Hi all,

I would like to combine two phrase-based engines. The first, smaller one is trained on data from the desired domain, but the corpus is limited in size. The second is a legacy engine with a huge phrase table and LM, but with older, partly obsolete terminology. The idea is to combine the two: preserve the domain and language style of the first engine, and reduce OOVs by falling back to the second.
I think what I am looking for is the back-off model: use the small phrase table as the preferred one, and fall back to the second only for phrases that are not found in it. I have set up such a config in accordance with http://www.statmt.org/moses/?n=Moses.AdvancedFeatures#ntoc25.

    [feature]
    PhraseDictionaryCompact name=A
    PhraseDictionaryCompact name=BackOff

    [mapping]
    0 T 0
    1 T 1

    [decoding-graph-backoff]
    0
    1

    [weight]
    A = 0 0 0 0
    BackOff = 0 0 0 0

This seems to work (the weights were tuned afterwards with MERT). I have also read http://comments.gmane.org/gmane.comp.nlp.moses.user/10099. However, that thread does not mention how combining the LMs should be handled. I can add both LMs to the ini file and run weight tuning, or is it better to set the weights manually?

My assumption is that the phrase table back-off ensures that model A's terminology is preferred, while combining both LMs makes the translation smoother, since it can benefit from the second, bigger LM. Could you please correct my assumptions? I hope the explanation makes some sense.

Thank you very much,
Tomas
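For concreteness, "adding both LMs to the ini file" could look roughly like the sketch below. It assumes both LMs are KenLM models; the feature names (LM0, LM1), the paths, the n-gram orders, and the starting weights are placeholders made up for illustration and are not taken from the actual setup.

```ini
[feature]
# Hypothetical in-domain LM (small), loaded via KenLM
KENLM name=LM0 factor=0 path=/path/to/in-domain.binlm order=5
# Hypothetical legacy LM (large), loaded via KenLM
KENLM name=LM1 factor=0 path=/path/to/legacy.binlm order=5

[weight]
# Placeholder starting values; tuning would normally replace them
LM0= 0.5
LM1= 0.5
```

With this layout each LM is a separate feature with its own weight, so tuning can decide how much each one contributes rather than the balance being fixed by hand.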
