Hi all,

I would like to combine two phrase-based engines. The first, smaller one is
trained on in-domain data, but with a limited corpus size. The second is a
legacy engine with a huge phrase table and LM, but with older, partly obsolete
terminology. The idea is to combine the two so as to preserve the domain and
language style of the first engine while reducing OOVs by falling back on the
second engine.

I think what I am looking for is a back-off model: use the small model as the
preferred one, and fall back to the second one for phrases that are not found.
I have set up such a config following
http://www.statmt.org/moses/?n=Moses.AdvancedFeatures#ntoc25:

[feature]
PhraseDictionaryCompact name=A
PhraseDictionaryCompact name=BackOff

[mapping]
0 T 0
1 T 1

[decoding-graph-backoff]
0
1

[weight]
A = 0 0 0 0
BackOff = 0 0 0 0

And it seems to work (the weights were tuned afterwards with MERT).

I have also read http://comments.gmane.org/gmane.comp.nlp.moses.user/10099.
However, it does not mention how the combination of the LMs should be handled.
Should I add both LMs to the ini file and tune their weights, or is it better
to set the weights manually? My assumption is that the phrase table back-off
ensures that model A's terminology is preferred, while combining both LMs makes
the translation smoother, since it can benefit from the second, bigger LM.
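
To make the first option concrete, this is roughly what I mean by adding both
LMs to the ini file. It is only a sketch: the names DomainLM and LegacyLM, the
paths and the order are placeholders (I am assuming KenLM models here), and
the weights are just starting values that tuning would overwrite.

[feature]
KENLM name=DomainLM factor=0 order=5 path=/path/to/domain.lm.bin
KENLM name=LegacyLM factor=0 order=5 path=/path/to/legacy.lm.bin

[weight]
DomainLM= 0.5
LegacyLM= 0.5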

Could you please correct my assumptions? I hope the explanation makes
sense.

Thank you very much,

Tomas
_______________________________________________
Moses-support mailing list
[email protected]
http://mailman.mit.edu/mailman/listinfo/moses-support