Hi Jörg,

> (2) about factored models: is it possible to train generation steps from
> other (monolingual) corpora (not the bitext) with Moses? Or would I have
> to implement this myself? Should I expect a (significant) gain like when
> using larger LM's? Anyone who tried that already?

Just run train-factored-phrase-model.perl separately with a different corpus name and the options --first-step 8 --last-step 8 to estimate the generation model on the additional data, then change your moses.ini file from the main run to point to the new generation model.

The SL part of the corpus is not used in generation model creation, but you may have to create a dummy source language corpus file to get around the checks in the scripts (not sure about that).

In one experiment with a very small parallel corpus (9,000 sentences), I obtained quite an impressive improvement by adding in a generation model trained on 900,000 monolingual sentences, so this might be worth doing. I imagine that the gains will be smaller if you use a more realistically sized parallel base corpus, though.

Cheers,
Christian

_______________________________________________
Moses-support mailing list
http://mailman.mit.edu/mailman/listinfo/moses-support
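For anyone wanting to script the step-8-only run described above, here is a minimal sketch in Python. It only assembles the command line and creates the dummy source-side file; it does not run Moses. The flags --first-step and --last-step come from the message above. The flags --root-dir, --corpus, --f, --e and --generation-factors are what I would expect the training script to accept, so treat them as assumptions and check them against your installed version. All paths, the file extensions and the factor mapping "0-1" are hypothetical placeholders.

```python
import shutil
from pathlib import Path


def make_dummy_source(corpus_stem, src_ext, tgt_ext):
    """Create a stand-in source-side file by copying the target side.

    The source side is not used for the generation model, so the copy
    only serves to satisfy file checks (an assumption: the checks may
    not even require it).
    """
    target = Path(f"{corpus_stem}.{tgt_ext}")
    dummy = Path(f"{corpus_stem}.{src_ext}")
    if not dummy.exists():
        shutil.copyfile(target, dummy)
    return dummy


def build_generation_step_command(script, corpus_stem, src_ext, tgt_ext,
                                  root_dir, generation_factors):
    """Return the argv list for a run restricted to training step 8."""
    return [
        "perl", str(script),
        "--root-dir", str(root_dir),          # assumed flag: separate output dir
        "--corpus", str(corpus_stem),         # assumed flag: corpus name stem
        "--f", src_ext,                       # assumed flag: source extension
        "--e", tgt_ext,                       # assumed flag: target extension
        "--generation-factors", generation_factors,  # assumed flag and value
        "--first-step", "8",                  # from the message above
        "--last-step", "8",                   # from the message above
    ]


if __name__ == "__main__":
    # Hypothetical paths; nothing is executed, the command is only printed.
    cmd = build_generation_step_command(
        script="train-factored-phrase-model.perl",
        corpus_stem="mono/corpus",
        src_ext="src",
        tgt_ext="tgt",
        root_dir="generation-only",
        generation_factors="0-1",
    )
    print(" ".join(cmd))
```

Using a separate output directory keeps the files of the main run untouched, so the only change to the main setup is the generation model path in moses.ini.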
