Dear Joerg,

As for the symmetrization heuristics: yes, they are not symmetric. I haven't studied the details carefully, but you would have to use plain union or intersection to obtain the same alignments in both directions (and thus the same lexical translation tables).
As for (2), I wanted to extend the train-factored-... script for this, but I haven't done it yet. The workaround suggested by Christian should work, of course. Your question is a good incentive anyway! I'll let you know when I'm finished (or if I move on to something different ;-).

Ondrej.

Christian Hardmeier wrote:
> Hi Jörg,
>
>> (2) about factored models: is it possible to train generation steps from
>> other (monolingual) corpora (not the bitext) with Moses? Or would I have
>> to implement this myself? Should I expect a (significant) gain like when
>> using larger LMs? Anyone who tried that already?
>
> Just run train-factored-phrase-model.perl separately with a different
> corpus name and the options --first-step 8 --last-step 8 to estimate the
> generation model on the additional data, then change your moses.ini file
> from the main run to point to the new generation model. The source-language
> part of the corpus is not used in generation model creation, but you may
> have to create a dummy source-language corpus file to get around the
> checks in the scripts (not sure about that).
>
> In one experiment with a very small parallel corpus (9,000 sentences), I
> obtained quite an impressive improvement by adding in a generation model
> trained on 900,000 monolingual sentences, so this might be worth doing.
> I imagine that the gains will be smaller if you use a more realistically
> sized parallel base corpus, though.
>
> Cheers,
> Christian

_______________________________________________
Moses-support mailing list
[email protected]
http://mailman.mit.edu/mailman/listinfo/moses-support
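P.S. For anyone following along, Christian's recipe might look roughly like the sketch below. Only --first-step 8 --last-step 8 come from his mail; the corpus name, factor flags, and directory names are hypothetical placeholders, so copy the remaining options from your usual training invocation rather than from here.

```shell
# Estimate ONLY the generation model (step 8) on the larger
# monolingual corpus. All paths/flags besides --first-step/--last-step
# are illustrative placeholders.
train-factored-phrase-model.perl \
    --corpus corpus/mono-target \
    --f src --e trg \
    --root-dir gen-only \
    --first-step 8 --last-step 8

# Then edit the moses.ini produced by the main bilingual run so that
# its generation-model entry points at the table under gen-only/
# instead of the one estimated from the small bitext.
```

As Christian notes, step 8 ignores the source side, but the script's sanity checks may still expect a source file to exist, so a dummy file with the right name may be needed.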
