Dear all,
I want to train a morphological analysis and generation model with Moses,
to be used for translation from English to German.
I have prepared my training data like this:
% tail -n 1 factored-corpus/proj-syndicate.??
==> factored-corpus/proj-syndicate.en <==
corruption|corruption|nn flourishes|flourish|nns .|.|.
==> factored-corpus/proj-syndicate.de <==
korruption|korruption|nn|nn.fem.cas.sg floriert|florieren|vvfin|vvfin .|.|per|per
Each word is represented not only by its surface form, but also by
additional factors.
On both the English and the German side, the factors are surface form, lemma,
part of speech, and morphology (morphy).
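
As a quick sanity check on this format, here is a minimal sketch in Python (not part of Moses) that splits each token on "|" and reports how many factors it carries. The helper names split_factors and factor_counts are hypothetical, and the two sample strings are copied from the tail output above.

```python
# Hypothetical helper names; this is an illustration, not a Moses tool.

def split_factors(token, sep="|"):
    """Split one factored token, e.g. 'floriert|florieren|vvfin|vvfin'."""
    return token.split(sep)


def factor_counts(line):
    """Return the set of factor counts seen among the tokens of one line."""
    return {len(split_factors(tok)) for tok in line.split()}


# Sample lines copied from the corpus excerpt in this message.
en = "corruption|corruption|nn flourishes|flourish|nns .|.|."
de = ("korruption|korruption|nn|nn.fem.cas.sg "
      "floriert|florieren|vvfin|vvfin .|.|per|per")

print(factor_counts(en))  # {3}
print(factor_counts(de))  # {4}
```

On these two sample lines the English tokens carry three factors and the German tokens four, so the factor indices used in the mapping steps may need to be checked against each side separately.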
Now I would like to know the best way to design the mapping steps for
training the factored translation model. Can you help me?
BTW, I have designed a total of four mapping steps, shown below for your
reference:
% train-model.perl \
--corpus factored-corpus/…… \
--root-dir morphgen \
--f de --e en \
--lm 0:3:factored-corpus/surface.lm:0 \
--lm 2:3:factored-corpus/pos.lm:0 \
--translation-factors 1-1+2-2,3 \
--generation-factors 1-2+1,2,3-0 \
--decoding-steps t0,g0,t1,g1 \
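
To make the four mapping steps easier to read, here is a minimal sketch in Python that expands the --translation-factors and --generation-factors strings into named steps. It assumes the usual reading of the syntax ("+" separates steps, "-" separates input from output factors, "," separates factor indices) and the factor order surface, lemma, part of speech, morphology described in this message. The names parse_mapping, describe and FACTORS are hypothetical and not part of Moses.

```python
# Assumed factor order (index 0..3), taken from the description in this message.
FACTORS = ["surface", "lemma", "pos", "morph"]


def parse_mapping(spec):
    """Parse a spec such as '1-1+2-2,3' into [([1], [1]), ([2], [2, 3])]."""
    steps = []
    for step in spec.split("+"):          # '+' separates mapping steps
        src, tgt = step.split("-")        # '-' separates input from output
        steps.append(([int(i) for i in src.split(",")],
                      [int(i) for i in tgt.split(",")]))
    return steps


def describe(spec, kind):
    """Return one readable line per step; kind is 't' or 'g'."""
    lines = []
    for n, (src, tgt) in enumerate(parse_mapping(spec)):
        s = ",".join(FACTORS[i] for i in src)
        t = ",".join(FACTORS[i] for i in tgt)
        lines.append(f"{kind}{n}: {s} -> {t}")
    return lines


for line in describe("1-1+2-2,3", "t") + describe("1-2+1,2,3-0", "g"):
    print(line)
# t0: lemma -> lemma
# t1: pos -> pos,morph
# g0: lemma -> pos
# g1: lemma,pos,morph -> surface
```

Read this way, the decoding steps t0,g0,t1,g1 translate the lemma, generate the part of speech from the lemma, translate the part of speech into part of speech plus morphology, and finally generate the surface form from lemma, part of speech and morphology.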
The design above follows the Tutorial for Using Factored Models on the
website:
http://www.statmt.org/moses/?n=Moses.FactoredTutorial#ntoc4
Your kind suggestions will be greatly appreciated!
Best Regards
Henry