Steven Huang <d98922047@...> writes:

> Hi,
> I am trying to train an English-to-Chinese tree-to-tree model with a manually generated corpus. The translation is unacceptable. It seems that the model doesn't know reordering at all. So I looked into the rule-table, and there is no useful rule in it (see the attached file "rule-table.gz").

Hi Steven,

the default rule extraction parameters are only suitable for unlabelled hierarchical models. For syntactic models, you probably want to set (at least) these parameters:

    -extract-options="--NonTermConsecSource --MinHoleSource 1"

This allows nonterminals that span only a single word (MinHoleSource 1) and consecutive nonterminals (NonTermConsecSource). If you use the latter, you might want to look into scope-3 pruning of your grammar to keep decoding complexity low:

http://www.aclweb.org/anthology/D10-1063.pdf

Some other options to consider:

    --MinWords 0 (to allow non-lexical rules)
    --MaxNonTerm SIZE (to allow SIZE nonterminals per rule (default 2))

I haven't built tree-to-tree systems myself, so I can't say which settings work best.

best wishes,
Rico
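
To illustrate why consecutive nonterminals raise decoding complexity, here is a rough Python sketch of the "scope" measure as I understand it from the paper linked above: one point for each nonterminal at the start or end of the source side, plus one for each pair of adjacent nonterminals. This is not Moses code. The function names and the convention that a nonterminal is a token wrapped in square brackets (e.g. "[X]") are assumptions made for the example.

```python
def is_nonterminal(symbol):
    # Assumed convention for this sketch: nonterminals look like "[X]" or "[NP]".
    return len(symbol) > 2 and symbol.startswith("[") and symbol.endswith("]")


def rule_scope(source_symbols):
    """Count the ambiguous boundaries on the source side of a rule.

    A nonterminal at either edge of the rule adds one, and every pair of
    adjacent nonterminals adds one.
    """
    if not source_symbols:
        return 0
    flags = [is_nonterminal(s) for s in source_symbols]
    scope = int(flags[0]) + int(flags[-1])
    scope += sum(1 for left, right in zip(flags, flags[1:]) if left and right)
    return scope


def prune_to_scope(rules, max_scope=3):
    # Keep only rules whose source side has scope <= max_scope.
    return [rule for rule in rules if rule_scope(rule) <= max_scope]


if __name__ == "__main__":
    examples = [
        "the [X] of".split(),    # nonterminal surrounded by words
        "a [X] [X] b".split(),   # one adjacent pair, lexical edges
        "[X] [X]".split(),       # adjacent pair plus both edges
        "[X] [X] [X]".split(),   # two adjacent pairs plus both edges
    ]
    for rule in examples:
        print(" ".join(rule), "-> scope", rule_scope(rule))
```

With --NonTermConsecSource, rules like the last two become possible. Under this definition the last one has scope 4, so a scope-3 filter would drop it.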
