Steven Huang <d98922047@...> writes:

> Hi,
> I am trying to train an English to Chinese tree-to-tree model with
> manually generated corpus. The translation is unacceptable. It seems that
> the model doesn't know reordering at all. So I look into the rule-table,
> there is no useful rule in it (see the attached file "rule-table.gz").

Hi Steven,

The default rule extraction parameters are only suitable for unlabelled
hierarchical models. For syntactic models, you probably want to set (at
least) these parameters:

-extract-options="--NonTermConsecSource --MinHoleSource 1"

This allows nonterminals that span only a single word (--MinHoleSource 1)
and consecutive nonterminals on the source side (--NonTermConsecSource). If
you use the latter, you might want to look into scope-3 pruning of your
grammar to keep decoding complexity low:
http://www.aclweb.org/anthology/D10-1063.pdf
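
To make the scope-3 idea concrete, here is a minimal sketch in Python (not
Moses code). It assumes the definition of scope as I understand it from the
paper linked above: count one for a nonterminal at the start of the source
side, one for a nonterminal at the end, and one for every pair of adjacent
nonterminals. The function name, the rule representation, and the label set
are all made up for illustration.

```python
def rule_scope(source_side, nonterminals):
    """Return the scope of a rule's source side (list of symbols).

    Assumed definition: nonterminal at the left edge (+1), nonterminal at
    the right edge (+1), plus one per pair of adjacent nonterminals.
    """
    is_nt = [symbol in nonterminals for symbol in source_side]
    if not is_nt:
        return 0
    scope = 0
    if is_nt[0]:
        scope += 1  # left boundary is not anchored by a word
    if is_nt[-1]:
        scope += 1  # right boundary is not anchored by a word
    # Each pair of adjacent nonterminals leaves one split point free.
    scope += sum(1 for a, b in zip(is_nt, is_nt[1:]) if a and b)
    return scope


# Hypothetical label set and rules, for illustration only.
LABELS = {"NP", "VP", "PP"}
rules = [
    ["the", "NP", "VP", "."],  # scope 1
    ["NP", "VP"],              # scope 3
    ["NP", "VP", "PP"],        # scope 4, would be pruned
]
kept = [r for r in rules if rule_scope(r, LABELS) <= 3]
```

With consecutive nonterminals allowed, rules like the last one become
possible, which is why pruning by scope is worth considering.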

Some other options to consider:

--MinWords 0 (to allow non-lexical rules)
--MaxNonTerm SIZE (to allow SIZE nonterminals per rule; default 2)
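
For reference, a small sketch of how these options could be combined into a
single -extract-options argument. It only builds and prints the string and
does not call Moses. The helper name and its parameters are hypothetical,
and the value 3 for --MaxNonTerm is just an example, not a recommendation.

```python
import shlex


def extract_options_arg(max_nonterm=None, allow_non_lexical=False):
    """Build the -extract-options argument from the options in this mail."""
    # Minimum suggested for syntactic models.
    opts = ["--NonTermConsecSource", "--MinHoleSource", "1"]
    if allow_non_lexical:
        opts += ["--MinWords", "0"]  # allow rules without any words
    if max_nonterm is not None:
        opts += ["--MaxNonTerm", str(max_nonterm)]
    # Quote the whole option string so the shell passes it as one value.
    return "-extract-options=" + shlex.quote(" ".join(opts))


print(extract_options_arg(max_nonterm=3, allow_non_lexical=True))
```

The printed string can then be added to the training command line.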

I haven't built tree-to-tree systems myself, so I can't say which settings
work best.

best wishes,
Rico

_______________________________________________
Moses-support mailing list
[email protected]
http://mailman.mit.edu/mailman/listinfo/moses-support
