Hi,

I am trying to train an English-to-Chinese tree-to-tree model on a manually
generated corpus. The translation output is unacceptable; it seems the model
does not learn any reordering. I looked into the rule table, and there is no
useful rule in it (see the attached file "rule-table.gz").
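
(For a quick look without downloading the attachment, something like the
following prints the first extracted rules and the total rule count; the path
is just the attached file name:)

zcat rule-table.gz | head -n 20
zcat rule-table.gz | wc -l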

Here is how I generated the training corpus (see the attached files "tree.en"
and "tree.ch"):
The basic sentence is "What should Steven do?"
I made all combinations using the following rules (a rough sketch of the
generation step follows the list):
1. "What" is replaced by "When" or "What".
2. "should" is replaced by "can" or "should".
3. "Steven" is replaced by 10 different names.
4. "do" is replaced by "say", "think", or "do"

The train-model command I used is shown below:
$MOSES_DIR/scripts/training/train-model.perl \
--root-dir train \
--mgiza \
--mgiza-cpus 20 \
--corpus $WORK_DIR/tree \
--f en \
--e ch \
--lm 0:3:$LANG_MOD_DIR/utf8/en-ch.blm.ch:8 \
--source-syntax \
--target-syntax \
--glue-grammar \
--max-phrase-length 10 \
--alignment grow-diag-final-and \
--external-bin-dir $MOSES_DIR/tools

Does anyone have any suggestions?

Best,

Steven Huang

Attachment: rule-table.gz
Description: GNU Zip compressed data

Attachment: tree.ch
Description: Binary data

Attachment: tree.en
Description: Binary data
