Hi

I am interested in trying syntax models, and I have read the
"syntax tutorial" section in the manual, but I don't really understand
how it works. I guess it would be easier with an example, but I don't
understand how to use the files in the sample models archive either
(what are the .dat files in the "rules" directory?). If I want to train
my own model, I must provide a syntactically annotated parallel
corpus. So, if I start from just a parallel corpus, I would first need
to run, for example, a POS tagger, then a Collins parser, then the
wrapper script provided, and finally call train-model.perl with
--{source,target}-syntax?

I tried with a dummy corpus containing just this:
<tree label="PN"> das </tree> <tree label="V"> ist </tree> <tree label="NP"> <tree label="DET"> ein </tree> <tree label="ADJ"> kleines </tree> <tree label="NN"> haus </tree> </tree>
(and something similar in English)
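
To make the markup concrete, here is a minimal Python sketch of how a bracketed parse could be turned into the <tree label="..."> format used above. This is only an illustration of the format as I understand it: the function name bracketed_to_tree_markup and the bracketed input convention are assumptions for this example, and this is not the wrapper script that ships with the toolkit.

```python
import re


def bracketed_to_tree_markup(parse: str) -> str:
    """Convert a bracketed parse such as "(NP (DET ein) (NN haus))"
    into space-separated <tree label="..."> ... </tree> markup.

    Hypothetical helper for illustration only; labels are not escaped.
    """
    # Split into "(", ")", and runs of non-space, non-parenthesis characters.
    tokens = re.findall(r"\(|\)|[^\s()]+", parse)
    out = []
    expect_label = False  # True right after "(": the next token is a label
    for tok in tokens:
        if tok == "(":
            expect_label = True
        elif tok == ")":
            out.append("</tree>")
        elif expect_label:
            out.append(f'<tree label="{tok}">')
            expect_label = False
        else:
            out.append(tok)  # an ordinary word
    return " ".join(out)


if __name__ == "__main__":
    print(bracketed_to_tree_markup("(NP (DET ein) (ADJ kleines) (NN haus))"))
```

Applied to "(PN das) (V ist) (NP (DET ein) (ADJ kleines) (NN haus))", this reproduces the corpus line shown above.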

I called train-model.perl like this:
train-model.perl --corpus testfile -f de -e en -lm 0:3:europarl.srilm.gz --source-syntax --target-syntax
and got this error:
mkcls: StatVar.cpp:116: double StatVar::quantil(double): Assertion `index>=0&&index<n' failed
Obviously I'm doing something wrong, but I don't know what.

By the way, is train-model.perl only in branches/mt3_chart, and not in trunk?

So, to summarize: if someone could expand the syntax tutorial and
include a simple training example, I'd be grateful.

Best regards,

-- 
Raphael
_______________________________________________
Moses-support mailing list
[email protected]
http://mailman.mit.edu/mailman/listinfo/moses-support
