Hello,

I would like to start using factored (POS-tagged) models instead of
unfactored ones, so I tried to follow the tutorial at
http://www.statmt.org/moses/?n=Moses.FactoredTutorial.

I downloaded the sample factored-corpus, but instead of using the
provided SRILM language models I wanted to use KenLM, so I proceeded
as follows:

lmplz -o 5 < ../factored-corpus/proj-syndicate.1000.en \
      > ../factored-corpus/kenlm/proj-syndicate.en.1000.arpa
lmplz -o 4 < ../factored-corpus/proj-syndicate.1000.de \
      > ../factored-corpus/kenlm/proj-syndicate.de.1000.arpa

train-model.perl --root-dir pos-kenlm-small \
                 --corpus factored-corpus/proj-syndicate.1000 \
                 --f de --e en \
                 --lm 2:5:/home/moses/mt/moses3/factored-corpus/kenlm/proj-syndicate.en.1000.arpa:8 \
                 --translation-factors 0-0,2 \
                 -mgiza \
                 --external-bin-dir ./training-tools
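
For reference, the first field of --lm above (2) selects the factor
that the language model scores. Below is a minimal sketch of pulling a
single factor out of a factored line, e.g. to inspect what that factor
looks like in the corpus. It assumes space-separated tokens whose
factors are joined by "|" and a zero-based factor index; the helper
name extract_factor and the sample line are made up for illustration
and are not part of Moses or the tutorial.

```shell
# Hypothetical helper: print one factor (zero-based index) of every token.
# Assumes tokens are space-separated and factors are joined by "|".
extract_factor() {
  awk -v idx="$1" '{
    out = ""
    for (i = 1; i <= NF; i++) {
      n = split($i, f, "[|]")
      # Keep the token unchanged if it has too few factors.
      val = (idx + 1 <= n) ? f[idx + 1] : $i
      out = out (i > 1 ? " " : "") val
    }
    print out
  }'
}

# Made-up sample line with three factors per token.
echo "putin|putin|nnp describes|describe|vbz people|people|nns" | extract_factor 2
```

If a language model over only that factor is wanted, the output could
be piped to lmplz in the same way as the corpus files above; whether
that is needed here is an assumption.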

I am able to run the decoder:

echo "putin beschreibt menschen ." | moses -f pos-kenlm-small/model/moses.ini
BEST TRANSLATION: putin|nnp describes|vbz people|nns

Next, I wanted to verify for myself that the factored model can handle
a reordered input sentence once the reordering model is downweighted,
just as in the tutorial mentioned above:

echo "menschen beschreibt putin ." | moses -f pos-kenlm-small/model/moses.ini -dl -1
BEST TRANSLATION: people|nns describes|vbz putin|nnp

In the tutorial, a better translation is returned ("putin describes
people"). Note that instead of the "-d 0.2" option mentioned in the
tutorial, I used "-dl -1" to downweight the reordering model, because
"-d" is no longer supported. I am not sure whether that is a correct
substitute.

Thank you for any advice.

Best regards,
Stanislav Kurik


