Hi all,

I am currently doing some statistical MT research and would like to
focus on factored models. After a struggle to get my corpus (a subset
of Europarl EN/NL) to work with Moses (spacing, correct factor
generation, charset, etc.), Moses is finally able to train a
language model on it. However, when decoding, Moses quickly dies
with the following message:

---
Finished loading phrase tables : [14.000] seconds
IO from STDOUT/STDIN
Created input-output object : [14.000] seconds
Translating: ik|Pron|ik loop|V|lop over|Prep|over de|Art|de
straat|N|strat .|Punc|.

moses: LanguageModelSRI.cpp:154: virtual float
LanguageModelSRI::GetValue(const std::vector<const Word*,
std::allocator<const Word*> >&, const void**, unsigned int*) const:
Assertion `(*contextFactor[count-1])[factorType] != __null' failed.
Aborted
---

I am using the i386 binaries from the Ubuntu-NLP archive:
- Moses 20080525svn-1nlp3~0gutsy1
- GIZA++ 2.0.20030930gcc41-3nlp1~0gutsy1
- SRILM 1.5.6-1nlp1~0gutsy1

I trained using the following command line:

train-factored-phrase-model.perl \
        --root-dir . \
        --f nl --e en \
        --corpus corpus/euro \
        --alignment-factors 0,1,2-0,1,2 \
        --translation-factors 1-1+2-2 \
        --generation-factors 2,1-0 \
        --lm 0:3:corpus/surface.lm:0 \
        --lm 1:3:corpus/pos.lm:0 \
        --lm 2:3:corpus/stem.lm:0
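
For reference, a small sketch of how the factor options above could be
cross-checked. It assumes my reading of the option syntax is right:
steps separated by "+", input and output factors separated by "-",
and each --lm value laid out as factor:order:file:type. The helper
names are made up for illustration and are not part of Moses.

```python
def output_factors(mapping):
    """Collect the output factors named in a spec such as '1-1+2-2'.

    Assumed syntax: steps joined by '+', each step 'inputs-outputs',
    with factors inside a side separated by ','.
    """
    produced = set()
    for step in mapping.split("+"):
        _inputs, outputs = step.split("-")
        produced.update(int(f) for f in outputs.split(","))
    return produced


def lm_factor(spec):
    """Return the factor index of an --lm value (assumed factor:order:file:type)."""
    return int(spec.split(":")[0])


translation = "1-1+2-2"   # value of --translation-factors above
generation = "2,1-0"      # value of --generation-factors above
lms = [
    "0:3:corpus/surface.lm:0",
    "1:3:corpus/pos.lm:0",
    "2:3:corpus/stem.lm:0",
]

produced = output_factors(translation) | output_factors(generation)
needed = {lm_factor(spec) for spec in lms}

# Factors that a language model expects but that no step produces.
print(sorted(needed - produced))
```

With the options above every factor scored by a language model is
named as the output of some step, so the specification at least looks
consistent on paper.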

I am decoding with the moses.ini generated by the command line above,
without tweaks. The sentence to be decoded is Dutch (nl) and was
prepared by the same factor-producing chain as the nl half of the
corpus.
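
To rule out malformed input, a minimal sketch of a check that every
token carries exactly three non-empty factors separated by "|". The
function name is hypothetical, and it only inspects the input side; it
says nothing about what the decoder produces on the target side.

```python
def check_factored_line(line, n_factors=3, sep="|"):
    """Return (position, token) pairs with a wrong factor count or an empty factor."""
    problems = []
    for pos, token in enumerate(line.split()):
        factors = token.split(sep)
        if len(factors) != n_factors or any(f == "" for f in factors):
            problems.append((pos, token))
    return problems


# The test sentence from the decoder output above, joined onto one line.
sentence = "ik|Pron|ik loop|V|lop over|Prep|over de|Art|de straat|N|strat .|Punc|."
print(check_factored_line(sentence))
```

An empty list means every token has the expected number of factors.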

Since I am not that much of a code guru, I was hoping someone on this
list would be able to help me. Am I doing something wrong, or is this
a bug?

With kind regards,

Jorik Jonker
MSc student University of Twente, Netherlands