[Moses-support] Phrases containing brackets mistaken for malformed nonterminals

Markus Saers Thu, 28 Apr 2016 08:01:11 -0700

Hello,

I am having problems reading in a phrase table derived from a corpus
that (as I have now learned) contained bracketed expressions such as
"to like [someone]". It appears that Moses confuses these strings with
nonterminals. I built a regular phrase-based model, so I was a bit
confused when Moses complained about malformed nonterminals. Is there
any way to
tell Moses that this is a regular phrase-based model, and that it
(he?) shouldn't look for nonterminals?


/Markus


The error message I got was:

Exception: moses/Phrase.cpp:214 in void
Moses::Phrase::CreateFromString(Moses::FactorDirection, const
std::vector<long unsigned int>&, const StringPiece&, Moses::Word**)
threw util::Exception because `nextPos == string::npos'.
Incorrect formatting of non-terminal. Should have 2 non-terms, eg.
[X][X]. Current string: [someone]
Exit code: 1
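
A minimal sketch (plain Python, not part of Moses) for locating the
phrase-table entries that could trigger this error. It assumes, as the
message above suggests, that any whitespace-separated token beginning
with "[" and ending with "]" is treated as a nonterminal, and that the
table uses the usual " ||| " field separator with the source and target
phrases in the first two fields. The function names and the sample
entries are hypothetical.

```python
import re

# A token wrapped entirely in square brackets, e.g. "[someone]".
BRACKETED = re.compile(r"^\[.*\]$")


def bracketed_tokens(phrase):
    """Return the whitespace-separated tokens that start with '[' and end with ']'."""
    return [tok for tok in phrase.split() if BRACKETED.match(tok)]


def scan_phrase_table(lines):
    """Yield (line_number, tokens) for entries with bracketed tokens.

    Only the first two ' ||| '-separated fields (source and target
    phrase) are inspected; lines with fewer fields are skipped.
    """
    for n, line in enumerate(lines, start=1):
        fields = line.rstrip("\n").split(" ||| ")
        if len(fields) < 2:
            continue
        hits = bracketed_tokens(fields[0]) + bracketed_tokens(fields[1])
        if hits:
            yield n, hits


# Hypothetical sample entries standing in for a real phrase table.
sample = [
    "to like [someone] ||| aimer [quelqu'un] ||| 0.5 0.5 0.5 0.5",
    "the house ||| la maison ||| 0.5 0.5 0.5 0.5",
]

for n, hits in scan_phrase_table(sample):
    print(n, " ".join(hits))
```

For a real, gzipped table the same function can be fed a file handle
opened with gzip.open(path, "rt", encoding="utf-8") from the standard
library.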

And my moses.ini contains the following:

#########################
### MOSES CONFIG FILE ###
#########################

# input factors
[input-factors]
0

# mapping steps
[mapping]
0 T 0

[distortion-limit]
6

# feature functions
[feature]
UnknownWordPenalty
WordPenalty
PhrasePenalty
PhraseDictionaryMemory name=TranslationModel0 num-features=4 path=path/to/phrase-table.gz input-factor=0 output-factor=0
LexicalReordering name=LexicalReordering0 num-features=6 type=wbe-msd-bidirectional-fe-allff input-factor=0 output-factor=0 path=path/to/reordering-table.wbe-msd-bidirectional-fe.gz
Distortion
KENLM name=LM0 factor=0 path=path/to/lm.arpa order=4

# dense weights for feature functions
[weight]
UnknownWordPenalty0= 1
WordPenalty0= -1
PhrasePenalty0= 0.2
TranslationModel0= 0.2 0.2 0.2 0.2
LexicalReordering0= 0.3 0.3 0.3 0.3 0.3 0.3
Distortion0= 0.3
LM0= 0.5
_______________________________________________
Moses-support mailing list
[email protected]
http://mailman.mit.edu/mailman/listinfo/moses-support
