Hi,

I've trained a model, binarised the LM and the phrase table, and have
the following moses.ini config file:

#########################
### MOSES CONFIG FILE ###
#########################

# input factors
[input-factors]
0

# mapping steps
[mapping]
0 T 0

[distortion-limit]
0

# feature functions
[feature]
UnknownWordPenalty
WordPenalty
PhrasePenalty
PhraseDictionaryCompact name=TranslationModel0 num-features=4 path=.../phrase-table.minphr input-factor=0 output-factor=0
Distortion
KENLM lazyken=0 name=LM0 factor=0 path=.../js.blm.lm order=3

# dense weights for feature functions
[weight]
# The default weights are NOT optimized for translation quality. You MUST tune the weights.
# Documentation for tuning is here: http://www.statmt.org/moses/?n=FactoredTraining.Tuning
UnknownWordPenalty0= 1
WordPenalty0= -1
PhrasePenalty0= 0.2
TranslationModel0= 0.2 0.2 0.2 0.2
Distortion0= 0.3
LM0= 0.5
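
To make the questions below concrete, here is a minimal Python sketch that lists how many weights each feature has in the [weight] section. The function name parse_weights and the SAMPLE string are my own illustration; this is not how the Moses decoder reads the file, only a quick way to inspect it.

```python
def parse_weights(ini_text):
    """Return {feature_name: [weights]} from the [weight] section of a moses.ini string."""
    weights = {}
    in_weight_section = False
    for raw in ini_text.splitlines():
        line = raw.strip()
        # Skip blank lines and comment lines.
        if not line or line.startswith("#"):
            continue
        # A bracketed line starts a new section.
        if line.startswith("[") and line.endswith("]"):
            in_weight_section = (line == "[weight]")
            continue
        if in_weight_section and "=" in line:
            name, values = line.split("=", 1)
            weights[name.strip()] = [float(v) for v in values.split()]
    return weights


# The [weight] section from the config above, copied by hand.
SAMPLE = """\
[weight]
UnknownWordPenalty0= 1
WordPenalty0= -1
PhrasePenalty0= 0.2
TranslationModel0= 0.2 0.2 0.2 0.2
Distortion0= 0.3
LM0= 0.5
"""

weights = parse_weights(SAMPLE)
for name, values in weights.items():
    print(name, len(values), values)
```

TranslationModel0 is the only feature with four weights, which matches num-features=4 on the PhraseDictionaryCompact line; every other feature has exactly one.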


Questions:

1. What do the four weights under TranslationModel0 correspond to?
Reading "Tuning for Quality" in
http://www.statmt.org/moses/?n=Moses.Tutorial suggests they are the
weights of the four different components of the translation, in this
order: (1) phrase translation, (2) language model, (3) reordering
model, (4) word penalty. Is this true?

2. What is the range of these four weights?

3. Will setting one of them to 0 disable that component in the model?

4. Why is there a separate weight LM0= 0.5 if there is already a
weight for the language model under TranslationModel0?

5. Why is there a separate weight WordPenalty0= -1 if there is already
a weight for word penalty under TranslationModel0?

Is it the case that WordPenalty0 describes how the word penalty
component should behave (favoring longer/shorter output), and the
weight for word penalty under TranslationModel0 describes how
important this component should be, relative to the other three?

5.1. If so, what does the separate weight LM0= 0.5 mean?

5.2. "Model" in http://www.statmt.org/moses/?n=Moses.Background says
that "Usually, this factor [ω (called word cost)] is larger than 1,
biasing toward longer output."
"Tuning for Quality" in http://www.statmt.org/moses/?n=Moses.Tutorial
says that "Negative values for the word penalty favor longer output,
positive values favor shorter output."
Which one is it?
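
To make 5.2 concrete, here is a minimal Python sketch of how I read the word cost on the Background page: each generated word multiplies the score by ω, so the log score gains num_words * log(ω). The value 1.1 and the function names are hypothetical, chosen only for illustration.

```python
import math


def length_factor(omega, num_words):
    """Multiplicative word cost: one factor of omega per generated word."""
    return omega ** num_words


def log_length_score(omega, num_words):
    """The same quantity in the log domain."""
    return num_words * math.log(omega)


omega = 1.1  # hypothetical value, larger than 1
for n in (5, 10):
    print(n, length_factor(omega, n), log_length_score(omega, n))
```

With ω > 1 the factor grows with the number of words, so longer output is favored. If the decoder's word penalty feature counted each word as -1 (an assumption on my part, I have not checked the code), its weight would correspond to -log(ω), which is negative exactly when ω > 1. That would make the two statements agree, but I would appreciate confirmation.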

6. How do the different parameters described under "Model" in
http://www.statmt.org/moses/?n=Moses.Background relate to the weights
in moses.ini?
- "We use a simple distortion model d(start_i, end_{i-1}) =
α^|start_i - end_{i-1} - 1| with an appropriate value for the parameter α" -> α
looks like the weight (3) reordering model
- "In order to calibrate the output length, we introduce a factor ω
(called word cost) for each generated English word" -> ω looks like
the weight (4) word penalty
- Which ones are the others?
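
For reference, this is how I understand the distortion model quoted above, as a minimal Python sketch. The function name and the value alpha = 0.5 are hypothetical; the formula is the one from the Background page, with start_i the first source position of the current phrase and end_prev the last source position of the previous phrase.

```python
def distortion_cost(start_i, end_prev, alpha):
    """d(start_i, end_{i-1}) = alpha ** |start_i - end_{i-1} - 1|."""
    return alpha ** abs(start_i - end_prev - 1)


alpha = 0.5  # hypothetical value, only for illustration

# Monotone step: the next phrase starts right after the previous one ends.
print(distortion_cost(4, 3, alpha))

# Jump forward, skipping two source words.
print(distortion_cost(6, 3, alpha))

# Jump backward to an earlier source position.
print(distortion_cost(1, 3, alpha))
```

A monotone step has exponent 0 and therefore cost 1, while any jump is scaled down by a power of alpha. What I cannot tell is how α maps onto Distortion0 in moses.ini.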

7. What is the range for distortion-limit? Will setting
distortion-limit= 0 ensure that the output has the same length as the
input?

8. What does PhrasePenalty0= 0.2 do? I thought the phrase penalty is
always exp(1), which is not the same as 0.2.

9. What does Distortion0= 0.3 do? There is already a weight for (3),
the reordering model, under TranslationModel0.

10. Would setting UnknownWordPenalty0= 1 ensure that no penalty is
incurred for copying unknown words verbatim to the output?

Finally, where are all these parameters documented? I couldn't find
any definitive source. The decoder parameter reference at
http://www.statmt.org/moses/?n=Moses.DecoderParameters didn't
answer these questions for me.

Many, many thanks in advance.
Bogdan
