Hi, I've trained a model, binarised the LM and the phrase table, and have the following moses.ini config file:
#########################
### MOSES CONFIG FILE ###
#########################

# input factors
[input-factors]
0

# mapping steps
[mapping]
0 T 0

[distortion-limit]
0

# feature functions
[feature]
UnknownWordPenalty
WordPenalty
PhrasePenalty
PhraseDictionaryCompact name=TranslationModel0 num-features=4 path=.../phrase-table.minphr input-factor=0 output-factor=0
Distortion
KENLM lazyken=0 name=LM0 factor=0 path=.../js.blm.lm order=3

# dense weights for feature functions
[weight]
# The default weights are NOT optimized for translation quality. You MUST tune the weights.
# Documentation for tuning is here: http://www.statmt.org/moses/?n=FactoredTraining.Tuning
UnknownWordPenalty0= 1
WordPenalty0= -1
PhrasePenalty0= 0.2
TranslationModel0= 0.2 0.2 0.2 0.2
Distortion0= 0.3
LM0= 0.5

Questions:

1. What do the four weights under TranslationModel0 correspond to? Reading "Tuning for Quality" in http://www.statmt.org/moses/?n=Moses.Tutorial suggests they are the weights of the four different components of the translation, in this order: (1) phrase translation, (2) language model, (3) reordering model, (4) word penalty. Is this true?

2. What is the range of these four weights?

3. Will setting one of them to 0 disable that component in the model?

4. Why is there a separate weight LM0= 0.5 if there is already a weight for the language model under TranslationModel0?

5. Why is there a separate weight WordPenalty0= -1 if there is already a weight for word penalty under TranslationModel0? Is it the case that WordPenalty0 describes how the word penalty component should behave (favoring longer/shorter output), and the weight for word penalty under TranslationModel0 describes how important this component should be, relative to the other three?

5.1. If so, what does the separate weight LM0= 0.5 mean?

5.2. "Model" in http://www.statmt.org/moses/?n=Moses.Background says that "Usually, this factor [ω (called word cost)] is larger than 1, biasing toward longer output." "Tuning for Quality" in http://www.statmt.org/moses/?n=Moses.Tutorial says that "Negative values for the word penalty favor longer output, positive values favor shorter output." Which one is it?

6. How do the different parameters described under "Model" in http://www.statmt.org/moses/?n=Moses.Background relate to the weights in moses.ini?
   - "We use a simple distortion model d(start_i, end_{i-1}) = α^|start_i - end_{i-1} - 1| with an appropriate value for the parameter α" -> α looks like the weight (3) reordering model
   - "In order to calibrate the output length, we introduce a factor ω (called word cost) for each generated English word" -> ω looks like the weight (4) word penalty
   - Which ones are the others?

7. What is the range for distortion-limit? Will setting distortion-limit= 0 ensure that the output has the same length as the input?

8. What does PhrasePenalty0= 0.2 do? I thought PhrasePenalty0 is always exp(1), which is not the same as 0.2.

9. What does Distortion0= 0.3 do? There is already a weight (3) reordering model under TranslationModel0.

10. Would setting UnknownWordPenalty0= 1 ensure that no penalty is incurred for copying unknown words verbatim to the output?

Finally, where are all these parameters documented? I couldn't find any definitive source. The decoder parameter reference at http://www.statmt.org/moses/?n=Moses.DecoderParameters didn't illuminate me enough.

Many, many thanks in advance.

Bogdan

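
P.S. To make the questions about weight counts concrete, here is a minimal Python sketch that reads the [weight] section of the config above and lists how many weights each feature name carries. The helper parse_weights and the SAMPLE string are hypothetical, written only for illustration; they are not part of Moses and do not reflect how the decoder itself reads the file. The sketch assumes one "Name= v1 v2 ..." entry per line, as in my moses.ini.

```python
# Hypothetical helper, not part of Moses: collects the entries of the
# [weight] section of a moses.ini-style text into {feature_name: [weights]}.
def parse_weights(ini_text):
    weights = {}
    in_weight_section = False
    for raw in ini_text.splitlines():
        line = raw.strip()
        # Skip blank lines and comments.
        if not line or line.startswith("#"):
            continue
        # A bracketed line opens a new section; track whether it is [weight].
        if line.startswith("["):
            in_weight_section = (line == "[weight]")
            continue
        if in_weight_section and "=" in line:
            name, values = line.split("=", 1)
            weights[name.strip()] = [float(v) for v in values.split()]
    return weights


# Excerpt of the config from this message (assumed layout: one entry per line).
SAMPLE = """\
[distortion-limit]
0

[weight]
UnknownWordPenalty0= 1
WordPenalty0= -1
PhrasePenalty0= 0.2
TranslationModel0= 0.2 0.2 0.2 0.2
Distortion0= 0.3
LM0= 0.5
"""

weights = parse_weights(SAMPLE)
for name, values in weights.items():
    print(name, len(values), values)
```

Run on the excerpt, this shows TranslationModel0 as the only entry with four values, the same count as num-features=4 on the PhraseDictionaryCompact line, while every other feature has a single weight.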