Hi Chris,

On 01/27/2012 08:47 PM, Chris Tanner wrote:
Hello everyone,

I'm having trouble understanding the -distortion-file parameters that are passed into Moses; it's reported that Moses accepts:

    > -distortion-file: source factors (0 if table independent of source), target factors, location of the factorized/lexicalized reordering tables


Can someone please further explain what each of these corresponds to? For example, having searched through Moses' support mailing list, I see an example where someone had passed:

> -distortion-file 0-1 wbe-msd-bidirectional-fe-allff 6 /home/meirong-moses/fac-sur+pos+stem/model/reordering-table.0-1.wbe-msd-bidirectional-fe


Question 1/Param 1: Do the source factors have to sum to 1, and can they be any double (e.g., would .4-.6 be valid)?
The factors are integers that refer to the factor numbers used for training the distortion model, so they are indices rather than weights and do not need to sum to anything. It's the same format that is used for factor specifications elsewhere in Moses. See, for instance, this page for an explanation: http://www.statmt.org/moses/?n=FactoredTraining.FactoredTraining
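
To make the format concrete, here is a minimal Python sketch of how such a factor specification could be read. It assumes the usual Moses convention of "source-target" with commas separating multiple factors on one side; the function name parse_factor_spec is hypothetical and this is not Moses' own parsing code.

```python
def parse_factor_spec(spec):
    """Split a spec like '0-1' or '0,2-1' into (source, target) factor indices.

    Assumption: source and target factors are separated by '-', and
    several factors on one side are separated by ','.
    """
    source, target = spec.split("-")
    source_factors = [int(f) for f in source.split(",")]
    target_factors = [int(f) for f in target.split(",")]
    return source_factors, target_factors


# '0-1' reads as: source factor 0, target factor 1.
print(parse_factor_spec("0-1"))

# A spec such as '.4-.6' is rejected, because factors are integer indices.
try:
    parse_factor_spec(".4-.6")
except ValueError as err:
    print("invalid factor spec:", err)
```

Under that reading, the 0-1 in the example above refers to source factor 0 and target factor 1.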

Question 2/Param 2: This is the base filename that corresponds to the output of processLexicalTable.exe?
No, this is not the filename, it is the specification of which type of model it is. See here for information: http://www.statmt.org/moses/?n=FactoredTraining.BuildReorderingModel

Question 3/Param 3: 6? This is an example of a target factor? It seems different from the source factor example of 0-1.
This is not a factor; it is the number of scores in the model. It should correspond to the number of scores in the model file. In this case it is 6: it is an msd model, which has three scores per direction, and it is also bidirectional.
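
As a rough illustration of that arithmetic, the following Python sketch derives the expected score count from the model type string. Only the msd case (three scores per direction, doubled when bidirectional) comes from this message; the other orientation counts are my understanding of the reordering model types and should be treated as assumptions, as should the helper name expected_scores.

```python
# Scores per direction for each orientation type.
# 'msd' = 3 is stated above; the other entries are assumptions.
ORIENTATION_SCORES = {"msd": 3, "mslr": 4, "monotonicity": 2, "leftright": 2}


def expected_scores(model_type):
    """Estimate the number of scores for a type like 'wbe-msd-bidirectional-fe-allff'."""
    parts = model_type.split("-")
    per_direction = next(ORIENTATION_SCORES[p] for p in parts if p in ORIENTATION_SCORES)
    directions = 2 if "bidirectional" in parts else 1
    return per_direction * directions


print(expected_scores("wbe-msd-bidirectional-fe-allff"))  # 3 scores x 2 directions = 6
```

If the number given on the command line differs from what the model file actually contains, that mismatch is the first thing to check.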

Question 4/Param 4: What is this file? Is it the base filename that corresponds to the output of processLexicalTable, or maybe it's the lexical reordering scoring table (i.e., the output of having run scripts/training/lexical-reordering/score.cpp?)
This is the filename of the reordering model, i.e. the lexical reordering scoring table. A filename of this form is the result of running train-model.perl, but as you say, the file itself is what comes out of scripts/training/lexical-reordering/score.cpp. It could also have been binarized afterwards by running processLexicalTable.
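
Putting the four parameters together, a -distortion-file value can be read as four whitespace-separated fields. The Python sketch below only illustrates that layout; the DistortionEntry type and parse_distortion_file function are hypothetical names, not part of Moses.

```python
from typing import NamedTuple


class DistortionEntry(NamedTuple):
    factors: str      # e.g. '0-1': source and target factor indices
    model_type: str   # e.g. 'wbe-msd-bidirectional-fe-allff'
    num_scores: int   # number of scores in the model file
    path: str         # reordering table (plain text or binarized)


def parse_distortion_file(value):
    """Split a -distortion-file value into its four fields."""
    factors, model_type, num_scores, path = value.split(None, 3)
    return DistortionEntry(factors, model_type, int(num_scores), path)


entry = parse_distortion_file(
    "0-1 wbe-msd-bidirectional-fe-allff 6 "
    "/home/meirong-moses/fac-sur+pos+stem/model/"
    "reordering-table.0-1.wbe-msd-bidirectional-fe"
)
print(entry.factors, entry.model_type, entry.num_scores)
print(entry.path)
```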


Any other insights and comments would be much appreciated.

Thanks!


--
Chris Tanner

/Sara

_______________________________________________
Moses-support mailing list
[email protected]
http://mailman.mit.edu/mailman/listinfo/moses-support
