Hi Chris,
On 01/27/2012 08:47 PM, Chris Tanner wrote:
> Hello everyone,
> I'm having trouble understanding the -distortion-file parameters that
> are passed into Moses; it's reported that Moses accepts:
> > -distortion-file: source factors (0 if table independent of
> > source), target factors, location of the factorized/lexicalized
> > reordering tables
> Can someone please further explain what each of these corresponds to?
> For example, having searched through Moses' support mailing list, I
> see an example whereby someone had passed:
> > -distortion-file 0-1 wbe-msd-bidirectional-fe-allff 6
> > /home/meirong-moses/fac-sur+pos+stem/model/reordering-table.0-1.wbe-msd-bidirectional-fe
> Question 1/Param 1: Do the source factors have to sum to 1, and can they
> be any double (e.g., would .4-.6 be valid)?
The factors are integers that refer to the factor numbers used for
training the distortion model, so they are not weights and do not have
to sum to 1. It's the same format that is used for factor specifications
elsewhere in Moses: 0-1 means source factor 0 and target factor 1. See
for instance this page:
http://www.statmt.org/moses/?n=FactoredTraining.FactoredTraining for an
explanation.
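To illustrate the factor format, here is a minimal Python sketch that
splits a specification such as 0-1 into its source and target factor
numbers. The function name parse_factor_spec is made up for this example
and is not part of Moses; it assumes the usual convention that the source
and target sides are separated by a hyphen and that several factors on
one side are separated by commas.

```python
def parse_factor_spec(spec):
    """Split a factor spec like "0-1" or "0,1-0" into (source, target) lists.

    Hypothetical helper for illustration only; not a Moses function.
    """
    source_part, target_part = spec.split("-")
    # int() rejects anything that is not a whole number, e.g. ".4"
    source = [int(f) for f in source_part.split(",")]
    target = [int(f) for f in target_part.split(",")]
    return source, target


print(parse_factor_spec("0-1"))    # source factor 0, target factor 1
print(parse_factor_spec("0,1-0"))  # source factors 0 and 1, target factor 0
```

A value such as .4-.6 fails in int() with a ValueError, which matches the
point that factors are indices rather than weights.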
> Question 2/Param 2: This is the base filename that corresponds to the
> output of processLexicalTable.exe?
No, this is not a filename; it specifies which type of reordering model
is used. See here for more information:
http://www.statmt.org/moses/?n=FactoredTraining.BuildReorderingModel
> Question 3/Param 3: 6? This is an example of a target factor? It
> seems different from the source factor example of 0-1.
This is not a factor but the number of scores in the model. It should
correspond to the number of scores per entry in the model file. In this
case it is 6: an msd model has three scores (monotone, swap,
discontinuous) for each direction, and this model is bidirectional, so
3 x 2 = 6.
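The arithmetic can be sketched in Python. The lookup tables below are an
assumption based on the model-type names described on the
BuildReorderingModel page (msd has three orientations, mslr four,
monotonicity and leftright two each), and the function name
expected_num_scores is invented for this example. It only covers allff
models; collapseff is not handled.

```python
# Number of orientation classes per model type (assumed from the Moses docs).
ORIENTATIONS = {"msd": 3, "mslr": 4, "monotonicity": 2, "leftright": 2}
# Number of directions that are scored.
DIRECTIONS = {"bidirectional": 2, "backward": 1, "forward": 1}


def expected_num_scores(model_type):
    """Return the score count implied by e.g. "wbe-msd-bidirectional-fe-allff".

    Hypothetical helper for illustration only; not a Moses function.
    """
    parts = model_type.split("-")
    # Pick the first component that names an orientation type / a direction.
    orientations = next(ORIENTATIONS[p] for p in parts if p in ORIENTATIONS)
    directions = next(DIRECTIONS[p] for p in parts if p in DIRECTIONS)
    return orientations * directions


print(expected_num_scores("wbe-msd-bidirectional-fe-allff"))  # 3 * 2 = 6
```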
> Question 4/Param 4: What is this file? Is it the base filename that
> corresponds to the output of processLexicalTable, or maybe it's the
> lexical reordering scoring table (i.e., the output of having run
> scripts/training/lexical-reordering/score.cpp)?
This is the filename of the reordering model, which is the same thing as
the lexical reordering scoring table. A filename of this form is the
result of running train-model.perl. As you say, the file itself is what
comes out of scripts/training/lexical-reordering/score.cpp. It could also
have been binarized afterwards by running processLexicalTable.
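A quick sanity check along these lines can confirm that the declared
number of scores matches the table. This is only a sketch: it assumes a
plain-text (not binarized) table whose fields are separated by ||| with
the scores in the last field, and both count_scores and the sample entry
are invented for illustration.

```python
def count_scores(table_line):
    """Count the scores in one line of a text reordering table.

    Assumes fields are separated by "|||" and the scores are the last field.
    Hypothetical helper for illustration only; not a Moses function.
    """
    fields = [f.strip() for f in table_line.split("|||")]
    return len(fields[-1].split())


# Made-up example entry, not taken from a real table.
line = "das haus ||| the house ||| 0.6 0.2 0.2 0.6 0.2 0.2"
declared = 6  # third value on the -distortion-file line
assert count_scores(line) == declared
print(count_scores(line))
```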
> Any other insights and comments would be much appreciated.
> Thanks!
> --
> Chris Tanner
/Sara
_______________________________________________
Moses-support mailing list
http://mailman.mit.edu/mailman/listinfo/moses-support