Hi Tom

On 02/02/15 01:33, Tom Hoar wrote:
Much of the v2 moses.ini looks self-explanatory, but I'd like to confirm my understanding.

The website (http://www.statmt.org/moses/?n=Moses.FeatureFunctions) defines three feature/functions without arguments. In the moses.ini files made by train-model.perl's step 9, there also appears to be a 4th that requires no argument. Can someone confirm this is the case? Are there others that could appear without arguments?
Yes, PhrasePenalty is a standard feature function (FF); it doesn't need any arguments. It used to be the constant 2.718 in the last score in the phrase-table, but I thought that was silly, so I moved it to its own feature function.

    [feature]
    UnknownWordPenalty
    WordPenalty
    Distortion
    PhrasePenalty * - not listed on the website (are there more)

Feature/functions in the [feature] section and items in the [weight] sections appear to be linked. The feature/functions without arguments have corresponding entries linked by the same option name with an appended zero in the [weight] section. Since these feature/functions have no arguments, is it safe to say that they can appear only once in both the [feature] and [weight] sections?
You can have multiple feature functions of the same type. The most obvious case is having multiple LMs, e.g.
   [feature]
   KENLM path=file1.lm
   KENLM path=file2.lm
Each instance of an FF must have a unique name. If you don't give it a name, the decoder will name it for you: KENLM0, KENLM1, and so on. Each instance must have corresponding weights (unless it's non-tuneable).
   [weight]
   KENLM0= 0.4
   KENLM1= 0.6
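
The naming rule above can be sketched in a few lines of Python. This is only an illustration, not the decoder's code: the function name name_features is made up, and it assumes that every argument has the form key=value and that the counter only counts unnamed instances of each type.

```python
from collections import defaultdict

def name_features(feature_lines):
    """Return (name, type, args) for each [feature] line.

    Assumption: an instance without a name= argument is named
    <type><n>, where n counts the unnamed instances of that type so far.
    """
    counts = defaultdict(int)
    result = []
    for line in feature_lines:
        tokens = line.split()
        if not tokens:
            continue  # skip blank lines
        ff_type = tokens[0]
        # Assumption: every argument looks like key=value.
        args = dict(tok.split("=", 1) for tok in tokens[1:])
        name = args.pop("name", None)
        if name is None:
            name = f"{ff_type}{counts[ff_type]}"
            counts[ff_type] += 1
        result.append((name, ff_type, args))
    return result

features = name_features([
    "KENLM path=file1.lm",
    "KENLM path=file2.lm",
    "PhrasePenalty",
])
names = [name for name, _, _ in features]
```

Here names holds the keys that the [weight] section would have to use for these three lines.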

    [weight]
    UnknownWordPenalty0= 1
    WordPenalty0= -1
    Distortion0= 0.3
    PhrasePenalty0= 0.2

The feature/functions with arguments have corresponding entries linked by the "name=" argument as the option name in the [weight] section. Are there cases where there will be entries in the [feature] section without corresponding entries in the [weight] section or vice-versa?
Yes, if the FF is not tuneable, you don't need to give it weight(s). The hardcoded weights will be used. UnknownWordPenalty is a non-tuneable FF and its hardcoded weight is 1, so the line
   UnknownWordPenalty0= 1
isn't strictly necessary.

Whether an FF is tuneable by default is determined in the code. You can also set the tuneable property explicitly, e.g.
    [feature]
    UnknownWordPenalty tuneable=true
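
For a script that edits moses.ini, a cross-check between the two sections might look like the sketch below. The helper names parse_weights and check_weights are hypothetical, and the set of non-tuneable names has to be supplied by the caller, since the defaults live in the decoder code; only UnknownWordPenalty0 is taken from this thread.

```python
def parse_weights(weight_lines):
    """Parse lines such as 'Distortion0= 0.3' into {name: [floats]}."""
    weights = {}
    for line in weight_lines:
        if not line.strip():
            continue  # skip blank lines
        name, _, values = line.partition("=")
        weights[name.strip()] = [float(v) for v in values.split()]
    return weights

def check_weights(feature_names, weights,
                  non_tuneable=frozenset({"UnknownWordPenalty0"})):
    """Return (missing, orphaned) feature names.

    missing:  tuneable features that have no [weight] entry.
    orphaned: [weight] entries that match no feature.
    Assumption: non_tuneable lists instance names whose weights are
    hardcoded; it does not model a tuneable=true override.
    """
    missing = [n for n in feature_names
               if n not in weights and n not in non_tuneable]
    orphaned = [n for n in weights if n not in feature_names]
    return missing, orphaned

feature_names = ["UnknownWordPenalty0", "WordPenalty0",
                 "Distortion0", "PhrasePenalty0"]
weights = parse_weights(["WordPenalty0= -1", "PhrasePenalty0= 0.2"])
missing, orphaned = check_weights(feature_names, weights)
```

In this example the UnknownWordPenalty0 line is left out on purpose and is not reported, while Distortion0 is reported as missing.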


    [feature]
    PhraseDictionaryMemory name=*TranslationModel0* num-features=4 ...
    KENLM name=*LM0* factor=0 ...

    [weight]
    *TranslationModel0*= 0.2 0.2 0.2 0.2
    *LM0*= 0.5

The sections other than [feature] and [weight], such as [input-factors] and [mapping], appear to preserve the v1 moses.ini format. Is this true?
yep

The order of lines in the [feature] and [weight] sections is irrelevant (as many examples have them in different orders). Also, the order of the arguments on a feature/function line is irrelevant (examples show them in different orders).
Yep. However, the relative order of the phrase-tables is relevant. The [mapping] section refers to each phrase-table by its index, e.g.
   [mapping]
   0 T 0
   1 T 1
so if you swap the order of the phrase-tables in the [feature] section, the same indices will refer to different phrase-tables.
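
To make the index dependency concrete, here is a small Python sketch. It is an illustration only: resolve_mapping is a made-up helper, the names TM_A and TM_B and their paths are hypothetical, and it assumes that the last field of a T line is the phrase-table index and that only PhraseDictionaryMemory lines count as phrase-tables.

```python
def resolve_mapping(feature_lines, mapping_lines,
                    table_types=("PhraseDictionaryMemory",)):
    """Return the [feature] line that each translation step points to."""
    # Phrase-tables are indexed by their order of appearance in [feature].
    tables = [line for line in feature_lines
              if line.split() and line.split()[0] in table_types]
    resolved = []
    for line in mapping_lines:
        fields = line.split()
        if len(fields) < 2:
            continue  # skip blank or malformed lines
        # Assumption: the last two fields are the step type and the index.
        step, index = fields[-2], int(fields[-1])
        if step == "T":
            resolved.append(tables[index])
    return resolved

table_a = "PhraseDictionaryMemory name=TM_A path=a"  # hypothetical
table_b = "PhraseDictionaryMemory name=TM_B path=b"  # hypothetical
mapping = ["0 T 0", "1 T 1"]

original = resolve_mapping([table_a, "KENLM path=file1.lm", table_b], mapping)
swapped = resolve_mapping([table_b, "KENLM path=file1.lm", table_a], mapping)
```

With the same [mapping] lines, original points to TM_A then TM_B, while swapped points to TM_B then TM_A.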


Finally, is there a connection between the [input-factors] section's value and the input-factor argument value for PhraseDictionaryMemory and LexicalReordering feature/functions? Or, are the similar names and corresponding values only coincidental?
Your input sentence can contain multiple factors, e.g. surface|POS|lemma, in which case [input-factors] should be
  [input-factors]
  0
  1
  2
However, the phrase-table may choose to use only the surface form, in which case its feature line would be
  [feature]
  PhraseDictionaryMemory input-factor=0
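
As a rough picture of what that selection means, the Python sketch below splits a factored sentence and keeps only the requested factors. It is not how the decoder is implemented: select_factors and the example sentence are made up, and it assumes tokens are separated by whitespace and factors by "|".

```python
def select_factors(sentence, factor_indices, sep="|"):
    """Keep only the given factor positions of every token."""
    selected = []
    for token in sentence.split():
        factors = token.split(sep)  # e.g. ['houses', 'NNS', 'house']
        selected.append(sep.join(factors[i] for i in factor_indices))
    return selected

sentence = "the|DT|the houses|NNS|house"  # surface|POS|lemma, hypothetical
surface_only = select_factors(sentence, [0])
surface_and_lemma = select_factors(sentence, [0, 2])
```

The input still carries all three factors; a feature that only asks for factor 0 would see just the surface forms in surface_only.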

My intention is to build two scripts and contribute these scripts to the Moses project. One will convert the v2 moses.ini file to a standard form (not associated with the command line syntax) so people can easily edit the values. The other will convert the interim form back to the native v2 moses.ini format.


_______________________________________________
Moses-support mailing list
[email protected]
http://mailman.mit.edu/mailman/listinfo/moses-support
