Cheers, Tom.
I've added your script to convert v2->v1.
https://github.com/moses-smt/mosesdecoder/commit/78f79632b9a87c72f4ad11005359fcdc57d1c0bc
There's already a script to convert it the other way, so we're keeping
that instead
On 02/02/15 15:20, Tom Hoar wrote:
Thanks, Hieu.
Here are the two scripts. Add them to the contrib (or other) folder as
you see fit. They have a -h switch. The moses2-to-ini.py script copies
everything from moses.ini v2 format into a traditional INI file
format. The ini-2-moses2.py script restores the moses.ini v2 format.
They change the line order in the [feature] and [weight] sections. The
overall order of section blocks can change. They preserve the order of
lines/values inside the old v1 moses.ini format sections, like
[input-factors], [mapping], etc. Please report any errors.
There are two limits. First, the FF name= attribute is not optional
because relying on sequential order can be risky. Users should
declare the name to match the weight value key. Second, they do not
support use cases where a non-tuneable FF does not include weights. If
you place dummy weights, like you did with UnknownWordPenalty,
everything is ok.
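For readers without the scripts at hand, the conversion idea can be sketched roughly as follows. This is not the code of moses2-to-ini.py. The function names and the "type" and "weights" keys are made up for illustration; only the two limits described above are taken from the message.

```python
def parse_feature_line(line):
    """Split a v2 [feature] line into (type, {key: value})."""
    tokens = line.split()
    ff_type, args = tokens[0], {}
    for token in tokens[1:]:
        key, _, value = token.partition("=")
        args[key] = value
    return ff_type, args


def feature_to_section(line, weights):
    """Build one INI-style section (a dict) keyed by the FF's name= attribute.

    Hypothetical layout: the "type" and "weights" keys are illustrative only.
    """
    ff_type, args = parse_feature_line(line)
    if "name" not in args:
        # First limit: name= is required, nothing is inferred from line order.
        raise ValueError("feature line has no name= attribute: " + line)
    name = args.pop("name")
    section = {"type": ff_type}
    section.update(args)
    # Second limit: a missing weight entry raises KeyError here.
    section["weights"] = weights[name]
    return name, section


name, section = feature_to_section(
    "KENLM name=LM0 factor=0 path=file1.lm", {"LM0": "0.5"})
print(name, section)
```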
There's one additional feature. Users can use the --escape-prefix
command-line option to escape a prefix path in FF "path=" attribute
values. This will be helpful in moving models. I noticed that the
clone_moses_model.pl script has not been updated to support the new
moses.ini file format. Maybe someone would like to use this as a
starting point.
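A rough sketch of what escaping a path prefix could look like follows. The message does not say how the script marks an escaped prefix, so the "${PREFIX}" placeholder, the function name, and the example paths are all assumptions made for illustration.

```python
def escape_prefix(feature_line, prefix, placeholder="${PREFIX}"):
    """Replace a leading prefix in path= values with a placeholder.

    The placeholder syntax is hypothetical, not the script's actual format.
    """
    out = []
    for token in feature_line.split():
        key, sep, value = token.partition("=")
        if key == "path" and value.startswith(prefix):
            value = placeholder + value[len(prefix):]
        out.append(key + sep + value)
    return " ".join(out)


# Made-up example path, not taken from any real setup.
line = "KENLM name=LM0 factor=0 path=/old/root/lm/file1.lm"
escaped = escape_prefix(line, "/old/root")
print(escaped)
```

Moving a model would then come down to substituting the placeholder with the new root directory.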
I find it easier/faster to manipulate values in a traditional INI file
structure. Users can also import classes in the scripts into other
modules. They have a pretty simple API. I thought others might find
these useful.
Tom
On 02/02/2015 08:56 PM, Hieu Hoang wrote:
Hi Tom
On 02/02/15 01:33, Tom Hoar wrote:
Much of the v2 moses.ini looks self-explanatory, but I'd like to
confirm my understanding.
The website (http://www.statmt.org/moses/?n=Moses.FeatureFunctions)
defines three feature/functions without arguments. In the moses.ini
files made by train-model.perl's step 9, there also appears to be a
4th that requires no argument. Can someone confirm this is the case?
Are there others that could appear without arguments?
Yes, PhrasePenalty is a standard feature function (FF), and it doesn't
need any arguments. It used to be the constant 2.718 in the last
score in the phrase-table. But I thought that was silly, so I moved it
to its own feature function.
[feature]
UnknownWordPenalty
WordPenalty
Distortion
PhrasePenalty * - not listed on the website (are there more)
Feature/functions in the [feature] section and items in the [weight]
sections appear to be linked. The feature/functions without
arguments have corresponding entries linked by the same option name
with an appended zero in the [weight] section. Since these
feature/functions have no arguments, is it safe to say that they can
appear only once in both the [feature] and [weight] sections?
You can have multiple feature functions of the same type. The most
obvious one would be having multiple LMs, e.g.
[feature]
KENLM path=file1.lm
KENLM path=file2.lm
Each instance of an FF must have a unique name. If you don't give
it a name, the decoder will name it for you: KENLM0, KENLM1, ....
Each instance must have the corresponding weights (unless it's
non-tuneable)
[weight]
KENLM0= 0.4
KENLM1= 0.6
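The default naming described above can be sketched like this. It is an illustration, not the decoder's code. How the counter behaves when some instances of a type carry an explicit name= is not covered in this thread, so the sketch assumes only unnamed instances advance the per-type counter.

```python
from collections import defaultdict


def assign_names(feature_lines):
    """Return one name per [feature] line: the explicit name= if present,
    otherwise the FF type plus a per-type running index."""
    counters = defaultdict(int)
    names = []
    for line in feature_lines:
        tokens = line.split()
        ff_type = tokens[0]
        explicit = [t.split("=", 1)[1] for t in tokens[1:] if t.startswith("name=")]
        if explicit:
            names.append(explicit[0])
        else:
            # Assumption: only unnamed instances advance the counter.
            names.append(ff_type + str(counters[ff_type]))
            counters[ff_type] += 1
    return names


names = assign_names(["KENLM path=file1.lm", "KENLM path=file2.lm", "WordPenalty"])
print(names)
```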
[weight]
UnknownWordPenalty0= 1
WordPenalty0= -1
Distortion0= 0.3
PhrasePenalty0= 0.2
The feature/functions with arguments have corresponding entries linked by
the "name=" argument as the option name in the [weight] section. Are
there cases where there will be entries in the [feature] section
without corresponding entries in the [weight] section or vice-versa?
Yes, if the FF is not tuneable, you don't need to give it weight(s).
The hardcoded weights will be used. UnknownWordPenalty is a
non-tuneable FF and its hardcoded weight is 1, so the line
UnknownWordPenalty0= 1
isn't strictly necessary.
Whether an FF is tuneable by default is determined in the
code. You can also set the tuneable property yourself, e.g.
[feature]
UnknownWordPenalty tuneable=true
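A small check for features that lack a [weight] entry could look like the sketch below. Which FFs are non-tuneable is decided in the Moses code, so the set passed in here is supplied by the caller and is only an assumption. The function names are hypothetical.

```python
def parse_weight_line(line):
    """Turn 'KENLM0= 0.4' into ('KENLM0', [0.4])."""
    name, _, values = line.partition("=")
    return name.strip(), [float(v) for v in values.split()]


def missing_weights(feature_names, weight_lines, non_tuneable=frozenset()):
    """List feature names with no [weight] entry, skipping non-tuneable ones."""
    weights = dict(parse_weight_line(line) for line in weight_lines)
    return [n for n in feature_names
            if n not in weights and n not in non_tuneable]


features = ["UnknownWordPenalty0", "WordPenalty0", "Distortion0", "PhrasePenalty0"]
weight_lines = ["WordPenalty0= -1", "Distortion0= 0.3"]
# Assumption for this example: only UnknownWordPenalty0 is non-tuneable.
missing = missing_weights(features, weight_lines,
                          non_tuneable={"UnknownWordPenalty0"})
print(missing)
```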
[feature]
PhraseDictionaryMemory name=*TranslationModel0* num-features=4 ...
KENLM name=*LM0* factor=0 ...
[weight]
*TranslationModel0*= 0.2 0.2 0.2 0.2
*LM0*= 0.5
The sections other than [feature] and [weight], such as
[input-factors] and [mapping], appear to preserve the v1 moses.ini
format. Is this true?
yep
The order of lines in the [feature] and [weight] sections is
irrelevant (as many examples have them in different orders). Also,
the order of the arguments on a feature/function line is irrelevant
(examples show them in different orders).
yep. However, the relative order of the phrase-tables is relevant.
The [mapping] section refers to the index of the phrase-table, eg
[mapping]
0 T 0
1 T 1
so if you swap the order of the phrase-tables in the [feature]
section, the [mapping] entries will refer to different phrase-tables.
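The effect of reordering can be sketched as follows. The sketch assumes that phrase-table FFs are exactly those whose type starts with "PhraseDictionary" and that the last field of a T mapping line is the phrase-table index. Both are simplifications, and the path= values are made up.

```python
def resolve_mapping(feature_lines, mapping_lines):
    """Return the [feature] line each T entry in [mapping] points at."""
    # Assumption: a phrase-table is any FF whose type starts with this prefix.
    tables = [line for line in feature_lines
              if line.split()[0].startswith("PhraseDictionary")]
    resolved = []
    for line in mapping_lines:
        fields = line.split()
        step, index = fields[-2], int(fields[-1])
        if step == "T":
            resolved.append(tables[index])
    return resolved


features = [
    "PhraseDictionaryMemory name=TranslationModel0 path=pt-a",
    "KENLM name=LM0 path=file1.lm",
    "PhraseDictionaryMemory name=TranslationModel1 path=pt-b",
]
mapping = ["0 T 0", "1 T 1"]
before = resolve_mapping(features, mapping)
# Swapping the two phrase-table lines changes what each index refers to.
after = resolve_mapping([features[2], features[1], features[0]], mapping)
print(before)
print(after)
```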
Finally, is there a connection between the [input-factors] section's
value and the input-factor argument value for PhraseDictionaryMemory
and LexicalReordering feature/functions? Or, are the similar names
and corresponding values only coincidental?
Your input sentence can contain multiple factors, e.g.
surface|POS|lemma, in which case [input-factors] should be
[input-factors]
0
1
2
However, the phrase-table may choose to only use the surface form, in
which case
[feature]
PhraseDictionaryMemory input-factor=0
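To make the factor indices concrete, here is a sketch of how a factored token splits and how a component that only uses index 0 would see the sentence. The example sentence and the function name are invented for illustration.

```python
def select_factors(token, indices):
    """Keep only the requested factor indices of a 'a|b|c' token."""
    factors = token.split("|")
    return "|".join(factors[i] for i in indices)


# Invented example: surface|POS|lemma for each word.
sentence = "houses|NNS|house are|VBP|be"
surface_only = " ".join(select_factors(t, [0]) for t in sentence.split())
surface_and_lemma = " ".join(select_factors(t, [0, 2]) for t in sentence.split())
print(surface_only)
print(surface_and_lemma)
```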
My intention is to build two scripts and contribute these scripts to
the Moses project. One will convert the v2 moses.ini file to a
standard form (not associated with the command line syntax) so
people can easily edit the values. The other will convert the
interim form back to the native v2 moses.ini format.
_______________________________________________
Moses-support mailing list
[email protected]
http://mailman.mit.edu/mailman/listinfo/moses-support