The section
   [non-terminals]
   X
is only used for OOV handling, so you don't need to fill it out with all the non-terminals. In fact, we are trying to get rid of it and replace it with a list of non-terminals from a file.

The phrase extraction script should create 2 additional files besides the phrase table:
    1. unknown-word-label. This looks like
            ADJA 0.110016
            NE 0.0785407
            NN 0.680048
       For OOV words, a new rule is created with these labels as the left-hand side.
    2. glue-grammar. This looks like
            <s> [X] ||| <s> [Q] |||  ||| 1
            [X][Q] </s> [X] ||| [X][Q] </s> [Q] ||| 0-0 ||| 1
            [X][Q] [X][AA] [X] ||| [X][Q] [X][AA] [Q] ||| 0-0 1-1 ||| 2.718
            .
            .
The rules in the glue grammar have to be able to attach to the start- and end-of-sentence symbols, <s> and </s>, and there must be a glue rule for every non-terminal type. Hopefully it will be clear what I mean if you look through the file.
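
To make the glue-grammar format concrete, here is a minimal Python sketch that writes out rules of the three shapes shown above, one concatenation rule per non-terminal label. This is not the actual extraction script: the function name glue_rules and the hard-coded scores (1 and 2.718, copied from the example lines) are assumptions for illustration. The attached file also contains rules of another shape, which this sketch does not generate.

```python
def glue_rules(labels):
    """Return glue-grammar lines in the format shown in this message.

    `labels` is any iterable of target non-terminal labels, e.g. ["AA", "NN"].
    The scores 1 and 2.718 are simply copied from the example rules.
    """
    rules = [
        # Start of sentence: <s> opens a [Q] constituent (empty alignment field).
        "<s> [X] ||| <s> [Q] |||  ||| 1",
        # End of sentence: close the [Q] constituent with </s>.
        "[X][Q] </s> [X] ||| [X][Q] </s> [Q] ||| 0-0 ||| 1",
    ]
    for label in labels:
        # Concatenation: append a constituent of type `label` to the running [Q].
        rules.append(
            f"[X][Q] [X][{label}] [X] ||| [X][Q] [X][{label}] [Q] ||| 0-0 1-1 ||| 2.718"
        )
    return rules


if __name__ == "__main__":
    for rule in glue_rules(["AA", "ADJA", "NN"]):
        print(rule)
```

Passing the full list of labels used in your target-side trees gives one concatenation rule per label, which is the "glue rule for every non-terminal type" requirement.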

Attached are Phil's ini file, unknown-word file and glue grammar file for his English-German system.
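
If you want to check your own files against the requirement that every non-terminal has a glue rule, something like the following Python sketch may help. It reads lines in the unknown-word-label format (a label followed by a number) and collects the labels that appear as [X][LABEL] on the source side of the glue rules. The function names are hypothetical and the parsing is based only on the examples in this message, so treat it as a rough check rather than a validator for the real file formats.

```python
import re


def read_unknown_word_labels(lines):
    """Parse 'LABEL value' lines, e.g. 'NN 0.680048', into a dict."""
    labels = {}
    for line in lines:
        parts = line.split()
        if len(parts) != 2:
            continue  # skip blank or malformed lines
        labels[parts[0]] = float(parts[1])
    return labels


def labels_with_glue_rule(lines):
    """Collect labels used as [X][LABEL] on the source side of glue rules."""
    found = set()
    for line in lines:
        source_side = line.split("|||")[0]
        found.update(re.findall(r"\[X\]\[([^\]]+)\]", source_side))
    found.discard("Q")  # [Q] is the glue symbol itself, not a tree label
    return found


if __name__ == "__main__":
    unknown = read_unknown_word_labels(["ADJA 0.110016", "NE 0.0785407", "NN 0.680048"])
    glue = labels_with_glue_rule([
        "<s> [X] ||| <s> [Q] |||  ||| 1",
        "[X][Q] [X][NN] [X] ||| [X][Q] [X][NN] [Q] ||| 0-0 1-1 ||| 2.718",
    ])
    print("labels without a glue rule:", sorted(set(unknown) - glue))
```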

On 23/06/2010 17:14, Lucia Specia wrote:
Hi again,

I notice that I should probably have all non-terminals listed in the moses.ini file. I only have the default 'X' there. Is this something that has to be done or should the training script take care of it? I have many non-terminals, they are all represented following the specification on the website, e.g.:

<tree label="S"> <tree label="FCL"> <tree label="NP"> ....

Regarding the glue grammar, could you please give me some pointers on what exactly has to be on it and what the format should be?

Thanks a lot,

Lucia


_______________________________________________
Moses-support mailing list
[email protected]
http://mailman.mit.edu/mailman/listinfo/moses-support
ADJA 0.110016
NE 0.0785407
NN 0.680048
<s> [X] ||| <s> [Q] |||  ||| 1
[X][Q] </s> [X] ||| [X][Q] </s> [Q] ||| 0-0 ||| 1
<s> [X][ADJA] </s> [X] ||| <s> [X][ADJA] </s> [Q] ||| 1-1 ||| 1
<s> [X][ADV] </s> [X] ||| <s> [X][ADV] </s> [Q] ||| 1-1 ||| 1
<s> [X][AP] </s> [X] ||| <s> [X][AP] </s> [Q] ||| 1-1 ||| 1
<s> [X][CH] </s> [X] ||| <s> [X][CH] </s> [Q] ||| 1-1 ||| 1
<s> [X][CNP] </s> [X] ||| <s> [X][CNP] </s> [Q] ||| 1-1 ||| 1
<s> [X][CO] </s> [X] ||| <s> [X][CO] </s> [Q] ||| 1-1 ||| 1
<s> [X][CPP] </s> [X] ||| <s> [X][CPP] </s> [Q] ||| 1-1 ||| 1
<s> [X][CS] </s> [X] ||| <s> [X][CS] </s> [Q] ||| 1-1 ||| 1
<s> [X][CVP] </s> [X] ||| <s> [X][CVP] </s> [Q] ||| 1-1 ||| 1
<s> [X][DL] </s> [X] ||| <s> [X][DL] </s> [Q] ||| 1-1 ||| 1
<s> [X][NE] </s> [X] ||| <s> [X][NE] </s> [Q] ||| 1-1 ||| 1
<s> [X][NN] </s> [X] ||| <s> [X][NN] </s> [Q] ||| 1-1 ||| 1
<s> [X][NP] </s> [X] ||| <s> [X][NP] </s> [Q] ||| 1-1 ||| 1
<s> [X][PN] </s> [X] ||| <s> [X][PN] </s> [Q] ||| 1-1 ||| 1
<s> [X][PP] </s> [X] ||| <s> [X][PP] </s> [Q] ||| 1-1 ||| 1
<s> [X][PUNC.] </s> [X] ||| <s> [X][PUNC.] </s> [Q] ||| 1-1 ||| 1
<s> [X][S] </s> [X] ||| <s> [X][S] </s> [Q] ||| 1-1 ||| 1
<s> [X][TOP] </s> [X] ||| <s> [X][TOP] </s> [Q] ||| 1-1 ||| 1
<s> [X][VP] </s> [X] ||| <s> [X][VP] </s> [Q] ||| 1-1 ||| 1
<s> [X][XY] </s> [X] ||| <s> [X][XY] </s> [Q] ||| 1-1 ||| 1
[X][Q] [X][AA] [X] ||| [X][Q] [X][AA] [Q] ||| 0-0 1-1 ||| 2.718
[X][Q] [X][ADJA] [X] ||| [X][Q] [X][ADJA] [Q] ||| 0-0 1-1 ||| 2.718
[X][Q] [X][ADJD] [X] ||| [X][Q] [X][ADJD] [Q] ||| 0-0 1-1 ||| 2.718
[X][Q] [X][ADV] [X] ||| [X][Q] [X][ADV] [Q] ||| 0-0 1-1 ||| 2.718
[X][Q] [X][AP] [X] ||| [X][Q] [X][AP] [Q] ||| 0-0 1-1 ||| 2.718
[X][Q] [X][APPO] [X] ||| [X][Q] [X][APPO] [Q] ||| 0-0 1-1 ||| 2.718
[X][Q] [X][APPR] [X] ||| [X][Q] [X][APPR] [Q] ||| 0-0 1-1 ||| 2.718
[X][Q] [X][APPRART] [X] ||| [X][Q] [X][APPRART] [Q] ||| 0-0 1-1 ||| 2.718
[X][Q] [X][APZR] [X] ||| [X][Q] [X][APZR] [Q] ||| 0-0 1-1 ||| 2.718
[X][Q] [X][ART] [X] ||| [X][Q] [X][ART] [Q] ||| 0-0 1-1 ||| 2.718
[X][Q] [X][AVP] [X] ||| [X][Q] [X][AVP] [Q] ||| 0-0 1-1 ||| 2.718
[X][Q] [X][CAC] [X] ||| [X][Q] [X][CAC] [Q] ||| 0-0 1-1 ||| 2.718
[X][Q] [X][CAP] [X] ||| [X][Q] [X][CAP] [Q] ||| 0-0 1-1 ||| 2.718
[X][Q] [X][CARD] [X] ||| [X][Q] [X][CARD] [Q] ||| 0-0 1-1 ||| 2.718
[X][Q] [X][CAVP] [X] ||| [X][Q] [X][CAVP] [Q] ||| 0-0 1-1 ||| 2.718
[X][Q] [X][CH] [X] ||| [X][Q] [X][CH] [Q] ||| 0-0 1-1 ||| 2.718
[X][Q] [X][CNP] [X] ||| [X][Q] [X][CNP] [Q] ||| 0-0 1-1 ||| 2.718
[X][Q] [X][CO] [X] ||| [X][Q] [X][CO] [Q] ||| 0-0 1-1 ||| 2.718
[X][Q] [X][CPP] [X] ||| [X][Q] [X][CPP] [Q] ||| 0-0 1-1 ||| 2.718
[X][Q] [X][CS] [X] ||| [X][Q] [X][CS] [Q] ||| 0-0 1-1 ||| 2.718
[X][Q] [X][CVP] [X] ||| [X][Q] [X][CVP] [Q] ||| 0-0 1-1 ||| 2.718
[X][Q] [X][CVZ] [X] ||| [X][Q] [X][CVZ] [Q] ||| 0-0 1-1 ||| 2.718
[X][Q] [X][DL] [X] ||| [X][Q] [X][DL] [Q] ||| 0-0 1-1 ||| 2.718
[X][Q] [X][FM] [X] ||| [X][Q] [X][FM] [Q] ||| 0-0 1-1 ||| 2.718
[X][Q] [X][ISU] [X] ||| [X][Q] [X][ISU] [Q] ||| 0-0 1-1 ||| 2.718
[X][Q] [X][ITJ] [X] ||| [X][Q] [X][ITJ] [Q] ||| 0-0 1-1 ||| 2.718
[X][Q] [X][KOKOM] [X] ||| [X][Q] [X][KOKOM] [Q] ||| 0-0 1-1 ||| 2.718
[X][Q] [X][KON] [X] ||| [X][Q] [X][KON] [Q] ||| 0-0 1-1 ||| 2.718
[X][Q] [X][KOUI] [X] ||| [X][Q] [X][KOUI] [Q] ||| 0-0 1-1 ||| 2.718
[X][Q] [X][KOUS] [X] ||| [X][Q] [X][KOUS] [Q] ||| 0-0 1-1 ||| 2.718
[X][Q] [X][MTA] [X] ||| [X][Q] [X][MTA] [Q] ||| 0-0 1-1 ||| 2.718
[X][Q] [X][NE] [X] ||| [X][Q] [X][NE] [Q] ||| 0-0 1-1 ||| 2.718
[X][Q] [X][NM] [X] ||| [X][Q] [X][NM] [Q] ||| 0-0 1-1 ||| 2.718
[X][Q] [X][NN] [X] ||| [X][Q] [X][NN] [Q] ||| 0-0 1-1 ||| 2.718
[X][Q] [X][NNE] [X] ||| [X][Q] [X][NNE] [Q] ||| 0-0 1-1 ||| 2.718
[X][Q] [X][NP] [X] ||| [X][Q] [X][NP] [Q] ||| 0-0 1-1 ||| 2.718
[X][Q] [X][PDAT] [X] ||| [X][Q] [X][PDAT] [Q] ||| 0-0 1-1 ||| 2.718
[X][Q] [X][PDS] [X] ||| [X][Q] [X][PDS] [Q] ||| 0-0 1-1 ||| 2.718
[X][Q] [X][PIAT] [X] ||| [X][Q] [X][PIAT] [Q] ||| 0-0 1-1 ||| 2.718
[X][Q] [X][PIS] [X] ||| [X][Q] [X][PIS] [Q] ||| 0-0 1-1 ||| 2.718
[X][Q] [X][PN] [X] ||| [X][Q] [X][PN] [Q] ||| 0-0 1-1 ||| 2.718
[X][Q] [X][PP] [X] ||| [X][Q] [X][PP] [Q] ||| 0-0 1-1 ||| 2.718
[X][Q] [X][PPER] [X] ||| [X][Q] [X][PPER] [Q] ||| 0-0 1-1 ||| 2.718
[X][Q] [X][PPOSAT] [X] ||| [X][Q] [X][PPOSAT] [Q] ||| 0-0 1-1 ||| 2.718
[X][Q] [X][PPOSS] [X] ||| [X][Q] [X][PPOSS] [Q] ||| 0-0 1-1 ||| 2.718
[X][Q] [X][PRELAT] [X] ||| [X][Q] [X][PRELAT] [Q] ||| 0-0 1-1 ||| 2.718
[X][Q] [X][PRELS] [X] ||| [X][Q] [X][PRELS] [Q] ||| 0-0 1-1 ||| 2.718
[X][Q] [X][PRF] [X] ||| [X][Q] [X][PRF] [Q] ||| 0-0 1-1 ||| 2.718
[X][Q] [X][PROAV] [X] ||| [X][Q] [X][PROAV] [Q] ||| 0-0 1-1 ||| 2.718
[X][Q] [X][PTKA] [X] ||| [X][Q] [X][PTKA] [Q] ||| 0-0 1-1 ||| 2.718
[X][Q] [X][PTKANT] [X] ||| [X][Q] [X][PTKANT] [Q] ||| 0-0 1-1 ||| 2.718
[X][Q] [X][PTKNEG] [X] ||| [X][Q] [X][PTKNEG] [Q] ||| 0-0 1-1 ||| 2.718
[X][Q] [X][PTKVZ] [X] ||| [X][Q] [X][PTKVZ] [Q] ||| 0-0 1-1 ||| 2.718
[X][Q] [X][PTKZU] [X] ||| [X][Q] [X][PTKZU] [Q] ||| 0-0 1-1 ||| 2.718
[X][Q] [X][PUNC,] [X] ||| [X][Q] [X][PUNC,] [Q] ||| 0-0 1-1 ||| 2.718
[X][Q] [X][PUNC.] [X] ||| [X][Q] [X][PUNC.] [Q] ||| 0-0 1-1 ||| 2.718
[X][Q] [X][PUNCPar] [X] ||| [X][Q] [X][PUNCPar] [Q] ||| 0-0 1-1 ||| 2.718
[X][Q] [X][PWAT] [X] ||| [X][Q] [X][PWAT] [Q] ||| 0-0 1-1 ||| 2.718
[X][Q] [X][PWAV] [X] ||| [X][Q] [X][PWAV] [Q] ||| 0-0 1-1 ||| 2.718
[X][Q] [X][PWS] [X] ||| [X][Q] [X][PWS] [Q] ||| 0-0 1-1 ||| 2.718
[X][Q] [X][S] [X] ||| [X][Q] [X][S] [Q] ||| 0-0 1-1 ||| 2.718
[X][Q] [X][TOP] [X] ||| [X][Q] [X][TOP] [Q] ||| 0-0 1-1 ||| 2.718
[X][Q] [X][TRUNC] [X] ||| [X][Q] [X][TRUNC] [Q] ||| 0-0 1-1 ||| 2.718
[X][Q] [X][VAFIN] [X] ||| [X][Q] [X][VAFIN] [Q] ||| 0-0 1-1 ||| 2.718
[X][Q] [X][VAINF] [X] ||| [X][Q] [X][VAINF] [Q] ||| 0-0 1-1 ||| 2.718
[X][Q] [X][VAPP] [X] ||| [X][Q] [X][VAPP] [Q] ||| 0-0 1-1 ||| 2.718
[X][Q] [X][VMFIN] [X] ||| [X][Q] [X][VMFIN] [Q] ||| 0-0 1-1 ||| 2.718
[X][Q] [X][VMINF] [X] ||| [X][Q] [X][VMINF] [Q] ||| 0-0 1-1 ||| 2.718
[X][Q] [X][VMPP] [X] ||| [X][Q] [X][VMPP] [Q] ||| 0-0 1-1 ||| 2.718
[X][Q] [X][VP] [X] ||| [X][Q] [X][VP] [Q] ||| 0-0 1-1 ||| 2.718
[X][Q] [X][VVFIN] [X] ||| [X][Q] [X][VVFIN] [Q] ||| 0-0 1-1 ||| 2.718
[X][Q] [X][VVIMP] [X] ||| [X][Q] [X][VVIMP] [Q] ||| 0-0 1-1 ||| 2.718
[X][Q] [X][VVINF] [X] ||| [X][Q] [X][VVINF] [Q] ||| 0-0 1-1 ||| 2.718
[X][Q] [X][VVIZU] [X] ||| [X][Q] [X][VVIZU] [Q] ||| 0-0 1-1 ||| 2.718
[X][Q] [X][VVPP] [X] ||| [X][Q] [X][VVPP] [Q] ||| 0-0 1-1 ||| 2.718
[X][Q] [X][VZ] [X] ||| [X][Q] [X][VZ] [Q] ||| 0-0 1-1 ||| 2.718
[X][Q] [X][XY] [X] ||| [X][Q] [X][XY] [Q] ||| 0-0 1-1 ||| 2.718
[X][Q] [X][X] [X] ||| [X][Q] [X][X] [Q] ||| 0-0 1-1 ||| 2.718
#########################
### MOSES CONFIG FILE ###
#########################

# input factors
[input-factors]
0

# mapping steps
[mapping]
0 T 0
1 T 1

# translation tables: source-factors, target-factors, number of scores, file 
[ttable-file]
6 0 0 5 
/home/s0898777/experiments/wmt10-en-de-target-syntax/model/phrase-table.1
6 0 0 1 
/home/s0898777/experiments/wmt10-en-de-target-syntax/model/glue-grammar.1

# no generation models, no generation-file section

# language models: type(srilm/irstlm), factors, order, file
[lmodel-file]
0 0 5 /home/s0898777/experiments/wmt10-en-de-target-syntax/lm/interpolated-lm.1


# limit on how many phrase translations e for each phrase f are loaded
# 0 = all elements loaded
[ttable-limit]
20


# language model weights
[weight-l]
0.5000


# translation model weights
[weight-t]
0.2
0.2
0.2
0.2
0.2
1.0

# no generation models, no weight-generation section

# word penalty
[weight-w]
-1

[cube-pruning-pop-limit]
1000

[glue-rule-type]
0

[non-terminals]
X

[search-algorithm]
3

[inputtype]
3

[max-chart-span]
20
1000