I don't really understand your questions.
Yes, non-terminal labels are saved. You can print out the labels by adding
this argument to the decoder:
-T [file]
Yes, the decoder tries every span. The best hypothesis for the entire
sentence is returned at the end. A good introduction to the CKY+
algorithm is Philipp Koehn's book or Adam Lopez's PhD thesis.
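
As a rough illustration of why trying every span stays tractable, here is a toy dynamic-programming sketch in Python. It is not the Moses chart decoder: the names best_glue_scores, span_score and glue_cost are made up for this example, and the scores are invented log-domain numbers. It only assumes a left-branching glue rule of the form S -> S X, as in the rules quoted below.

```python
def best_glue_scores(length, span_score, glue_cost=-0.1):
    """Best log-score for covering words [0, j) by gluing constituents left to right.

    span_score maps (i, j) to the best log-score of one constituent over [i, j).
    glue_cost is a made-up log-domain cost charged per glue rule application.
    """
    best = {0: 0.0}                      # the empty prefix costs nothing
    for j in range(1, length + 1):       # end position of the glued prefix
        options = [
            best[i] + span_score[(i, j)] + glue_cost
            for i in range(j)            # split: glued prefix [0, i) + constituent [i, j)
            if i in best and (i, j) in span_score
        ]
        if options:
            best[j] = max(options)
    return best


# Toy chart: hypothetical best scores for single constituents over a 3-word input.
toy_scores = {(0, 1): -1.0, (1, 2): -1.0, (0, 2): -1.5, (2, 3): -0.5}
best = best_glue_scores(3, toy_scores)
print(best[3])
```

Because the glue rule is binary, a three-way split such as [0,1][1,4][4,5] is never built in one step. It arises as [0,1] glued with [1,4], and that result glued with [4,5], so each end position only needs one pass over its possible split points.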
-------- Forwarding messages --------
From: dongxinghua0213 <[email protected]>
Date: 2012-03-13 20:27:47
To: "Philip Williams" <[email protected]>
Subject: Re:Re: [Moses-support] the glue grammar for string-to-tree model
Thank you very much. Another question: in the tree model, are the
source and target head labels for a span saved in the data structure
of the translation hypothesis, so that they can be checked on a
longer span to see whether they match a rule in the rule table?
And the glue rule:
[X][S] [X][X] [X] ||| [X][S] [X][X] [S] ||| 0-0 1-1 ||| 2.718
is used when the decoder can't find a rule for a span in the rule table.
But there are many combinations. For example, for the span [0,5] there
are the combinations:
[0,2] [2,5]
[0,3] [3,5]
[0,1] [1,4] [4,5]
and so on. Does the decoder try every one, and will the glue rule be
used frequently?
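
To make the number of combinations concrete, the following Python sketch enumerates every way of cutting the span [0,5] into contiguous sub-spans. The function name segmentations is hypothetical and exists only for this illustration; it says nothing about how the decoder itself stores or searches these combinations.

```python
from itertools import combinations


def segmentations(start, end):
    """Yield every way to cut [start, end] into contiguous sub-spans."""
    inner = range(start + 1, end)            # positions where a cut may be placed
    for k in range(len(inner) + 1):          # number of cuts to place
        for cuts in combinations(inner, k):
            bounds = (start, *cuts, end)
            yield list(zip(bounds, bounds[1:]))


all_segs = list(segmentations(0, 5))
print(len(all_segs))                         # one segmentation per subset of the 4 cut points
print([(0, 1), (1, 4), (4, 5)] in all_segs)
```

A span with n positions between its end points has n - 1 possible cut points, so the count doubles with every extra word. That is the reason a chart decoder shares sub-results between combinations instead of listing each one separately.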
At 2012-03-13 19:50:53, "Philip Williams" <[email protected]> wrote:
>Hi,
>
>Yes, the additional string-to-tree glue rules are straightforward to derive
>from the target-side parse trees. They come in two varieties. There are top-level
>rules like this:
>
><s> [X][S] </s> [X] ||| <s> [X][S] </s> [Q] ||| 1-1 ||| 1
>
>with one for each target constituent label that covers the entire span of a
>training set parse tree (the top-level label and, if there's a unary chain of
>descendants, those labels as well).
>
>The second type allows monotonic combination of a glue constituent with any
>other constituent. There's one for every target label in the training data. For
>example,
>
>[X][Q] [X][DT] [X] ||| [X][Q] [X][DT] [Q] ||| 0-0 1-1 ||| 2.718
>
>Oh, and there's one more:
>
>[X][Q] [X][X] [X] ||| [X][Q] [X][X] [Q] ||| 0-0 1-1 ||| 2.718
>
>which is used for unknown words. (To handle unknown words, the decoder can
>apply a generated rule that produces an X constituent on the target side.)
>
>In Moses, it's the extract-rules program that produces the glue grammars (or
>extract-ghkm if you use GHKM).
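
The derivation described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the code of extract-rules or extract-ghkm: the function name glue_rules and its parameters are hypothetical, the two label sets are assumed to have been collected from the target-side parse trees beforehand, and the rule strings simply copy the layout of the examples in this thread.

```python
def glue_rules(top_labels, all_labels, glue="Q"):
    """Build glue rule strings from target-side label sets (illustrative only).

    top_labels: labels seen covering a whole training parse tree.
    all_labels: every target label seen in the training data.
    glue:       name of the glue non-terminal (Q in the examples above).
    """
    rules = [
        f"<s> [X] ||| <s> [{glue}] ||| ||| 1",
        f"[X][{glue}] </s> [X] ||| [X][{glue}] </s> [{glue}] ||| 0-0 ||| 1",
    ]
    # First variety: one top-level rule per label that can span a whole sentence.
    for lab in sorted(top_labels):
        rules.append(
            f"<s> [X][{lab}] </s> [X] ||| <s> [X][{lab}] </s> [{glue}] ||| 1-1 ||| 1"
        )
    # Second variety: monotone glue of any labelled constituent, plus X for unknown words.
    for lab in sorted(set(all_labels) | {"X"}):
        rules.append(
            f"[X][{glue}] [X][{lab}] [X] ||| [X][{glue}] [X][{lab}] [{glue}] ||| 0-0 1-1 ||| 2.718"
        )
    return rules


rules = glue_rules(top_labels={"S", "TOP"}, all_labels={"S", "TOP", "DT", "NP"})
for rule in rules:
    print(rule)
```

With this construction the grammar grows linearly with the label set: one top-level rule per sentence-spanning label and one monotone rule per target label, which matches the long list in the original question.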
>
>Phil
>
>
>On 12 Mar 2012, at 11:31, dongxinghua0213 wrote:
>
>> Hi,
>>
>> For the hierarchical phrase-based model and the tree-to-string model, there
>> are only a few glue rules, such as:
>>
>> <s> [X] ||| <s> [S] ||| ||| 1
>> [X][S] </s> [X] ||| [X][S] </s> [S] ||| 0-0 ||| 1
>> [X][S] [X][X] [X] ||| [X][S] [X][X] [S] ||| 0-0 1-1 ||| 2.718
>>
>> These are well understood, but for the string-to-tree model there are many
>> glue rules, like this:
>>
>> <s> [X] ||| <s> [Q] ||| ||| 1
>> [X][Q] </s> [X] ||| [X][Q] </s> [Q] ||| 0-0 ||| 1
>> <s> [X][ADJP] </s> [X] ||| <s> [X][ADJP] </s> [Q] ||| 1-1 ||| 1
>> <s> [X][ADVP] </s> [X] ||| <s> [X][ADVP] </s> [Q] ||| 1-1 ||| 1
>> <s> [X][FRAG] </s> [X] ||| <s> [X][FRAG] </s> [Q] ||| 1-1 ||| 1
>> <s> [X][INTJ] </s> [X] ||| <s> [X][INTJ] </s> [Q] ||| 1-1 ||| 1
>> <s> [X][JJ] </s> [X] ||| <s> [X][JJ] </s> [Q] ||| 1-1 ||| 1
>> <s> [X][JJR] </s> [X] ||| <s> [X][JJR] </s> [Q] ||| 1-1 ||| 1
>> <s> [X][NN] </s> [X] ||| <s> [X][NN] </s> [Q] ||| 1-1 ||| 1
>> <s> [X][NNP] </s> [X] ||| <s> [X][NNP] </s> [Q] ||| 1-1 ||| 1
>> <s> [X][NNPS] </s> [X] ||| <s> [X][NNPS] </s> [Q] ||| 1-1 ||| 1
>> <s> [X][NNS] </s> [X] ||| <s> [X][NNS] </s> [Q] ||| 1-1 ||| 1
>> <s> [X][NP] </s> [X] ||| <s> [X][NP] </s> [Q] ||| 1-1 ||| 1
>> <s> [X][NP-A] </s> [X] ||| <s> [X][NP-A] </s> [Q] ||| 1-1 ||| 1
>> <s> [X][NPB] </s> [X] ||| <s> [X][NPB] </s> [Q] ||| 1-1 ||| 1
>> <s> [X][PP] </s> [X] ||| <s> [X][PP] </s> [Q] ||| 1-1 ||| 1
>> <s> [X][PRN] </s> [X] ||| <s> [X][PRN] </s> [Q] ||| 1-1 ||| 1
>> <s> [X][RBR] </s> [X] ||| <s> [X][RBR] </s> [Q] ||| 1-1 ||| 1
>> <s> [X][S] </s> [X] ||| <s> [X][S] </s> [Q] ||| 1-1 ||| 1
>> <s> [X][S-A] </s> [X] ||| <s> [X][S-A] </s> [Q] ||| 1-1 ||| 1
>> <s> [X][SBAR] </s> [X] ||| <s> [X][SBAR] </s> [Q] ||| 1-1 ||| 1
>> <s> [X][SBARQ] </s> [X] ||| <s> [X][SBARQ] </s> [Q] ||| 1-1 ||| 1
>> <s> [X][SG] </s> [X] ||| <s> [X][SG] </s> [Q] ||| 1-1 ||| 1
>> <s> [X][SINV] </s> [X] ||| <s> [X][SINV] </s> [Q] ||| 1-1 ||| 1
>> <s> [X][SQ] </s> [X] ||| <s> [X][SQ] </s> [Q] ||| 1-1 ||| 1
>> <s> [X][TOP] </s> [X] ||| <s> [X][TOP] </s> [Q] ||| 1-1 ||| 1
>> <s> [X][UCP] </s> [X] ||| <s> [X][UCP] </s> [Q] ||| 1-1 ||| 1
>> <s> [X][UH] </s> [X] ||| <s> [X][UH] </s> [Q] ||| 1-1 ||| 1
>> <s> [X][VB] </s> [X] ||| <s> [X][VB] </s> [Q] ||| 1-1 ||| 1
>> <s> [X][VBG] </s> [X] ||| <s> [X][VBG] </s> [Q] ||| 1-1 ||| 1
>> <s> [X][VP] </s> [X] ||| <s> [X][VP] </s> [Q] ||| 1-1 ||| 1
>> <s> [X][WHADJP] </s> [X] ||| <s> [X][WHADJP] </s> [Q] ||| 1-1 ||| 1
>> <s> [X][WHADVP] </s> [X] ||| <s> [X][WHADVP] </s> [Q] ||| 1-1 ||| 1
>> <s> [X][X] </s> [X] ||| <s> [X][X] </s> [Q] ||| 1-1 ||| 1
>> [X][Q] [X][$] [X] ||| [X][Q] [X][$] [Q] ||| 0-0 1-1 ||| 2.718
>> [X][Q] [X][''] [X] ||| [X][Q] [X][''] [Q] ||| 0-0 1-1 ||| 2.718
>> [X][Q] [X][,] [X] ||| [X][Q] [X][,] [Q] ||| 0-0 1-1 ||| 2.718
>> [X][Q] [X][.] [X] ||| [X][Q] [X][.] [Q] ||| 0-0 1-1 ||| 2.718
>> [X][Q] [X][:] [X] ||| [X][Q] [X][:] [Q] ||| 0-0 1-1 ||| 2.718
>> [X][Q] [X][ADJP] [X] ||| [X][Q] [X][ADJP] [Q] ||| 0-0 1-1 ||| 2.718
>> [X][Q] [X][ADJP-A] [X] ||| [X][Q] [X][ADJP-A] [Q] ||| 0-0 1-1 ||| 2.718
>> [X][Q] [X][ADVP] [X] ||| [X][Q] [X][ADVP] [Q] ||| 0-0 1-1 ||| 2.718
>> [X][Q] [X][ADVP-A] [X] ||| [X][Q] [X][ADVP-A] [Q] ||| 0-0 1-1 ||| 2.718
>> [X][Q] [X][CC] [X] ||| [X][Q] [X][CC] [Q] ||| 0-0 1-1 ||| 2.718
>> [X][Q] [X][CD] [X] ||| [X][Q] [X][CD] [Q] ||| 0-0 1-1 ||| 2.718
>> [X][Q] [X][CONJP] [X] ||| [X][Q] [X][CONJP] [Q] ||| 0-0 1-1 ||| 2.718
>> [X][Q] [X][DT] [X] ||| [X][Q] [X][DT] [Q] ||| 0-0 1-1 ||| 2.718
>> [X][Q] [X][EX] [X] ||| [X][Q] [X][EX] [Q] ||| 0-0 1-1 ||| 2.718
>> [X][Q] [X][FRAG] [X] ||| [X][Q] [X][FRAG] [Q] ||| 0-0 1-1 ||| 2.718
>> [X][Q] [X][FW] [X] ||| [X][Q] [X][FW] [Q] ||| 0-0 1-1 ||| 2.718
>> [X][Q] [X][IN] [X] ||| [X][Q] [X][IN] [Q] ||| 0-0 1-1 ||| 2.718
>> [X][Q] [X][INTJ] [X] ||| [X][Q] [X][INTJ] [Q] ||| 0-0 1-1 ||| 2.718
>> [X][Q] [X][JJ] [X] ||| [X][Q] [X][JJ] [Q] ||| 0-0 1-1 ||| 2.718
>> [X][Q] [X][JJR] [X] ||| [X][Q] [X][JJR] [Q] ||| 0-0 1-1 ||| 2.718
>> [X][Q] [X][JJS] [X] ||| [X][Q] [X][JJS] [Q] ||| 0-0 1-1 ||| 2.718
>> [X][Q] [X][LRB] [X] ||| [X][Q] [X][LRB] [Q] ||| 0-0 1-1 ||| 2.718
>> [X][Q] [X][LS] [X] ||| [X][Q] [X][LS] [Q] ||| 0-0 1-1 ||| 2.718
>> [X][Q] [X][MD] [X] ||| [X][Q] [X][MD] [Q] ||| 0-0 1-1 ||| 2.718
>> [X][Q] [X][NAC] [X] ||| [X][Q] [X][NAC] [Q] ||| 0-0 1-1 ||| 2.718
>> [X][Q] [X][NN] [X] ||| [X][Q] [X][NN] [Q] ||| 0-0 1-1 ||| 2.718
>> [X][Q] [X][NNP] [X] ||| [X][Q] [X][NNP] [Q] ||| 0-0 1-1 ||| 2.718
>> [X][Q] [X][NNPS] [X] ||| [X][Q] [X][NNPS] [Q] ||| 0-0 1-1 ||| 2.718
>> [X][Q] [X][NNS] [X] ||| [X][Q] [X][NNS] [Q] ||| 0-0 1-1 ||| 2.718
>> [X][Q] [X][NP] [X] ||| [X][Q] [X][NP] [Q] ||| 0-0 1-1 ||| 2.718
>> [X][Q] [X][NP-A] [X] ||| [X][Q] [X][NP-A] [Q] ||| 0-0 1-1 ||| 2.718
>> [X][Q] [X][NPB] [X] ||| [X][Q] [X][NPB] [Q] ||| 0-0 1-1 ||| 2.718
>> [X][Q] [X][NX] [X] ||| [X][Q] [X][NX] [Q] ||| 0-0 1-1 ||| 2.718
>> [X][Q] [X][PDT] [X] ||| [X][Q] [X][PDT] [Q] ||| 0-0 1-1 ||| 2.718
>> [X][Q] [X][POS] [X] ||| [X][Q] [X][POS] [Q] ||| 0-0 1-1 ||| 2.718
>> [X][Q] [X][PP] [X] ||| [X][Q] [X][PP] [Q] ||| 0-0 1-1 ||| 2.718
>> [X][Q] [X][PP-A] [X] ||| [X][Q] [X][PP-A] [Q] ||| 0-0 1-1 ||| 2.718
>> [X][Q] [X][PRN] [X] ||| [X][Q] [X][PRN] [Q] ||| 0-0 1-1 ||| 2.718
>> [X][Q] [X][PRP] [X] ||| [X][Q] [X][PRP] [Q] ||| 0-0 1-1 ||| 2.718
>> [X][Q] [X][PRP$] [X] ||| [X][Q] [X][PRP$] [Q] ||| 0-0 1-1 ||| 2.718
>> [X][Q] [X][PRT] [X] ||| [X][Q] [X][PRT] [Q] ||| 0-0 1-1 ||| 2.718
>> [X][Q] [X][PUNC''] [X] ||| [X][Q] [X][PUNC''] [Q] ||| 0-0 1-1 ||| 2.718
>> [X][Q] [X][PUNC,] [X] ||| [X][Q] [X][PUNC,] [Q] ||| 0-0 1-1 ||| 2.718
>> [X][Q] [X][PUNC.] [X] ||| [X][Q] [X][PUNC.] [Q] ||| 0-0 1-1 ||| 2.718
>> [X][Q] [X][PUNC:] [X] ||| [X][Q] [X][PUNC:] [Q] ||| 0-0 1-1 ||| 2.718
>> [X][Q] [X][PUNC``] [X] ||| [X][Q] [X][PUNC``] [Q] ||| 0-0 1-1 ||| 2.718
>> [X][Q] [X][QP] [X] ||| [X][Q] [X][QP] [Q] ||| 0-0 1-1 ||| 2.718
>> [X][Q] [X][RB] [X] ||| [X][Q] [X][RB] [Q] ||| 0-0 1-1 ||| 2.718
>> [X][Q] [X][RBR] [X] ||| [X][Q] [X][RBR] [Q] ||| 0-0 1-1 ||| 2.718
>> [X][Q] [X][RBS] [X] ||| [X][Q] [X][RBS] [Q] ||| 0-0 1-1 ||| 2.718
>> [X][Q] [X][RP] [X] ||| [X][Q] [X][RP] [Q] ||| 0-0 1-1 ||| 2.718
>> [X][Q] [X][RRB] [X] ||| [X][Q] [X][RRB] [Q] ||| 0-0 1-1 ||| 2.718
>> [X][Q] [X][RRC] [X] ||| [X][Q] [X][RRC] [Q] ||| 0-0 1-1 ||| 2.718
>> [X][Q] [X][S] [X] ||| [X][Q] [X][S] [Q] ||| 0-0 1-1 ||| 2.718
>> [X][Q] [X][S-A] [X] ||| [X][Q] [X][S-A] [Q] ||| 0-0 1-1 ||| 2.718
>> [X][Q] [X][SBAR] [X] ||| [X][Q] [X][SBAR] [Q] ||| 0-0 1-1 ||| 2.718
>> [X][Q] [X][SBAR-A] [X] ||| [X][Q] [X][SBAR-A] [Q] ||| 0-0 1-1 ||| 2.718
>> [X][Q] [X][SBARQ] [X] ||| [X][Q] [X][SBARQ] [Q] ||| 0-0 1-1 ||| 2.718
>> [X][Q] [X][SG] [X] ||| [X][Q] [X][SG] [Q] ||| 0-0 1-1 ||| 2.718
>> [X][Q] [X][SG-A] [X] ||| [X][Q] [X][SG-A] [Q] ||| 0-0 1-1 ||| 2.718
>> [X][Q] [X][SINV] [X] ||| [X][Q] [X][SINV] [Q] ||| 0-0 1-1 ||| 2.718
>> [X][Q] [X][SQ] [X] ||| [X][Q] [X][SQ] [Q] ||| 0-0 1-1 ||| 2.718
>> [X][Q] [X][TO] [X] ||| [X][Q] [X][TO] [Q] ||| 0-0 1-1 ||| 2.718
>> [X][Q] [X][TOP] [X] ||| [X][Q] [X][TOP] [Q] ||| 0-0 1-1 ||| 2.718
>> [X][Q] [X][UCP] [X] ||| [X][Q] [X][UCP] [Q] ||| 0-0 1-1 ||| 2.718
>> [X][Q] [X][UH] [X] ||| [X][Q] [X][UH] [Q] ||| 0-0 1-1 ||| 2.718
>> [X][Q] [X][VB] [X] ||| [X][Q] [X][VB] [Q] ||| 0-0 1-1 ||| 2.718
>> [X][Q] [X][VBD] [X] ||| [X][Q] [X][VBD] [Q] ||| 0-0 1-1 ||| 2.718
>> [X][Q] [X][VBG] [X] ||| [X][Q] [X][VBG] [Q] ||| 0-0 1-1 ||| 2.718
>> [X][Q] [X][VBN] [X] ||| [X][Q] [X][VBN] [Q] ||| 0-0 1-1 ||| 2.718
>> [X][Q] [X][VBP] [X] ||| [X][Q] [X][VBP] [Q] ||| 0-0 1-1 ||| 2.718
>> [X][Q] [X][VBZ] [X] ||| [X][Q] [X][VBZ] [Q] ||| 0-0 1-1 ||| 2.718
>> [X][Q] [X][VP] [X] ||| [X][Q] [X][VP] [Q] ||| 0-0 1-1 ||| 2.718
>> [X][Q] [X][VP-A] [X] ||| [X][Q] [X][VP-A] [Q] ||| 0-0 1-1 ||| 2.718
>> [X][Q] [X][WDT] [X] ||| [X][Q] [X][WDT] [Q] ||| 0-0 1-1 ||| 2.718
>> [X][Q] [X][WHADJP] [X] ||| [X][Q] [X][WHADJP] [Q] ||| 0-0 1-1 ||| 2.718
>> [X][Q] [X][WHADVP] [X] ||| [X][Q] [X][WHADVP] [Q] ||| 0-0 1-1 ||| 2.718
>> [X][Q] [X][WHNP] [X] ||| [X][Q] [X][WHNP] [Q] ||| 0-0 1-1 ||| 2.718
>> [X][Q] [X][WHNP-A] [X] ||| [X][Q] [X][WHNP-A] [Q] ||| 0-0 1-1 ||| 2.718
>> [X][Q] [X][WHPP] [X] ||| [X][Q] [X][WHPP] [Q] ||| 0-0 1-1 ||| 2.718
>> [X][Q] [X][WP] [X] ||| [X][Q] [X][WP] [Q] ||| 0-0 1-1 ||| 2.718
>> [X][Q] [X][WP$] [X] ||| [X][Q] [X][WP$] [Q] ||| 0-0 1-1 ||| 2.718
>> [X][Q] [X][WRB] [X] ||| [X][Q] [X][WRB] [Q] ||| 0-0 1-1 ||| 2.718
>> [X][Q] [X][X] [X] ||| [X][Q] [X][X] [Q] ||| 0-0 1-1 ||| 2.718
>> [X][Q] [X][``] [X] ||| [X][Q] [X][``] [Q] ||| 0-0 1-1 ||| 2.718
>> [X][Q] [X][X] [X] ||| [X][Q] [X][X] [Q] ||| 0-0 1-1 ||| 2.718
>>
>> Where do they come from? Can they be extracted from the annotated target
>> side of the parallel text?
>>
>>
>> _______________________________________________
>> Moses-support mailing list
>> [email protected]
>> http://mailman.mit.edu/mailman/listinfo/moses-support
>