-------- Forwarding messages --------
From: dongxinghua0213 <[email protected]>
Date: 2012-03-13 20:27:47
To: "Philip Williams" <[email protected]>
Subject: Re:Re: [Moses-support] the glue grammar for string-to-tree model

Thank you very much. I have some further questions.

In the tree model, are the source and target head labels of a span stored in
the data structure of the translation hypothesis, and do we have to check them
for a longer span to see whether they match a rule in the rule table?

Also, consider the glue rule:

[X][S] [X][X] [X] ||| [X][S] [X][X] [S] ||| 0-0 1-1 ||| 2.718

We use it when the decoder can't find a rule for a span in the rule table. But
there are many possible combinations. For example, the span [0,5] can be
split as:

[0,2] [2,5]
[0,3] [3,5]
[0,1] [1,4] [4,5]

and so on. Does the decoder try every one of them, and will the glue rule be
used frequently?

At 2012-03-13 19:50:53,"Philip Williams" <[email protected]> wrote:
>Hi,
>
>yes, the additional string-to-tree glue rules are straightforward to derive
>from the target-side parse trees. They come in two varieties: there are
>top-level rules like this:
>
><s> [X][S] </s> [X] ||| <s> [X][S] </s> [Q] ||| 1-1 ||| 1
>
>There is one for each target constituent label that covers the entire span
>of a training set parse tree (the top-level label and, if there's a unary
>chain of descendants, those labels as well).
>
>The second type allows monotonic combination of a glue constituent with any
>other constituent. There's one for every target label in the training data.
>For example,
>
>[X][Q] [X][DT] [X] ||| [X][Q] [X][DT] [Q] ||| 0-0 1-1 ||| 2.718
>
>Oh, and there's one more:
>
>[X][Q] [X][X] [X] ||| [X][Q] [X][X] [Q] ||| 0-0 1-1 ||| 2.718
>
>Which is used for unknown words. (To handle unknown words, the decoder can
>apply a generated rule that produces an X constituent on the target side.)
>
>In Moses, it's the extract-rules program that produces the glue grammars (or
>extract-ghkm if you use GHKM).
>
>Phil
>
>
>On 12 Mar 2012, at 11:31, dongxinghua0213 wrote:
>
>> Hi,
>>
>> for the hierarchical phrase-based model and the tree-to-string model,
>> there are few glue rules, such as:
>>
>> <s> [X] ||| <s> [S] ||| ||| 1
>> [X][S] </s> [X] ||| [X][S] </s> [S] ||| 0-0 ||| 1
>> [X][S] [X][X] [X] ||| [X][S] [X][X] [S] ||| 0-0 1-1 ||| 2.718
>>
>>
>> These are well understood, but for the string-to-tree model there are
>> many glue rules, like these:
>>
>> <s> [X] ||| <s> [Q] ||| ||| 1
>> [X][Q] </s> [X] ||| [X][Q] </s> [Q] ||| 0-0 ||| 1
>> <s> [X][ADJP] </s> [X] ||| <s> [X][ADJP] </s> [Q] ||| 1-1 ||| 1
>> <s> [X][ADVP] </s> [X] ||| <s> [X][ADVP] </s> [Q] ||| 1-1 ||| 1
>> <s> [X][FRAG] </s> [X] ||| <s> [X][FRAG] </s> [Q] ||| 1-1 ||| 1
>> <s> [X][INTJ] </s> [X] ||| <s> [X][INTJ] </s> [Q] ||| 1-1 ||| 1
>> <s> [X][JJ] </s> [X] ||| <s> [X][JJ] </s> [Q] ||| 1-1 ||| 1
>> <s> [X][JJR] </s> [X] ||| <s> [X][JJR] </s> [Q] ||| 1-1 ||| 1
>> <s> [X][NN] </s> [X] ||| <s> [X][NN] </s> [Q] ||| 1-1 ||| 1
>> <s> [X][NNP] </s> [X] ||| <s> [X][NNP] </s> [Q] ||| 1-1 ||| 1
>> <s> [X][NNPS] </s> [X] ||| <s> [X][NNPS] </s> [Q] ||| 1-1 ||| 1
>> <s> [X][NNS] </s> [X] ||| <s> [X][NNS] </s> [Q] ||| 1-1 ||| 1
>> <s> [X][NP] </s> [X] ||| <s> [X][NP] </s> [Q] ||| 1-1 ||| 1
>> <s> [X][NP-A] </s> [X] ||| <s> [X][NP-A] </s> [Q] ||| 1-1 ||| 1
>> <s> [X][NPB] </s> [X] ||| <s> [X][NPB] </s> [Q] ||| 1-1 ||| 1
>> <s> [X][PP] </s> [X] ||| <s> [X][PP] </s> [Q] ||| 1-1 ||| 1
>> <s> [X][PRN] </s> [X] ||| <s> [X][PRN] </s> [Q] ||| 1-1 ||| 1
>> <s> [X][RBR] </s> [X] ||| <s> [X][RBR] </s> [Q] ||| 1-1 ||| 1
>> <s> [X][S] </s> [X] ||| <s> [X][S] </s> [Q] ||| 1-1 ||| 1
>> <s> [X][S-A] </s> [X] ||| <s> [X][S-A] </s> [Q] ||| 1-1 ||| 1
>> <s> [X][SBAR] </s> [X] ||| <s> [X][SBAR] </s> [Q] ||| 1-1 ||| 1
>> <s> [X][SBARQ] </s> [X] ||| <s> [X][SBARQ] </s> [Q] ||| 1-1 ||| 1
>> <s> [X][SG] </s> [X] ||| <s> [X][SG] </s> [Q] ||| 1-1 ||| 1
>> <s> [X][SINV] </s> [X] ||| <s> [X][SINV] </s> [Q] ||| 1-1 ||| 1
>> <s> [X][SQ] </s> [X] ||| <s> [X][SQ] </s> [Q] ||| 1-1 ||| 1
>> <s> [X][TOP] </s> [X] ||| <s> [X][TOP] </s> [Q] ||| 1-1 ||| 1
>> <s> [X][UCP] </s> [X] ||| <s> [X][UCP] </s> [Q] ||| 1-1 ||| 1
>> <s> [X][UH] </s> [X] ||| <s> [X][UH] </s> [Q] ||| 1-1 ||| 1
>> <s> [X][VB] </s> [X] ||| <s> [X][VB] </s> [Q] ||| 1-1 ||| 1
>> <s> [X][VBG] </s> [X] ||| <s> [X][VBG] </s> [Q] ||| 1-1 ||| 1
>> <s> [X][VP] </s> [X] ||| <s> [X][VP] </s> [Q] ||| 1-1 ||| 1
>> <s> [X][WHADJP] </s> [X] ||| <s> [X][WHADJP] </s> [Q] ||| 1-1 ||| 1
>> <s> [X][WHADVP] </s> [X] ||| <s> [X][WHADVP] </s> [Q] ||| 1-1 ||| 1
>> <s> [X][X] </s> [X] ||| <s> [X][X] </s> [Q] ||| 1-1 ||| 1
>> [X][Q] [X][$] [X] ||| [X][Q] [X][$] [Q] ||| 0-0 1-1 ||| 2.718
>> [X][Q] [X][''] [X] ||| [X][Q] [X][''] [Q] ||| 0-0 1-1 ||| 2.718
>> [X][Q] [X][,] [X] ||| [X][Q] [X][,] [Q] ||| 0-0 1-1 ||| 2.718
>> [X][Q] [X][.] [X] ||| [X][Q] [X][.] [Q] ||| 0-0 1-1 ||| 2.718
>> [X][Q] [X][:] [X] ||| [X][Q] [X][:] [Q] ||| 0-0 1-1 ||| 2.718
>> [X][Q] [X][ADJP] [X] ||| [X][Q] [X][ADJP] [Q] ||| 0-0 1-1 ||| 2.718
>> [X][Q] [X][ADJP-A] [X] ||| [X][Q] [X][ADJP-A] [Q] ||| 0-0 1-1 ||| 2.718
>> [X][Q] [X][ADVP] [X] ||| [X][Q] [X][ADVP] [Q] ||| 0-0 1-1 ||| 2.718
>> [X][Q] [X][ADVP-A] [X] ||| [X][Q] [X][ADVP-A] [Q] ||| 0-0 1-1 ||| 2.718
>> [X][Q] [X][CC] [X] ||| [X][Q] [X][CC] [Q] ||| 0-0 1-1 ||| 2.718
>> [X][Q] [X][CD] [X] ||| [X][Q] [X][CD] [Q] ||| 0-0 1-1 ||| 2.718
>> [X][Q] [X][CONJP] [X] ||| [X][Q] [X][CONJP] [Q] ||| 0-0 1-1 ||| 2.718
>> [X][Q] [X][DT] [X] ||| [X][Q] [X][DT] [Q] ||| 0-0 1-1 ||| 2.718
>> [X][Q] [X][EX] [X] ||| [X][Q] [X][EX] [Q] ||| 0-0 1-1 ||| 2.718
>> [X][Q] [X][FRAG] [X] ||| [X][Q] [X][FRAG] [Q] ||| 0-0 1-1 ||| 2.718
>> [X][Q] [X][FW] [X] ||| [X][Q] [X][FW] [Q] ||| 0-0 1-1 ||| 2.718
>> [X][Q] [X][IN] [X] ||| [X][Q] [X][IN] [Q] ||| 0-0 1-1 ||| 2.718
>> [X][Q] [X][INTJ] [X] ||| [X][Q] [X][INTJ] [Q] ||| 0-0 1-1 ||| 2.718
>> [X][Q] [X][JJ] [X] ||| [X][Q] [X][JJ] [Q] ||| 0-0 1-1 ||| 2.718
>> [X][Q] [X][JJR] [X] ||| [X][Q] [X][JJR] [Q] ||| 0-0 1-1 ||| 2.718
>> [X][Q] [X][JJS] [X] ||| [X][Q] [X][JJS] [Q] ||| 0-0 1-1 ||| 2.718
>> [X][Q] [X][LRB] [X] ||| [X][Q] [X][LRB] [Q] ||| 0-0 1-1 ||| 2.718
>> [X][Q] [X][LS] [X] ||| [X][Q] [X][LS] [Q] ||| 0-0 1-1 ||| 2.718
>> [X][Q] [X][MD] [X] ||| [X][Q] [X][MD] [Q] ||| 0-0 1-1 ||| 2.718
>> [X][Q] [X][NAC] [X] ||| [X][Q] [X][NAC] [Q] ||| 0-0 1-1 ||| 2.718
>> [X][Q] [X][NN] [X] ||| [X][Q] [X][NN] [Q] ||| 0-0 1-1 ||| 2.718
>> [X][Q] [X][NNP] [X] ||| [X][Q] [X][NNP] [Q] ||| 0-0 1-1 ||| 2.718
>> [X][Q] [X][NNPS] [X] ||| [X][Q] [X][NNPS] [Q] ||| 0-0 1-1 ||| 2.718
>> [X][Q] [X][NNS] [X] ||| [X][Q] [X][NNS] [Q] ||| 0-0 1-1 ||| 2.718
>> [X][Q] [X][NP] [X] ||| [X][Q] [X][NP] [Q] ||| 0-0 1-1 ||| 2.718
>> [X][Q] [X][NP-A] [X] ||| [X][Q] [X][NP-A] [Q] ||| 0-0 1-1 ||| 2.718
>> [X][Q] [X][NPB] [X] ||| [X][Q] [X][NPB] [Q] ||| 0-0 1-1 ||| 2.718
>> [X][Q] [X][NX] [X] ||| [X][Q] [X][NX] [Q] ||| 0-0 1-1 ||| 2.718
>> [X][Q] [X][PDT] [X] ||| [X][Q] [X][PDT] [Q] ||| 0-0 1-1 ||| 2.718
>> [X][Q] [X][POS] [X] ||| [X][Q] [X][POS] [Q] ||| 0-0 1-1 ||| 2.718
>> [X][Q] [X][PP] [X] ||| [X][Q] [X][PP] [Q] ||| 0-0 1-1 ||| 2.718
>> [X][Q] [X][PP-A] [X] ||| [X][Q] [X][PP-A] [Q] ||| 0-0 1-1 ||| 2.718
>> [X][Q] [X][PRN] [X] ||| [X][Q] [X][PRN] [Q] ||| 0-0 1-1 ||| 2.718
>> [X][Q] [X][PRP] [X] ||| [X][Q] [X][PRP] [Q] ||| 0-0 1-1 ||| 2.718
>> [X][Q] [X][PRP$] [X] ||| [X][Q] [X][PRP$] [Q] ||| 0-0 1-1 ||| 2.718
>> [X][Q] [X][PRT] [X] ||| [X][Q] [X][PRT] [Q] ||| 0-0 1-1 ||| 2.718
>> [X][Q] [X][PUNC''] [X] ||| [X][Q] [X][PUNC''] [Q] ||| 0-0 1-1 ||| 2.718
>> [X][Q] [X][PUNC,] [X] ||| [X][Q] [X][PUNC,] [Q] ||| 0-0 1-1 ||| 2.718
>> [X][Q] [X][PUNC.] [X] ||| [X][Q] [X][PUNC.] [Q] ||| 0-0 1-1 ||| 2.718
>> [X][Q] [X][PUNC:] [X] ||| [X][Q] [X][PUNC:] [Q] ||| 0-0 1-1 ||| 2.718
>> [X][Q] [X][PUNC``] [X] ||| [X][Q] [X][PUNC``] [Q] ||| 0-0 1-1 ||| 2.718
>> [X][Q] [X][QP] [X] ||| [X][Q] [X][QP] [Q] ||| 0-0 1-1 ||| 2.718
>> [X][Q] [X][RB] [X] ||| [X][Q] [X][RB] [Q] ||| 0-0 1-1 ||| 2.718
>> [X][Q] [X][RBR] [X] ||| [X][Q] [X][RBR] [Q] ||| 0-0 1-1 ||| 2.718
>> [X][Q] [X][RBS] [X] ||| [X][Q] [X][RBS] [Q] ||| 0-0 1-1 ||| 2.718
>> [X][Q] [X][RP] [X] ||| [X][Q] [X][RP] [Q] ||| 0-0 1-1 ||| 2.718
>> [X][Q] [X][RRB] [X] ||| [X][Q] [X][RRB] [Q] ||| 0-0 1-1 ||| 2.718
>> [X][Q] [X][RRC] [X] ||| [X][Q] [X][RRC] [Q] ||| 0-0 1-1 ||| 2.718
>> [X][Q] [X][S] [X] ||| [X][Q] [X][S] [Q] ||| 0-0 1-1 ||| 2.718
>> [X][Q] [X][S-A] [X] ||| [X][Q] [X][S-A] [Q] ||| 0-0 1-1 ||| 2.718
>> [X][Q] [X][SBAR] [X] ||| [X][Q] [X][SBAR] [Q] ||| 0-0 1-1 ||| 2.718
>> [X][Q] [X][SBAR-A] [X] ||| [X][Q] [X][SBAR-A] [Q] ||| 0-0 1-1 ||| 2.718
>> [X][Q] [X][SBARQ] [X] ||| [X][Q] [X][SBARQ] [Q] ||| 0-0 1-1 ||| 2.718
>> [X][Q] [X][SG] [X] ||| [X][Q] [X][SG] [Q] ||| 0-0 1-1 ||| 2.718
>> [X][Q] [X][SG-A] [X] ||| [X][Q] [X][SG-A] [Q] ||| 0-0 1-1 ||| 2.718
>> [X][Q] [X][SINV] [X] ||| [X][Q] [X][SINV] [Q] ||| 0-0 1-1 ||| 2.718
>> [X][Q] [X][SQ] [X] ||| [X][Q] [X][SQ] [Q] ||| 0-0 1-1 ||| 2.718
>> [X][Q] [X][TO] [X] ||| [X][Q] [X][TO] [Q] ||| 0-0 1-1 ||| 2.718
>> [X][Q] [X][TOP] [X] ||| [X][Q] [X][TOP] [Q] ||| 0-0 1-1 ||| 2.718
>> [X][Q] [X][UCP] [X] ||| [X][Q] [X][UCP] [Q] ||| 0-0 1-1 ||| 2.718
>> [X][Q] [X][UH] [X] ||| [X][Q] [X][UH] [Q] ||| 0-0 1-1 ||| 2.718
>> [X][Q] [X][VB] [X] ||| [X][Q] [X][VB] [Q] ||| 0-0 1-1 ||| 2.718
>> [X][Q] [X][VBD] [X] ||| [X][Q] [X][VBD] [Q] ||| 0-0 1-1 ||| 2.718
>> [X][Q] [X][VBG] [X] ||| [X][Q] [X][VBG] [Q] ||| 0-0 1-1 ||| 2.718
>> [X][Q] [X][VBN] [X] ||| [X][Q] [X][VBN] [Q] ||| 0-0 1-1 ||| 2.718
>> [X][Q] [X][VBP] [X] ||| [X][Q] [X][VBP] [Q] ||| 0-0 1-1 ||| 2.718
>> [X][Q] [X][VBZ] [X] ||| [X][Q] [X][VBZ] [Q] ||| 0-0 1-1 ||| 2.718
>> [X][Q] [X][VP] [X] ||| [X][Q] [X][VP] [Q] ||| 0-0 1-1 ||| 2.718
>> [X][Q] [X][VP-A] [X] ||| [X][Q] [X][VP-A] [Q] ||| 0-0 1-1 ||| 2.718
>> [X][Q] [X][WDT] [X] ||| [X][Q] [X][WDT] [Q] ||| 0-0 1-1 ||| 2.718
>> [X][Q] [X][WHADJP] [X] ||| [X][Q] [X][WHADJP] [Q] ||| 0-0 1-1 ||| 2.718
>> [X][Q] [X][WHADVP] [X] ||| [X][Q] [X][WHADVP] [Q] ||| 0-0 1-1 ||| 2.718
>> [X][Q] [X][WHNP] [X] ||| [X][Q] [X][WHNP] [Q] ||| 0-0 1-1 ||| 2.718
>> [X][Q] [X][WHNP-A] [X] ||| [X][Q] [X][WHNP-A] [Q] ||| 0-0 1-1 ||| 2.718
>> [X][Q] [X][WHPP] [X] ||| [X][Q] [X][WHPP] [Q] ||| 0-0 1-1 ||| 2.718
>> [X][Q] [X][WP] [X] ||| [X][Q] [X][WP] [Q] ||| 0-0 1-1 ||| 2.718
>> [X][Q] [X][WP$] [X] ||| [X][Q] [X][WP$] [Q] ||| 0-0 1-1 ||| 2.718
>> [X][Q] [X][WRB] [X] ||| [X][Q] [X][WRB] [Q] ||| 0-0 1-1 ||| 2.718
>> [X][Q] [X][X] [X] ||| [X][Q] [X][X] [Q] ||| 0-0 1-1 ||| 2.718
>> [X][Q] [X][``] [X] ||| [X][Q] [X][``] [Q] ||| 0-0 1-1 ||| 2.718
>> [X][Q] [X][X] [X] ||| [X][Q] [X][X] [Q] ||| 0-0 1-1 ||| 2.718
>>
>> Where do they come from? Could they be extracted from the annotated
>> target side of the parallel text?
>>
>>
>> _______________________________________________
>> Moses-support mailing list
>> [email protected]
>> http://mailman.mit.edu/mailman/listinfo/moses-support
>
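
For reference, the two kinds of string-to-tree glue rules described in Phil's reply can be written out from a set of target labels in a few lines of Python. This is only a sketch, not what extract-rules or extract-ghkm actually does: the function names are hypothetical, the glue label Q and the scores 1 and 2.718 are copied from the grammar quoted above, and the label sets are assumed to have been collected from the target-side parse trees beforehand.

```python
# Sketch only: hypothetical helpers that print glue rules in the
# format quoted in this thread. Not the Moses implementation.

def top_level_rule(label, glue="Q"):
    """Rule wrapping a constituent that spans a whole sentence."""
    return (f"<s> [X][{label}] </s> [X] ||| "
            f"<s> [X][{label}] </s> [{glue}] ||| 1-1 ||| 1")


def monotone_rule(label, glue="Q"):
    """Rule joining a glue constituent with one following constituent."""
    return (f"[X][{glue}] [X][{label}] [X] ||| "
            f"[X][{glue}] [X][{label}] [{glue}] ||| 0-0 1-1 ||| 2.718")


def glue_grammar(top_labels, all_labels, glue="Q"):
    """Build both rule types from label sets.

    top_labels: labels seen covering an entire training parse tree
                (assumed to be collected elsewhere).
    all_labels: every target label seen in the training data.
    The extra label X is added for unknown words.
    """
    rules = [top_level_rule(label, glue) for label in sorted(top_labels)]
    rules += [monotone_rule(label, glue)
              for label in sorted(set(all_labels) | {"X"})]
    return rules


if __name__ == "__main__":
    for rule in glue_grammar({"S", "TOP"}, {"DT", "NP", "S", "TOP"}):
        print(rule)
```

With the toy label sets in the usage example, this yields two top-level rules and five monotone rules, the last of which is the [X][X] rule for unknown words.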
