Hi Hieu,

Thanks! And sorry, I was confused. I thought that the trace file would show the source syntax symbols, but now I see that it's only displaying the target symbols. I'll try out your other suggestions as well and see if they give better results.

Graham

On Mon, Oct 24, 2011 at 3:46 PM, Hieu Hoang <[email protected]> wrote:
> hi graham,
>
> everything seems to be ok with the decoding. The decoding is using rules
> other than glue rules. For instance, the range [3..6] is translated by a
> rule from your translation table.
>
> However, it is true that no consecutive rules are joined by anything other
> than the glue rule. That's often the case with syntax decoding. Also, you've
> specified
>    --NonTermConsecSource
> so you'll never get a rule like
>    VP --> [VB,1] [NP,2] # X --> [X,1] [X,2]
> I think the argument is unnecessary for tree-to-string decoding and limits
> some good translation rules that may be extracted.
>
> Also, the decoding algorithm in Moses separates the constraint of matching
> the non-terminal labels between the translation rule and the input parse,
> and the translation rule and its child hypothesis.
>
> The tree-to-string setup as you have it only matches the non-terminal of
> the rule against the input parse; the 'target' non-terminals are always X.
>
> The trace file only shows the non-terminal label of the 'target' side so you
> don't see the source non-term.
>
> not sure if that makes sense
>
> On 23/10/2011 12:18, Graham Neubig wrote:
>
> Hi,
>
> I'm trying to train a tree-to-string model and have gotten
> train-model.perl to run properly, but it seems that when I perform
> decoding the decoder is using nothing but the glue rules. My
> train-model.perl command is as follows:
>
> $SCRIPTS_ROOTDIR/training/train-model.perl -scripts-root-dir
> $SCRIPTS_ROOTDIR -root-dir work -corpus moses-0128.ja-en -f en -e ja
> -alignment grow-diag-final-and -lm 0:3:`pwd`/ja-ja.arpa -source-syntax
> -glue-grammar -extract-options "--MinWords 0 --MinHoleSource 1
> --MinWords 0 --NonTermConsecSource" &> err.txt
>
> Using this model, a decoded sentence looks like this in Moses's output and
> the trace file:
>
> Translating: <s> he revolutionized the japanese ink painting . </s>
> ||| [0,0]=X (1) [0,1]=X (1) [0,2]=X (1) [0,3]=X (1) [0,4]=X (1)
> [0,5]=X (1) [0,6]=X (1) [0,7]=X (1) [0,8]=X (1) [1,1]=X (1) [1,1]=NP
> (1) [1,1]=PRP (1) [1,2]=X (1) [1,3]=X (1) [1,4]=X (1) [1,5]=X (1)
> [1,6]=X (1) [1,7]=X (1) [1,7]=S (1) [1,8]=X (1) [2,2]=X (1) [2,2]=VBD
> (1) [2,3]=X (1) [2,4]=X (1) [2,5]=X (1) [2,6]=X (1) [2,6]=VP (1)
> [2,7]=X (1) [2,8]=X (1) [3,3]=X (1) [3,3]=DT (1) [3,4]=X (1) [3,5]=X
> (1) [3,6]=X (1) [3,6]=NP (1) [3,7]=X (1) [3,8]=X (1) [4,4]=X (1)
> [4,4]=JJ (1) [4,5]=X (1) [4,6]=X (1) [4,7]=X (1) [4,8]=X (1) [5,5]=X
> (1) [5,5]=NN (1) [5,6]=X (1) [5,7]=X (1) [5,8]=X (1) [6,6]=X (1)
> [6,6]=NN (1) [6,7]=X (1) [6,8]=X (1) [7,7]=X (1) [7,7]=. (1) [7,8]=X
> (1) [8,8]=X (1)
>
> Num of hypo = 10065 --- cells:
> 0 1 2 3 4 5 6 7 8
> 1 20 2 19 19 1 2 20 0
> 20 40 0 200 41 4 0 0
> 2 0 0 200 86 0 0
> 20 0 0 2 0 0
> 39 0 6 12 0
> 43 176 29 0
> 27 168 0
> 11 0
> 1
> BEST TRANSLATION: 10055 S </s> :0-0 : pC=0.000, c=-0.519 [0..8] 9016
> [total=-6.947] <<-5.646, 0.000, -17.078, -0.975, -11.623, -8.514,
> -23.155, 4.000, 4.000>>
> つまり 日本 の 水墨 画 を 一変 さ せ た 。
>
> Trans Opt 0 [0..8]: [8..8]=</s> [0..7]=S : S ->S </s> :0-0 : pC=0,
> c=-0.519086 -6.94668<<-5.64583, 0, -17.0777, -0.974503, -11.6227,
> -8.51364, -23.1549, 3.99959, 3.99959>>
> Trans Opt 0 [0..7]: [7..7]=X [0..6]=S : S ->S X :0-0 1-1 :
> pC=0.999896, c=0.999896 -7.38058<<-5.21153, 0, -17.077, -0.974503,
> -11.6227, -8.51364, -23.1549, 3.99959, 3.99959>>
> Trans Opt 0 [0..6]: [3..6]=X [0..2]=S : S ->S X :0-0 1-1 :
> pC=0.999896, c=0.999896 -8.60037<<-4.77724, 0, -17.0175, -0.9619,
> -11.5975, -7.04926, -22.7336, 2.99969, 2.99969>>
> Trans Opt 0 [0..2]: [2..2]=X [0..1]=S : S ->S X :0-0 1-1 :
> pC=0.999896, c=0.999896 -3.05231<<-1.73718, 0, -5.89485, -0.9619,
> -4.65396, -6.35611, -9.23707, 1.99979, 1.99979>>
> Trans Opt 0 [0..1]: [1..1]=X [0..0]=S : S ->S X :0-0 1-1 :
> pC=0.999896, c=0.999896 -2.31227<<-0.868589, 0, -4.07733, 0, 0,
> -5.66296, -6.04736, 0.999896, 0.999896>>
> Trans Opt 0 [0..0]: [0..0]=<s> : S -><s> :: pC=0, c=0.434294
> 0.434294<<-0.434294, 0, 0, 0, 0, 0, 0, 0, 0>>
> Trans Opt 0 [1..1]: [1..1]=he : X ->つまり :: pC=-2.14208, c=-3.88913
> -3.88913<<-0.434294, 0, -4.36269, 0, 0, -5.66296, -6.04736, 0.999896,
> 0>>
> Trans Opt 0 [2..2]: [2..2]=revolutionized : X ->日本 の :: pC=-1.69976,
> c=-2.8204 -2.8204<<-0.868589, 0, -3.97844, -0.9619, -4.65396,
> -0.693147, -3.18971, 0.999896, 0>>
> Trans Opt 0 [3..6]: [6..6]=painting [5..5]=ink [4..4]=japanese
> [3..3]=the : X ->水墨 画 を 一変 さ せ た :: pC=-4.02668, c=-7.27479
> -7.27479<<-3.04006, 0, -12.5763, 0, -6.94358, -0.693147, -13.4966,
> 0.999896, 0>>
> Trans Opt 0 [7..7]: [7..7]=. : X ->。 :: pC=-0.184696, c=-0.720382
> -0.720382<<-0.434294, 0, -1.93996, -0.0126031, -0.025123, -1.46439,
> -0.421262, 0.999896, 0>>
>
> I would expect the translation to use non-terminals other than "X" and
> "S", but it seems to be not using syntactic labels at all (for any of
> the sentences, not just the first). I've attached part of the rule table and
> glue rule files. Any clues about why it isn't using any syntactic
> information in translation?
>
> Thanks in advance,
> Graham
>
> _______________________________________________
> Moses-support mailing list
> [email protected]
> http://mailman.mit.edu/mailman/listinfo/moses-support
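
For anyone checking the same thing on their own trace, here is a minimal sketch of how one might list which spans were translated by something other than a glue rule. It is not part of Moses: the regular expression is inferred only from the "Trans Opt" entries quoted in this thread, the names `RuleUse`, `parse_trace`, and `non_glue_spans` are hypothetical, and treating every rule with left-hand side S as a glue rule is an assumption that fits this setup (where all other target non-terminals are X). Wrapped entries must be joined onto one line, with quote markers removed, before parsing.

```python
import re
from typing import List, NamedTuple, Tuple


class RuleUse(NamedTuple):
    start: int   # first source position covered by the rule
    end: int     # last source position covered by the rule
    lhs: str     # target-side left-hand label shown in the trace
    target: str  # target-side right-hand side of the rule


# Format inferred from the trace in this thread (an assumption, not a spec):
#   Trans Opt N [i..j]: <source children> : LHS ->TARGET :<alignment> : ...
TRANS_OPT = re.compile(
    r"Trans Opt \d+ \[(\d+)\.\.(\d+)\]: (.*?) : (\S+) ->(.*?) :"
)


def parse_trace(text: str) -> List[RuleUse]:
    """Extract one RuleUse per 'Trans Opt' entry (one entry per line)."""
    return [
        RuleUse(int(m.group(1)), int(m.group(2)), m.group(4), m.group(5).strip())
        for m in TRANS_OPT.finditer(text)
    ]


def non_glue_spans(uses: List[RuleUse], glue_label: str = "S") -> List[Tuple[int, int]]:
    """Spans whose rule does not have the assumed glue label as its LHS."""
    return [(u.start, u.end) for u in uses if u.lhs != glue_label]


# Three entries from the trace above, unwrapped and cut off after the c= field.
SAMPLE = (
    "Trans Opt 0 [0..7]: [7..7]=X [0..6]=S : S ->S X :0-0 1-1 : pC=0.999896, c=0.999896\n"
    "Trans Opt 0 [3..6]: [6..6]=painting [5..5]=ink [4..4]=japanese [3..3]=the"
    " : X ->水墨 画 を 一変 さ せ た :: pC=-4.02668, c=-7.27479\n"
    "Trans Opt 0 [7..7]: [7..7]=. : X ->。 :: pC=-0.184696, c=-0.720382\n"
)

if __name__ == "__main__":
    for span in non_glue_spans(parse_trace(SAMPLE)):
        print(span)
```

On these entries the sketch reports [3..6] as a non-glue span, which matches the range mentioned in the reply. Because the trace prints only the target-side label, it cannot show which source syntax label a rule matched.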
