Re: [Moses-support] tree-to-string training

Hieu Hoang Mon, 24 Oct 2011 06:49:46 -0700

another thing you can do is to project the source non-term to the target
side. For example, if the rule is
   a [B][X] [C] ||| d e [B][X] f g h [X] ||| 1-2
then change it to
   a [B][B] [C] ||| d e [B][B] f g h [C] ||| 1-2


this is what most people would describe as tree-to-string in the literature.

It will
  1. constrain the translation rules with an input parse, AND
  2. ensure that the non-terms are consistently labelled
It will then show up in the trace with the labelled non-term, rather than
just 'X'.
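
A minimal sketch of that relabelling in Python, for illustration only. It
assumes the rule table uses ' ||| ' as the field separator, that the last
token on each side is the left-hand side, and that non-terminals on the
right-hand side are written as [source][target] pairs, as in the example
above. The function name project_source_labels is made up here; it is not
part of Moses.

```python
import re

# A right-hand-side non-terminal written as [source-label][target-label].
NONTERM_PAIR = re.compile(r"^\[([^\[\]]+)\]\[([^\[\]]+)\]$")


def project_source_labels(rule_line: str) -> str:
    """Copy source non-terminal labels onto the target side of one rule.

    Only the first two fields (source and target) are rewritten; any
    further fields (alignment, scores, counts) are passed through as-is.
    """
    fields = rule_line.strip().split(" ||| ")
    source = fields[0].split()
    target = fields[1].split()

    def relabel(token: str) -> str:
        match = NONTERM_PAIR.match(token)
        if match is None:
            return token  # a terminal: leave it alone
        src_label = match.group(1)
        return f"[{src_label}][{src_label}]"

    source_lhs = source[-1]  # e.g. [C]
    new_source = [relabel(tok) for tok in source[:-1]] + [source_lhs]
    # The target left-hand side takes the source left-hand side label.
    new_target = [relabel(tok) for tok in target[:-1]] + [source_lhs]

    return " ||| ".join([" ".join(new_source), " ".join(new_target)] + fields[2:])


if __name__ == "__main__":
    print(project_source_labels("a [B][X] [C] ||| d e [B][X] f g h [X] ||| 1-2"))
```

Run on the first rule above, this gives the second rule. A terminal that
happens to look like [x][y] would be relabelled by mistake, so check that
against your own data.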

On 24 October 2011 20:28, Graham Neubig <[email protected]> wrote:

> Hi Hieu,
>
> Thanks! And sorry, I was confused. I thought that the trace file would
> show the source syntax symbols, but now I see that it's only
> displaying the target symbols. I'll try out your other suggestions as
> well and see if they give better results.
>
> Graham
>
> On Mon, Oct 24, 2011 at 3:46 PM, Hieu Hoang <[email protected]> wrote:
> > hi graham,
> >
> > everything seems to be ok with the decoding. The decoding is using rules
> > other than glue rules. For instance, the range [3..6] is translated by a
> > rule from your translation table.
> >
> > However, it is true that no consecutive rules are joined by anything other
> > than the glue rule. That's often the case with syntax decoding. Also, you've
> > specified
> >    --NonTermConsecSource
> > so you'll never get a rule like
> >    VP --> [VB,1] [NP,2] #   X --> [X,1] [X,2]
> > I think the argument is unnecessary for tree-to-string decoding and limits
> > some good translation rules that may be extracted.
> >
> > Also, the decoding algorithm in Moses separates the constraint of matching
> > the non-terminal labels between the translation rule and the input parse,
> > and the translation rule and its child hypothesis.
> >
> > The tree-to-string as you have it only matches the non-term of the rule and
> > the input parse; the 'target' non-terminals are always X.
> >
> > The trace file only shows the non-terminal label of the 'target' side so you
> > don't see the source non-term.
> >
> > not sure if that makes sense
> >
> > On 23/10/2011 12:18, Graham Neubig wrote:
> >
> > Hi,
> >
> > I'm trying to train a tree-to-string model and have gotten
> > train-model.perl to run properly, but it seems that when I perform
> > decoding the decoder is using nothing but the glue rules. My
> > train-model.perl command is as follows:
> >
> > $SCRIPTS_ROOTDIR/training/train-model.perl -scripts-root-dir
> > $SCRIPTS_ROOTDIR -root-dir work -corpus moses-0128.ja-en -f en -e ja
> > -alignment grow-diag-final-and -lm 0:3:`pwd`/ja-ja.arpa -source-syntax
> > -glue-grammar -extract-options "--MinWords 0 --MinHoleSource 1
> > --MinWords 0 --NonTermConsecSource" &> err.txt
> >
> > Using this model, a decoded sentence looks like this in Moses's output and
> > the trace file:
> >
> > Translating: <s> he revolutionized the japanese ink painting . </s>
> > ||| [0,0]=X (1) [0,1]=X (1) [0,2]=X (1) [0,3]=X (1) [0,4]=X (1)
> > [0,5]=X (1) [0,6]=X (1) [0,7]=X (1) [0,8]=X (1) [1,1]=X (1) [1,1]=NP
> > (1) [1,1]=PRP (1) [1,2]=X (1) [1,3]=X (1) [1,4]=X (1) [1,5]=X (1)
> > [1,6]=X (1) [1,7]=X (1) [1,7]=S (1) [1,8]=X (1) [2,2]=X (1) [2,2]=VBD
> > (1) [2,3]=X (1) [2,4]=X (1) [2,5]=X (1) [2,6]=X (1) [2,6]=VP (1)
> > [2,7]=X (1) [2,8]=X (1) [3,3]=X (1) [3,3]=DT (1) [3,4]=X (1) [3,5]=X
> > (1) [3,6]=X (1) [3,6]=NP (1) [3,7]=X (1) [3,8]=X (1) [4,4]=X (1)
> > [4,4]=JJ (1) [4,5]=X (1) [4,6]=X (1) [4,7]=X (1) [4,8]=X (1) [5,5]=X
> > (1) [5,5]=NN (1) [5,6]=X (1) [5,7]=X (1) [5,8]=X (1) [6,6]=X (1)
> > [6,6]=NN (1) [6,7]=X (1) [6,8]=X (1) [7,7]=X (1) [7,7]=. (1) [7,8]=X
> > (1) [8,8]=X (1)
> >
> > Num of hypo = 10065 --- cells:
> >  0   1   2   3   4   5   6   7   8
> >  1  20   2  19  19   1   2  20   0
> >   20  40   0 200  41   4   0   0
> >      2   0   0 200  86   0   0
> >       20   0   0   2   0   0
> >         39   0   6  12   0
> >           43 176  29   0
> >             27 168   0
> >               11   0
> >                  1
> > BEST TRANSLATION: 10055 S </s> :0-0 : pC=0.000, c=-0.519 [0..8] 9016
> > [total=-6.947] <<-5.646, 0.000, -17.078, -0.975, -11.623, -8.514,
> > -23.155, 4.000, 4.000>>
> > つまり 日本 の 水墨 画 を 一変 さ せ た 。
> >
> > Trans Opt 0 [0..8]: [8..8]=</s>   [0..7]=S  : S ->S </s> :0-0 : pC=0,
> > c=-0.519086 -6.94668<<-5.64583, 0, -17.0777, -0.974503, -11.6227,
> > -8.51364, -23.1549, 3.99959, 3.99959>>
> > Trans Opt 0 [0..7]: [7..7]=X   [0..6]=S  : S ->S X :0-0 1-1 :
> > pC=0.999896, c=0.999896 -7.38058<<-5.21153, 0, -17.077, -0.974503,
> > -11.6227, -8.51364, -23.1549, 3.99959, 3.99959>>
> > Trans Opt 0 [0..6]: [3..6]=X   [0..2]=S  : S ->S X :0-0 1-1 :
> > pC=0.999896, c=0.999896 -8.60037<<-4.77724, 0, -17.0175, -0.9619,
> > -11.5975, -7.04926, -22.7336, 2.99969, 2.99969>>
> > Trans Opt 0 [0..2]: [2..2]=X   [0..1]=S  : S ->S X :0-0 1-1 :
> > pC=0.999896, c=0.999896 -3.05231<<-1.73718, 0, -5.89485, -0.9619,
> > -4.65396, -6.35611, -9.23707, 1.99979, 1.99979>>
> > Trans Opt 0 [0..1]: [1..1]=X   [0..0]=S  : S ->S X :0-0 1-1 :
> > pC=0.999896, c=0.999896 -2.31227<<-0.868589, 0, -4.07733, 0, 0,
> > -5.66296, -6.04736, 0.999896, 0.999896>>
> > Trans Opt 0 [0..0]: [0..0]=<s>  : S -><s> :: pC=0, c=0.434294
> > 0.434294<<-0.434294, 0, 0, 0, 0, 0, 0, 0, 0>>
> > Trans Opt 0 [1..1]: [1..1]=he  : X ->つまり :: pC=-2.14208, c=-3.88913
> > -3.88913<<-0.434294, 0, -4.36269, 0, 0, -5.66296, -6.04736, 0.999896,
> > 0>>
> > Trans Opt 0 [2..2]: [2..2]=revolutionized  : X ->日本 の :: pC=-1.69976,
> > c=-2.8204 -2.8204<<-0.868589, 0, -3.97844, -0.9619, -4.65396,
> > -0.693147, -3.18971, 0.999896, 0>>
> > Trans Opt 0 [3..6]: [6..6]=painting   [5..5]=ink   [4..4]=japanese
> > [3..3]=the  : X ->水墨 画 を 一変 さ せ た :: pC=-4.02668, c=-7.27479
> > -7.27479<<-3.04006, 0, -12.5763, 0, -6.94358, -0.693147, -13.4966,
> > 0.999896, 0>>
> > Trans Opt 0 [7..7]: [7..7]=.  : X ->。 :: pC=-0.184696, c=-0.720382
> > -0.720382<<-0.434294, 0, -1.93996, -0.0126031, -0.025123, -1.46439,
> > -0.421262, 0.999896, 0>>
> >
> > I would expect the translation to use non-terminals other than "X" and
> > "S", but it seems not to be using syntactic labels at all (for any of
> > the sentences, not just the first). I've attached part of the rule table
> > and glue rule files. Any clues about why it isn't using any syntactic
> > information in translation?
> >
> > Thanks in advance,
> > Graham
> >
> >
> > _______________________________________________
> > Moses-support mailing list
> > [email protected]
> > http://mailman.mit.edu/mailman/listinfo/moses-support
> >
>
