Re: [Moses-support] Moses-support Digest, Vol 56, Issue 8

Philipp Koehn Fri, 10 Jun 2011 11:12:55 -0700

Hi,

XML markup may be used for reordering constraints, but that is different
from treating a word sequence as a unit with respect to the translation
model.
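
For treating a word sequence as a unit, the white-space replacement
suggested earlier in this thread (quoted below) could be sketched roughly
as follows. This is only a minimal sketch under assumptions: the function
name join_units, the list of units, and the choice of "~" as joiner are
illustrative and not part of Moses, and it assumes "~" does not otherwise
occur in the tokenized data. The same preprocessing would have to be applied
to the training data and to the decoder input.

```python
import re


def join_units(line, units, joiner="~"):
    """Rewrite each multi-word unit in a tokenized line as a single token.

    `units` is a hypothetical list of white-space separated word sequences,
    e.g. ["the man"]. Longer units are handled first so they take precedence
    over shorter units they contain.
    """
    for unit in sorted(units, key=lambda u: -len(u.split())):
        words = unit.split()
        if len(words) < 2:
            continue  # a single word is already one token
        joined = joiner.join(words)
        # Match only at token boundaries: no non-space character may touch
        # the match on either side.
        pattern = (
            r"(?<!\S)"
            + r"\s+".join(re.escape(w) for w in words)
            + r"(?!\S)"
        )
        # A function replacement avoids backslash-escape handling in `joined`.
        line = re.sub(pattern, lambda m, j=joined: j, line)
    return line
```

For example, join_units("the man saw the manor", ["the man"]) gives
"the~man saw the manor": the unit is joined, while "the manor" is left
alone because the match must end at a token boundary.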


-phi

On Fri, Jun 10, 2011 at 7:02 PM, Somayeh Bakhshaei <[email protected]> wrote:

> Dear Dr. Koehn,
>
> To group input tokens (like "the man"), isn't there a solution using
> XML tags?
>
> ------------------
> Best Regards,
> S.Bakhshaei
>
> --- On Fri, 6/10/11, [email protected] wrote:
>
>
> From: [email protected] <[email protected]>
> Subject: Moses-support Digest, Vol 56, Issue 8
> To: [email protected]
> Date: Friday, June 10, 2011, 8:43 PM
>
> Today's Topics:
>
>    1. How to change phrase representation (Anna c)
>    2. Re: How to change phrase representation (Philipp Koehn)
>    3. FW:  How to change phrase representation (Anna c)
>
>
> ----------------------------------------------------------------------
>
> Message: 1
> Date: Fri, 10 Jun 2011 11:38:34 +0200
> From: Anna c <[email protected]>
> Subject: [Moses-support] How to change phrase representation
> To: <[email protected]>
> Message-ID: <[email protected]>
> Content-Type: text/plain; charset="iso-8859-1"
>
>
> Hi!
> I'm doing a master's degree and I need some help with one of my subjects.
> I've already installed GIZA++ and Moses correctly, and followed the
> step-by-step guide on the website, checking that everything was OK. But I'm
> a newbie at this and I'm a bit lost. What I have to do is change the
> representation so that the basic unit is not the word but pairs or triplets
> of words, and compare it with the normal representation. How do I do that?
> Do I have to change the preparation step in the training?
>
> Thank you very much!
> Best regards,
> Anna
>
> ------------------------------
>
> Message: 2
> Date: Fri, 10 Jun 2011 10:48:07 +0100
> From: Philipp Koehn <[email protected]>
> Subject: Re: [Moses-support] How to change phrase representation
> To: Anna c <[email protected]>
> Cc: [email protected]
> Message-ID: <[email protected]>
> Content-Type: text/plain; charset=ISO-8859-1
>
> Hi,
>
> I am not entirely sure if I fully understand your question,
> but let me try to answer.
>
> The phrase-based model implementation treats each token delimited by
> white space as a word. It also learns translation entries for
> sequences of words ("phrases").
>
> If you want to group words into larger tokens, then you
> have to replace the white spaces.
>
> For instance, if you want to force the training setup and decoder
> to treat "the man" as a unit, then you should replace all
> occurrences (in training data and decoder input) with "the~man".
>
> -phi
>
>
>
> ------------------------------
>
> Message: 3
> Date: Fri, 10 Jun 2011 17:37:39 +0200
> From: Anna c <[email protected]>
> Subject: [Moses-support] FW: How to change phrase representation
> To: <[email protected]>, <[email protected]>
> Message-ID: <[email protected]>
> Content-Type: text/plain; charset="iso-8859-1"
>
>
> I think it would be that. I'm gonna try it. Thank you very much!
>
> And, if it's not too much trouble, I have another question. I only have
> two sets, training and test (which I've split into four: training.es,
> training.en, test.es, test.en, as the originals had both languages on the
> same line). The training part is no problem, but as I see in the guide, I
> must use different sets for tuning (in the example, dev/nc-dev2007.fr or
> dev/nc-dev2007.en) and evaluation (devtest/nc-test2007.fr,
> nc-test2007-ref.en.sgm, nc-test2007-src.fr.sgm). Should I use the same set
> in all the steps? I mean, where the example uses a .fr file, I use my
> test.es, and where it uses a .en file, my test.en. Or should I use a
> different part of it in each step?
>
> Again, thank you very much!
> Anna
>
>
>
>
> ------------------------------
>
> _______________________________________________
> Moses-support mailing list
> [email protected] <http://mc/[email protected]>
> http://mailman.mit.edu/mailman/listinfo/moses-support
>
>
> End of Moses-support Digest, Vol 56, Issue 8
> ********************************************
>
>
