Hello,

My question is about moses.ini: if we use IRSTLM, should we replace
KENLM with IRSTLM in moses.ini?

Thanks
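
For concreteness, the change being asked about might look like the sketch
below. It assumes the language model is declared on a [feature] line that
starts with KENLM, as train-model.perl usually writes it, and that Moses was
built with IRSTLM support. The helper name switch_lm_feature and the path
/path/to/lm.en are hypothetical. KenLM can also load a plain ARPA file, so a
model trained with IRSTLM and exported to ARPA may not need this change.

```python
import re


def switch_lm_feature(ini_text, old="KENLM", new="IRSTLM"):
    """Rename the LM feature on lines that start with `old`.

    Hypothetical helper: it only rewrites text, it does not check
    that the decoder was compiled with support for `new`.
    """
    out = []
    for line in ini_text.splitlines():
        if re.match(rf"\s*{re.escape(old)}\b", line):
            fields = line.split()
            # lazyken is a KenLM-specific option, so it is dropped here.
            kept = [f for f in fields[1:] if not f.startswith("lazyken=")]
            line = " ".join([new] + kept)
        out.append(line)
    return "\n".join(out)


if __name__ == "__main__":
    example = "[feature]\nKENLM lazyken=0 name=LM0 factor=0 path=/path/to/lm.en order=3"
    print(switch_lm_feature(example))
```

Because the feature keeps the same name (LM0), the matching entry in the
[weight] section would not need to change under these assumptions.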

On Thu, Nov 26, 2015 at 6:00 PM, <[email protected]> wrote:

> Today's Topics:
>
>    1. Re: Language model question (Dingyuan Wang)
>
>
> ----------------------------------------------------------------------
>
> Message: 1
> Date: Fri, 27 Nov 2015 00:05:51 +0800
> From: Dingyuan Wang <[email protected]>
> Subject: Re: [Moses-support] Language model question
> To: Vincent Nguyen <[email protected]>
> Cc: moses-support <[email protected]>
> Message-ID:
>         <caft8h74h6ta+ijkc_chao2dvuchqnonvk64q+jdn99jk5b-...@mail.gmail.com>
> Content-Type: text/plain; charset="utf-8"
>
> Hi,
>
> I tend to fix it in the tokenization script, or I would solve this in some
> pre-processing scripts if there are any obvious patterns in the noise.
>
> --
> Dingyuan
> On 2015-11-26 at 21:09, "Vincent Nguyen" <[email protected]> wrote:
>
> > Hi all,
> >
> > I have a question regarding LMs.
> >
> > Let's take the example of news.2014.shuffle.en
> >
> > When we process it through punctuation normalization for English,
> > it will, for instance, put a " " before an apostrophe:
> > "it is'nt" => "it is 'nt"
> >
> > BUT it contains some noise. For instance, there are some French sentences
> > in the corpus, for which the apostrophe handling is not suited:
> > "j'aime" => "j 'aime" => it will create the token 'aime
> >
> > My point is the following:
> >
> > At the LM-building stage, how can we prune to eliminate tokens like
> > "'aime" so that they do not create wrong unigrams, bigrams, ...
> >
> > ngram -minprune only takes 2 as a minimum, so wrong unigrams will
> > still end up in the LM.
> >
> >
> > Hope I'm clear enough ....
> >
> > Vincent
> > _______________________________________________
> > Moses-support mailing list
> > [email protected]
> > http://mailman.mit.edu/mailman/listinfo/moses-support
> >
>
> ------------------------------
>
>
>
> End of Moses-support Digest, Vol 109, Issue 70
> **********************************************
>
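
The pre-processing approach suggested in the quoted reply could be sketched
roughly as follows: drop any tokenized line that contains a token starting
with an apostrophe that is not a known English clitic, before the LM is
built. The clitic whitelist and the function names are assumptions made for
illustration, not part of Moses. The sketch also assumes the tokenizer may
have escaped apostrophes as &apos;. It is a crude filter and would also drop
lines with tokens such as '90s.

```python
import sys

# Assumed whitelist of clitics that English tokenization splits off.
ENGLISH_CLITICS = {"'s", "'t", "'re", "'ve", "'ll", "'d", "'m"}


def has_suspicious_apostrophe_token(line):
    """Return True if a token starts with an apostrophe but is not a known clitic."""
    # Undo the &apos; escaping so both forms are checked the same way.
    for token in line.replace("&apos;", "'").split():
        if (token.startswith("'") and len(token) > 1
                and token.lower() not in ENGLISH_CLITICS):
            return True
    return False


def filter_corpus(lines):
    """Yield only the lines that contain no suspicious apostrophe token."""
    for line in lines:
        if not has_suspicious_apostrophe_token(line):
            yield line


if __name__ == "__main__":
    # Reads the tokenized corpus on stdin, writes the filtered corpus to stdout.
    for kept in filter_corpus(sys.stdin):
        sys.stdout.write(kept)
```

Filtering whole lines rather than single tokens keeps the remaining text
intact, so no n-gram is built across a gap left by a deleted token.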
