Hi,

Is it possible for Moses to produce an overfitted model? How can this be checked, and how can it be avoided?
------------------
Best Regards,
S.Bakhshaei

--- On Sun, 3/21/10, [email protected] <[email protected]> wrote:

From: [email protected] <[email protected]>
Subject: Moses-support Digest, Vol 41, Issue 29
To: [email protected]
Date: Sunday, March 21, 2010, 7:32 PM

Send Moses-support mailing list submissions to
    [email protected]

To subscribe or unsubscribe via the World Wide Web, visit
    http://mailman.mit.edu/mailman/listinfo/moses-support
or, via email, send a message with subject or body 'help' to
    [email protected]

You can reach the person managing the list at
    [email protected]

When replying, please edit your Subject line so it is more specific
than "Re: Contents of Moses-support digest..."

Today's Topics:

   1. Re: Can't change reordering model (Philipp Koehn)
   2. Re: different bleu scores from nist and moses scripts (Philipp Koehn)

----------------------------------------------------------------------

Message: 1
Date: Sat, 20 Mar 2010 16:19:36 +0000
From: Philipp Koehn <[email protected]>
Subject: Re: [Moses-support] Can't change reordering model
To: Sara Stymne <[email protected]>
Cc: [email protected]
Message-ID: <[email protected]>
Content-Type: text/plain; charset=ISO-8859-1

Hi,

thanks for pointing out the error - I fixed the web page.

-phi

On Fri, Mar 19, 2010 at 10:00 AM, Sara Stymne <[email protected]> wrote:
> Hi Maria!
>
> The model msd-bidirectional-e will not work, since you can only
> condition lexical reordering models on either the foreign phrase (f) or
> both the foreign and the target (fe). Only on the target (e) is not
> supported.
>
> There is actually an error on the Moses webpage
> (http://www.statmt.org/moses/?n=FactoredTraining.BuildReorderingModel),
> where it says that this e-model is acceptable.
>
> If you want to train unidirectional models, you should not use
> -unidirectional in your command for the training script, it is chosen by
> default. So change your second model to: msd-fe, at training time.
>
> At decoding time: if you're using a monotonicity model, make sure that
> the number of weights specified in the moses.ini file, actually are the
> same as in the reordering table.
>
> /Sara
>
>
> 2010-03-18 21:57, maria sol ferrer skrev:
>> Hey, thanks for your help.
>>
>> Does anyone else know what might be wrong with changing the reordering
>> model? has anyone done it successfully? I really cant get it working and
>> there's no help on the web.
>>
>> Thanks a lot.
>>
>> 2010/3/19 Hieu Hoang <[email protected] <mailto:[email protected]>>
>>
>>     i can't see any obvious problems.
>>
>>     i don't know much about the lexicalized reordering. It seems like
>>     the number of scores is not what is expected by the decoder for the
>>     particular setting specified in the ini file. I think in training,
>>     you have to specify what type of reordering model you want, and
>>     specify the SAME type when decoding.
>>
>>     sorry i can't be of more help
>>
>>
>>     On 17 March 2010 21:35, maria sol ferrer <[email protected]
>>     <mailto:[email protected]>> wrote:
>>
>>         Hey, thanks for your answer, here is the ini file and the output
>>         from when i try to translate...
>>
>>         2010/3/18 Hieu Hoang <[email protected]
>>         <mailto:[email protected]>>
>>
>>             can you send me the ini file, i'll try & see if there's
>>             anything obviously wrong
>>
>>
>>             On 17/03/2010 05:09, maria sol ferrer wrote:
>>>             Hi, I was running some tests on different reordering types
>>>             and when I tried with
>>>
>>>             msd-bidirectional-e, msd-unidirectional-fe and
>>>             monotonicity-bidirectional-fe
>>>
>>>             the first 2 gave the error that reordering doesnt exist
>>>             during training, the third trained fine but when trying to
>>>             translate, it didnt, the only message there was was:
>>>             Reordering types NOT individualy weighted!
>>>             (By the way, the default reordering works fine)
>>>
>>>             I searched the web and all I found was another question on
>>>             the mailing list like this, about the monotonicity
>>>             reordering (all of them) not working, but it was never
>>>             answered.
>>>
>>>             I'm running an english-spanish decoder and the reordering
>>>             is not so good, it just pretty much translates in the
>>>             english order, which is not always correct in spanish.
>>>
>>>             I'd apreciate your help, thanks.
>>>
>>>
>>>             _______________________________________________
>>>             Moses-support mailing list
>>>             [email protected] <mailto:[email protected]>
>>>             http://mailman.mit.edu/mailman/listinfo/moses-support
>>>
>>
>>
>> _______________________________________________
>> Moses-support mailing list
>> [email protected]
>> http://mailman.mit.edu/mailman/listinfo/moses-support
>
> _______________________________________________
> Moses-support mailing list
> [email protected]
> http://mailman.mit.edu/mailman/listinfo/moses-support
>

------------------------------

Message: 2
Date: Sat, 20 Mar 2010 16:28:33 +0000
From: Philipp Koehn <[email protected]>
Subject: Re: [Moses-support] different bleu scores from nist and moses scripts
To: Adam Lopez <[email protected]>
Cc: [email protected]
Message-ID: <[email protected]>
Content-Type: text/plain; charset=ISO-8859-1

Hi,

the NIST script does internal tokenization, while the multi-bleu script
assumes that the data is already tokenized.

There is also a difference with the brevity penalty in the case of
multiple reference translations.

-phi

On Fri, Mar 19, 2010 at 7:49 PM, Adam Lopez <[email protected]> wrote:
> IIRC, the principle difference is the calculation of the brevity
> penalty, but there also seem to be some slight differences in
> tokenization between the scripts.
>
> On Fri, Mar 19, 2010 at 9:32 AM, Mark Fishel <[email protected]> wrote:
>> Dear list,
>>
>> I am getting different BLEU scores from the NIST mteval script
>> (version) and the multi-bleu.perl script within Moses's distribution
>> for the same reference and hypothesis translations -- even the
>> individual n-gram precisions are different:
>>
>> BLEU = 16.80, 53.0/26.2/13.4/6.4 (BP=0.905, ratio=0.909, hyp_len=281,
>> ref_len=309)
>>
>> and
>>
>> BLEU score = 0.1681 for system "x"
>>
>> Individual N-gram scoring
>>         1-gram   2-gram   3-gram   4-gram   5-gram   6-gram   7-gram   8-gram   9-gram
>>         ------   ------   ------   ------   ------   ------   ------   ------   ------
>>  BLEU:  0.5246   0.2591   0.1326   0.0630   0.0328   0.0213   0.0133   0.0046   0.0000  "x"
>>
>> The files that produced the scores are here: mtj.ut.ee/diffbleu.tgz .
>>
>> Does everyone else get different scores? Can anyone suggest a reason
>> for that? It's not the smoothing of the NIST script, both support UTF8
>> i/o, etc; so I'm out of ideas, and before comparing the
>> implementations I wanted to ask for opinions.
>>
>> Thanks in advance,
>> Mark
>> _______________________________________________
>> Moses-support mailing list
>> [email protected]
>> http://mailman.mit.edu/mailman/listinfo/moses-support
>>
>
> _______________________________________________
> Moses-support mailing list
> [email protected]
> http://mailman.mit.edu/mailman/listinfo/moses-support
>
> --
> The University of Edinburgh is a charitable body, registered in
> Scotland, with registration number SC005336.
>
>

------------------------------

_______________________________________________
Moses-support mailing list
[email protected]
http://mailman.mit.edu/mailman/listinfo/moses-support


End of Moses-support Digest, Vol 41, Issue 29
*********************************************
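
For reference, the brevity-penalty arithmetic behind the multi-bleu.perl line quoted in Message 2 can be rechecked by hand. This is only a rough sketch, assuming a single reference and the standard BLEU definition (geometric mean of the 1- to 4-gram precisions times BP = exp(1 - ref_len/hyp_len) when the hypothesis is shorter than the reference). The function names are illustrative and are not taken from either script, and the precisions are the rounded values from the quoted output, so the last digit can differ slightly.

```python
import math

def brevity_penalty(hyp_len, ref_len):
    # Standard single-reference BLEU brevity penalty (assumed definition):
    # no penalty when the hypothesis is at least as long as the reference.
    if hyp_len >= ref_len:
        return 1.0
    return math.exp(1.0 - ref_len / hyp_len)

def bleu(precisions, hyp_len, ref_len):
    # Geometric mean of the n-gram precisions, scaled by the brevity penalty.
    log_mean = sum(math.log(p) for p in precisions) / len(precisions)
    return brevity_penalty(hyp_len, ref_len) * math.exp(log_mean)

# Values copied from the quoted multi-bleu.perl output (precisions are rounded).
hyp_len, ref_len = 281, 309
precisions = [0.530, 0.262, 0.134, 0.064]

bp = brevity_penalty(hyp_len, ref_len)
score = 100.0 * bleu(precisions, hyp_len, ref_len)

print(f"ratio = {hyp_len / ref_len:.3f}")
print(f"BP    = {bp:.3f}")
print(f"BLEU  = {score:.2f}")
```

With one reference this agrees with the reported BP=0.905 and ratio=0.909, and the score lands close to both reported values, so for these files the remaining gap in the individual n-gram precisions looks more like a tokenization effect than a brevity-penalty one.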
