Hi Neda

Yes, the variation you observe is normal.

If your system has the features you described in your original post
(LM: 3gram, Reordering: msd-bidirectional-fe, Alignment: grow-diag-final-and,
phrase scores) then you'll have 14 features. I quickly scanned the article and
it refers to using a 'baseline Phrase-based system (Moses)'. Unfortunately I
couldn't see any more details, but this is likely to have the same features as
yours.

cheers - Barry

On Monday 31 Oct 2011 11:32:45 Neda NoorMohammadi wrote:
> Thanks in advance for your answer,
>
> I read your article and I am running my system according to your suggestion.
> But note that my experiment uses a system with only 6 features. I
> think this result is normal for a system with many features.
> The BLEU results from 2 runs of the system are: 25.27% and 24.07% (0.2407).
> (There is a gap between the results of more than 1 percent in some cases.)
>
> Is this result still normal?
>
> Neda Noormohammadi
>
> On Mon, Oct 31, 2011 at 12:15 PM, Barry Haddow <[email protected]> wrote:
> > Hi Neda
> >
> > If you're interested in determining whether translation method A is
> > better than method B, or not, then you shouldn't use predictable-seed.
> > You should run mert several times and take the average score. If you
> > run with predictable seeds then all you are verifying is whether A beats
> > B using the rather arbitrary seed settings from deterministic mert,
> > nothing more.
> >
> > I'd suggest that you read the paper I linked. It explains these points
> > and more.
> >
> > cheers - Barry
> >
> > On Monday 31 Oct 2011 08:35:01 Nicola Bertoldi wrote:
> > > Hi Neda
> > >
> > > There is also a parameter of mert-moses.pl, "--predictable-seed"
> > > (see the help), which makes MERT deterministic.
> > >
> > > In general, the MERT procedure relies on an initial seed seed_t at
> > > each iteration t to create the random starting points (20 by default)
> > > for the optimization; note that seed_i differs from seed_j (if i
> > > differs from j).
> > >
> > > If you activate the flag "--predictable-seed", seed_i still differs
> > > from seed_j BUT the sequence seed_1, seed_2, ... seed_N is always the
> > > same. Hence the random starting points considered for the optimization
> > > are always the same, so the final result of MERT is always the same.
> > >
> > > best regards
> > > Nicola
> > >
> > > On Oct 29, 2011, at 6:37 PM, Patrik Lambert wrote:
> > > > Hi Neda,
> > > >
> > > > this happens because the seed used in the MERT optimizer depends by
> > > > default on the moment at which you launch it.
> > > > If you need deterministic MERT runs, you can set the seed by adding
> > > > this switch to the mert-moses.pl call:
> > > >
> > > > --mertargs=" -r $seed "
> > > >
> > > > Patrik
> > > >
> > > > Le 29/10/2011 18:11, [email protected] a écrit :
> > > >> Message: 3
> > > >> Date: Sat, 29 Oct 2011 17:05:11 +0100
> > > >> From: Barry Haddow <[email protected]>
> > > >> Subject: Re: [Moses-support] mert-moses.pl script
> > > >>
> > > >> Hi Neda
> > > >>
> > > >> Yes, this is quite normal. The best plan is to do several runs and
> > > >> take the average bleu. See this paper for a discussion:
> > > >>
> > > >> http://www.cs.cmu.edu/~jhclark/pubs/significance.pdf
> > > >>
> > > >> cheers - Barry
> > > >>
> > > >> On Saturday 29 Oct 2011 10:09:29 Neda NoorMohammadi wrote:
> > > >>> hello,
> > > >>>
> > > >>> In my experiment this is happening:
> > > >>>
> > > >>> I am running the mert-moses.pl script on the same development set
> > > >>> and configuration file under the same conditions, but each
> > > >>> execution gives different weights, which leads to a different BLEU
> > > >>> score (a difference of about 1%). (Note that these executions are
> > > >>> independent of each other and are initialized to the predefined
> > > >>> weights in moses.ini.)
> > > >>>
> > > >>> The features used in the configuration file are: LM: 3gram,
> > > >>> Reordering: msd-bidirectional-fe, Alignment: grow-diag-final-and,
> > > >>> phrase scores.
> > > >>>
> > > >>> Is it normal? What is your suggestion?
> > > >>> In this situation what is my baseline? The best one or the worst?
> > > >>>
> > > >>> Neda Noormohammadi

_______________________________________________
Moses-support mailing list
[email protected]
http://mailman.mit.edu/mailman/listinfo/moses-support
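
The two points made in this thread (a fixed seed always gives the same random starting points, and scores from several tuning runs should be averaged) can be illustrated with a small Python sketch. It is only an illustration and does not call Moses or reproduce how mert actually samples: the function names `starting_points` and `summarize` and the sampling range [-1, 1] are assumptions made for the example. The 20 starting points, the 14 features and the two BLEU scores are the figures quoted in the thread.

```python
import random
import statistics


def starting_points(seed, n_points=20, n_features=14):
    """Draw n_points random weight vectors from a generator seeded with `seed`.

    Hypothetical stand-in for the optimizer's random restarts; the range
    [-1, 1] is an assumption, not the range mert uses.
    """
    rng = random.Random(seed)
    return [[rng.uniform(-1.0, 1.0) for _ in range(n_features)]
            for _ in range(n_points)]


def summarize(bleu_scores):
    """Return the mean and sample standard deviation of BLEU scores from several runs."""
    mean = statistics.mean(bleu_scores)
    stdev = statistics.stdev(bleu_scores) if len(bleu_scores) > 1 else 0.0
    return mean, stdev


# The same seed reproduces the same starting points; a different seed does not.
same_seed_matches = starting_points(1) == starting_points(1)
other_seed_differs = starting_points(1) != starting_points(2)

# BLEU (in percent) from the two runs reported in the thread.
runs = [25.27, 24.07]
mean_bleu, stdev_bleu = summarize(runs)

print("same seed, same points:", same_seed_matches)
print("different seed, different points:", other_seed_differs)
print("mean BLEU: %.2f, stdev: %.2f over %d runs" % (mean_bleu, stdev_bleu, len(runs)))
```

With only two runs the standard deviation is a very rough estimate; more runs per system give a more reliable mean to compare.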
