Hi, we saw slightly better results with very large tuning sets here: http://aclweb.org/anthology-new/W/W12/W12-3139.pdf
-phi

On Mon, Apr 22, 2013 at 5:38 PM, Chris Dyer <[email protected]> wrote:
> The JHU summer workshop final report had some experiments on this:
>
> http://www.learningace.com/doc/3098660/be148017730f3f3a7b45d656276b482a/jhu-summer-workshop-final-report
> (See Fig. 6.7 and surrounding.)
>
> In general:
> 1) MERT works on so few features that you don't need much dev data to
> learn them.
> 2) Dev data selection is more important than dev data size (i.e.,
> length, number of references). See, e.g., the thesis of Nitin Madnani
> (2010) on the value of multiple references. This is especially true if
> you're going to evaluate your system on a small test set.
> 3) The more features you have, the more data you need. This is a
> serious limitation of most current discriminative training work, which
> focuses on adding new features without (usually) rethinking how dev
> sets are used.
>
> On Mon, Apr 22, 2013 at 9:56 AM, Sara Stymne <[email protected]> wrote:
> > Hi,
> >
> > Does anyone know of any published results that investigate the effect
> > of the size of the tuning data set? I'm primarily interested in MERT,
> > but other optimization methods would also be interesting.
> >
> > Best,
> > Sara
> > _______________________________________________
> > Moses-support mailing list
> > [email protected]
> > http://mailman.mit.edu/mailman/listinfo/moses-support
