Hi,

We saw slightly better results with very large tuning sets here:
http://aclweb.org/anthology-new/W/W12/W12-3139.pdf

-phi


On Mon, Apr 22, 2013 at 5:38 PM, Chris Dyer <[email protected]> wrote:

> The JHU summer workshop final report had some experiments on this:
>
> http://www.learningace.com/doc/3098660/be148017730f3f3a7b45d656276b482a/jhu-summer-workshop-final-report
> (See Fig. 6.7 and surrounding)
>
> In general:
> 1) MERT works on so few features that you don't need much dev data to
> learn them
> 2) Dev data selection is more important than dev data size (i.e.,
> length, number of references). See, e.g., the thesis of Nitin Madnani
> (2010) on the value of multiple references. This is especially true if
> you're going to evaluate your system on a small test set.
> 3) The more features you have, the more data you need. This is a
> serious limitation of most current discriminative training work, which
> focuses on adding new features without (usually) rethinking how dev
> sets are used.
>
>
>
> On Mon, Apr 22, 2013 at 9:56 AM, Sara Stymne <[email protected]>
> wrote:
> > Hi,
> >
> > Does anyone know of any published results that investigate the effect of
> > the size of the tuning data set? I'm primarily interested in relation to
> > MERT, but other optimization methods would also be interesting.
> >
> > Best,
> > Sara
> > _______________________________________________
> > Moses-support mailing list
> > [email protected]
> > http://mailman.mit.edu/mailman/listinfo/moses-support
>
