Hi Per

We typically use tuning sets of 1000-3000 sentences, but recently have been experimenting with larger sets (10k), which can give slightly better results. It all depends on whether you care about that last 0.2 BLEU. I don't think there's been any thorough investigation into tuning set size, or its relation to training set size.

batch-mira works well, sometimes better than mert, but not quicker. The only reading is the Cherry and Foster paper, which contains a good overview of tuning methods. I should also mention this presentation on discriminative training:
http://www.statmt.org/mtm12/pub/discriminative-mt.pdf

cheers - Barry

On 15/03/13 12:10, Per Tunedal wrote:
> Hi Barry,
> I've already looked at that page, but it didn't answer my questions.
>
> The most pertinent questions are practical:
> What's the recommended size of the tuning corpus?
> Is that size independent of the size of the training corpus, or not?
>
> But I'm interested in the theoretical aspects as well.
>
> I've looked into the mert-moses.pl script:
> maximum-iterations=i : could be a shortcut if I don't want to wait
> forever. Any advice on a wise limit for the iterations?
> threads=i : sounds useful. But you say that I probably won't need it.
> Why?
>
> Any experience of batch-mira? Pros and cons? Any reading?
>
> Yours,
> Per Tunedal
>
> On Fri, Mar 15, 2013, at 10:50, Barry Haddow wrote:
>> Hi Per
>>
>> There are a lot of questions in this email. I'd strongly recommend that
>> you have a look at this page
>> http://www.statmt.org/moses/?n=FactoredTraining.Tuning and the
>> references in it. But if you really want to understand tuning you need
>> to read this book (http://www.statmt.org/book/) and particularly chapter
>> 9.
>>
>> As to the memory/thread usage, Moses will use a single thread whilst
>> loading models, then multiple threads in decoding. The mert binary
>> (mert) shouldn't be resource heavy in the default setting. It has its
>> own threads parameter, but you probably don't need it.
>>
>> Tuning stops when it no longer gets any improvement, typically after
>> 10-20 iterations, although there is an upper limit of 25 (configurable).
>>
>> cheers - Barry
>>
>> On 15/03/13 08:08, Per Tunedal wrote:
>>> Hi again,
>>> What does the tuning actually do? Tries to translate and checks against
>>> the actual translation in the target language file? Trying different
>>> weights, over and over again? No wonder it's time consuming.
>>>
>>> Tuning needs a lot of memory too, compared to training. At least in one
>>> of the steps, according to the system monitor. The step that only uses
>>> one thread, in spite of the parameter -threads. What step? And why?
>>>
>>> I see some interesting files are created, with names like
>>> run8.best100.out . I suppose those are the most successful translations.
>>> How are they used in the tuning?
>>>
>>> The default tuner (?) is mert. How does mert actually work to make the
>>> tuning efficient? How are the weights to be tested chosen? Are there
>>> any shortcuts to take?
>>> What's the difference from other tuners (?)?
>>>
>>> Anyone working on some different approach for tuning, to get improved
>>> tuning speed or improved translation quality?
>>>
>>> What's the recommended size of the tuning corpus? Is that size
>>> independent of the size of the training corpus? Is it dependent on the
>>> tuner (?) used?
>>>
>>> Yours,
>>> Per Tunedal
>>>
>>> PS My tuning has just started round 8, after 20 hours of processing.
>>> Will it stop at 10 rounds, or what?
>>>
>>> _______________________________________________
>>> Moses-support mailing list
>>> [email protected]
>>> http://mailman.mit.edu/mailman/listinfo/moses-support
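
The loop discussed in this thread (translate the tuning set, compare against the references, adjust the weights, repeat until nothing improves or the iteration limit is reached) can be sketched roughly as below. This is a toy illustration only, not Moses code. Everything in it is an assumption made for the example: the names `tune`, `decode_nbest`, `optimise` and `make_pools`, the fixed candidate pools standing in for the decoder, the per-candidate quality number standing in for BLEU, the random search standing in for mert's optimiser, and the rule of stopping once decoding adds no new hypotheses. Only the default cap of 25 iterations is taken from the thread.

```python
import random


def model_score(weights, features):
    """Linear model score: dot product of weights and feature values."""
    return sum(w * f for w, f in zip(weights, features))


def decode_nbest(pool, weights, n):
    """Stand-in for the decoder: the n best candidates under `weights`."""
    return sorted(pool, key=lambda c: model_score(weights, c[0]), reverse=True)[:n]


def corpus_quality(nbest_lists, weights):
    """Stand-in for BLEU: mean quality of the top-ranked candidate per sentence."""
    total = 0.0
    for candidates in nbest_lists:
        best = max(candidates, key=lambda c: model_score(weights, c[0]))
        total += best[1]
    return total / len(nbest_lists)


def optimise(nbest_lists, start, rng, trials=200):
    """Toy optimiser (random search), NOT the line search used by mert."""
    best_w, best_q = list(start), corpus_quality(nbest_lists, start)
    for _ in range(trials):
        candidate = [w + rng.gauss(0, 0.5) for w in best_w]
        q = corpus_quality(nbest_lists, candidate)
        if q > best_q:
            best_w, best_q = candidate, q
    return best_w, best_q


def tune(pools, init_weights, n=5, max_iterations=25, seed=0):
    """Outer tuning loop: decode, accumulate n-best lists, re-optimise weights."""
    rng = random.Random(seed)
    weights = list(init_weights)
    accumulated = [[] for _ in pools]  # n-best lists merged across iterations
    history = []
    for _ in range(max_iterations):
        added = 0
        for acc, pool in zip(accumulated, pools):
            for cand in decode_nbest(pool, weights, n):
                if cand not in acc:
                    acc.append(cand)
                    added += 1
        if added == 0:  # assumed stopping rule: decoding found nothing new
            break
        weights, quality = optimise(accumulated, weights, rng)
        history.append(quality)
    return weights, history


def make_pools(num_sentences=20, pool_size=30, seed=1):
    """Synthetic data: each candidate is ((feature1, feature2), quality)."""
    rng = random.Random(seed)
    pools = []
    for _ in range(num_sentences):
        pool = []
        for _ in range(pool_size):
            feats = (rng.random(), rng.random())
            quality = 0.8 * feats[0] + 0.2 * (1.0 - feats[1])  # lies in [0, 1]
            pool.append((feats, quality))
        pools.append(pool)
    return pools


if __name__ == "__main__":
    final_weights, scores = tune(make_pools(), [0.0, 0.0])
    print("iterations run:", len(scores))
    print("final weights:", final_weights)
```

In this sketch, lowering `max_iterations` only cuts the loop short; the weights from the last completed iteration are still returned. The expensive step in a real system is the decoding inside each iteration, which the toy replaces with a sort over a small pool.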
