Hi John,
Thank you for your answer. That's really useful information, as I don't
want to wait unnecessarily long for the tuning to complete.
I suspected that the corpus size for tuning wasn't dependent on the size
of the training corpus. Now I've got an idea how large a size is
reasonable.
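
In case it helps others on the list, here is a minimal sketch of how a
fixed-size tuning set could be held out from a parallel corpus. It is not
part of Moses; the function name, the seed and the default of 2000 segment
pairs (taken from the range John mentions below) are my own assumptions.

```python
import random


def sample_tuning_set(source_lines, target_lines, size=2000, seed=13):
    """Split aligned segment pairs into (tuning, training) lists.

    Hypothetical helper, not a Moses script. The tuning size is a fixed
    number of pairs, not a fraction of the training corpus.
    """
    if len(source_lines) != len(target_lines):
        raise ValueError("source and target must have the same number of lines")

    pairs = list(zip(source_lines, target_lines))
    size = min(size, len(pairs))  # small corpora: never ask for more than exists

    # Seeded generator so the same held-out set is drawn on every run.
    rng = random.Random(seed)
    chosen = set(rng.sample(range(len(pairs)), size))

    # Keep the original corpus order inside both subsets.
    tuning = [pair for i, pair in enumerate(pairs) if i in chosen]
    training = [pair for i, pair in enumerate(pairs) if i not in chosen]
    return tuning, training
```

The held-out pairs are removed from the training side, so the tuning set
does not overlap with the data the phrase table is built from.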
Yours,
Per Tunedal

On Fri, Mar 15, 2013, at 16:19, John D. Burger wrote:
> We did some experiments a long time ago on tuning set size (for Chinese
> to English).  For the standard Moses setup, there are only a dozen or so
> meta-features to find weights for, so it's no surprise that improvements
> asymptote sharply after the tuning set gets much bigger than 1000-2000
> segment pairs.  (To answer one of your questions, Per, the size of the
> tuning set shouldn't have much, if anything, to do with the size of the
> phrase training dataset.)
> 
> Of course tuning algorithms like MIRA let you efficiently work with many
> more meta-features - see Chiang et al. 2009:
> 
>   http://www.aclweb.org/anthology/N/N09/N09-1025.pdf
> 
> In this case you'd expect to continue finding improvements with much
> larger tuning sets.
> 
> - John Burger
>   MITRE
> 
> On Mar 15, 2013, at 11:07 , Barry Haddow wrote:
> 
> > Hi Per
> > 
> > We typically use tuning sets of 1000-3000 sentences, but recently have 
> > been experimenting with larger sets (10k) which can give slightly better 
> > results. It all depends on whether you care about that last 0.2 BLEU. I don't 
> > think there's been any thorough investigation into tuning set size, or 
> > its relation with training set size.
> > 
> > batch-mira works well, sometimes better than mert, but not quicker. The 
> > only reading is the Cherry and Foster paper, which contains a good 
> > overview of tuning methods.
> > 
> > I should also mention this presentation on discriminative training
> > http://www.statmt.org/mtm12/pub/discriminative-mt.pdf
> > 
> > cheers - Barry
> > 
> > On 15/03/13 12:10, Per Tunedal wrote:
> >> Hi Barry,
> >> I've already looked at that page, but it didn't answer my questions.
> >> 
> >> The most pertinent questions are practical:
> >> What's the recommended size of the tuning corpus?
> >> Is that size independent of the size of the training corpus, or not?
> >> 
> >> But, I'm interested in the theoretical aspects as well.
> >> 
> >> I've looked into the mert-moses.pl script:
> >> maximum-iterations=i : could be a shortcut if I don't want to wait
> >> forever. Any advice on a sensible limit for the iterations?
> >> threads=i : sounds useful. But you say that I probably won't need it.
> >> Why?
> >> 
> >> Any experience with batch-mira? Pros and cons? Any reading?
> >> 
> >> Yours,
> >> Per Tunedal
> >> 
> >> On Fri, Mar 15, 2013, at 10:50, Barry Haddow wrote:
> >>> Hi Per
> >>> 
> >>> There are a lot of questions in this email. I'd strongly recommend that
> >>> you have a look at this page
> >>> http://www.statmt.org/moses/?n=FactoredTraining.Tuning and the
> >>> references in it. But if you really want to understand tuning you need
> >>> to read this book (http://www.statmt.org/book/) and particularly chapter
> >>> 9.
> >>> 
> >>> As to the memory/thread usage, Moses will use a single thread whilst
> >>> loading models, then multiple threads in decoding. The mert binary
> >>> (mert) shouldn't be resource heavy in the default setting. It has its
> >>> own threads parameter, but you probably don't need it.
> >>> 
> >>> Tuning stops when it no longer gets any improvement, typically 10-20
> >>> iterations, although there is an upper limit of 25 (configurable).
> >>> 
> >>> cheers - Barry
> >>> 
> >>> On 15/03/13 08:08, Per Tunedal wrote:
> >>>> Hi again,
> >>>> What does the tuning actually do? Tries to translate and checks against
> >>>> the actual translation in the target language file? Trying different
> >>>> weights, over and over again? No wonder it's time consuming.
> >>>> 
> >>>> Tuning needs a lot of memory too, compared to training. At least in one
> >>>> of the steps, according to the system monitor.  The step that only uses
> >>>> one thread, in spite of the parameter -threads. What step? And why?
> >>>> 
> >>>> I see some interesting files are created, with names like
> >>>> run8.best100.out . I suppose those are the most successful translations.
> >>>> How are they used in the tuning?
> >>>> 
> >>>> The default tuner (?) is mert; how does mert actually work to make the
> >>>> tuning efficient?  How are the weights to be tested chosen? Are there
> >>>> any shortcuts to take?
> >>>> How does it differ from other tuners (?)?
> >>>> 
> >>>> Anyone working on some different approach for tuning, to get improved
> >>>> tuning speed or improved translation quality?
> >>>> 
> >>>> What's the recommended size of the tuning corpus? Is that size
> >>>> independent of the size of the training corpus? Is it dependent on the
> >>>> tuner (?) used?
> >>>> 
> >>>> Yours,
> >>>> Per Tunedal
> >>>> 
> >>>> PS My tuning has just started round 8, after 20 hours of processing.
> >>>> Will it stop at 10 rounds, or what?
> >>>> 
> >>>> 
> >>>> _______________________________________________
> >>>> Moses-support mailing list
> >>>> [email protected]
> >>>> http://mailman.mit.edu/mailman/listinfo/moses-support
> >>>> 