Thanks for the answers, Miles. You mentioned that the variation from GIZA++ is OK. In that case, is it OK if I fix the random seeds for MERT and mkcls, so that GIZA++ is the only remaining source of variation? Otherwise the results differ too much.
>>> --to reduce variance it is best to use as much data as possible at
>>> each stage. (100 sentences for tuning is far too low; you should be
>>> using at least 1000 sentences). it is possible to reduce this
>>> variability by using better machine learning, but in general it will
>>> always be there.
>>
>> What do you mean by better machine learning? Isn't the 500,000-word
>> corpus enough? For the 1,000 tuning sentences, can I use the same
>> sentences as used in training, or should they be a separate set of
>> sentences?
>
> lattice MERT is an example, or the Berkeley Aligner.

Thanks for the pointers.

> you cannot use the same sentences for training and tuning, as has been
> explained earlier on the list

What list? Also, is it OK if the tuning data is not from the same domain/source as the training data?

> if you have some problem that is fairly stable (uses the same training
> set, language models etc) then after running MERT many times and
> evaluating it on a disjoint test set, you pick the weights that
> produce good results. afterwards you do not re-run MERT even if you
> have changed the model.

Oh, do you mean the same training data but different sets of test data?

> as i mentioned, this is ugly and something you do not want to do
> unless you are forced to do it

Yes, I can imagine so. Sorry, I am quite new to this field and my previous specialisation was not in it.

Cheers,
Jelita
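P.S. To check that I understood the point about keeping training and tuning sentences disjoint, this is roughly how I plan to split my corpus before training: shuffle the sentence pairs once with a fixed seed, take 1,000 pairs for tuning and 1,000 for testing, and train on the rest. The file names (corpus.src, corpus.tgt, train/tune/test) are just placeholders from my own setup, not anything from Moses itself. Please tell me if this is the wrong way to do it:

import random

def split_parallel_corpus(src_path, tgt_path, n_tune=1000, n_test=1000, seed=42):
    """Shuffle sentence pairs once, then carve off disjoint tuning and test sets."""
    with open(src_path, encoding="utf-8") as f:
        src = f.read().splitlines()
    with open(tgt_path, encoding="utf-8") as f:
        tgt = f.read().splitlines()
    assert len(src) == len(tgt), "source and target must be sentence-aligned"

    pairs = list(zip(src, tgt))
    random.Random(seed).shuffle(pairs)   # fixed seed so the split is reproducible

    tune = pairs[:n_tune]
    test = pairs[n_tune:n_tune + n_test]
    train = pairs[n_tune + n_test:]      # everything left over is used for training
    return train, tune, test

def write_pair(pairs, prefix):
    """Write the source/target sides of each pair to <prefix>.src / <prefix>.tgt."""
    with open(prefix + ".src", "w", encoding="utf-8") as fs, \
         open(prefix + ".tgt", "w", encoding="utf-8") as ft:
        for s, t in pairs:
            fs.write(s + "\n")
            ft.write(t + "\n")

if __name__ == "__main__":
    train, tune, test = split_parallel_corpus("corpus.src", "corpus.tgt")
    write_pair(train, "train")
    write_pair(tune, "tune")
    write_pair(test, "test")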
_______________________________________________
Moses-support mailing list
[email protected]
http://mailman.mit.edu/mailman/listinfo/moses-support
