Thanks for the answers, Miles.

You mentioned that the variance from GIZA++ is OK. In that case, is it OK if
I set the seeds for MERT and mkcls so that the only remaining source of
variance is GIZA++? Otherwise, the results will differ too much.

> >
> >>
> >> --to reduce variance it is best to use as much data as possible at
> >> each stage.  (100 sentences for tuning is far too low;  you should be
> >> using at least 1000 sentences).  it is possible to reduce this
> >> variability by using better machine learning, but in general it will
> >> always be there.
> >>
> > What do you mean by better machine learning? Isn't the 500,000-word
> > corpus enough? For the 1,000 sentences for tuning, can I use the same
> > sentences as used in training, or should they be a separate set of
> > sentences?
>
> lattice MERT is an example, or the Berkeley Aligner.
>
Thanks for the pointers.


> you cannot use the same sentences for training and tuning, as has been
> explained earlier on the list
>

Which list do you mean? Oh, is it OK if the tuning data is not from the same
domain/source as the training data?
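
(What I plan to do, roughly, is carve the tuning set out of the parallel
corpus before training, so the two never overlap. The sketch below shows the
idea; the file names and the 1,000-sentence size are just placeholders for
my data:)

    # Sketch: split a sentence-aligned parallel corpus into disjoint
    # training and tuning sets. File names are placeholders.
    import random

    with open("corpus.src") as f_src, open("corpus.tgt") as f_tgt:
        pairs = list(zip(f_src.readlines(), f_tgt.readlines()))

    random.seed(42)  # fixed seed so the split is reproducible
    random.shuffle(pairs)

    tune, train = pairs[:1000], pairs[1000:]  # 1,000 sentences for tuning

    for name, subset in (("train", train), ("tune", tune)):
        with open(name + ".src", "w") as out_src, \
             open(name + ".tgt", "w") as out_tgt:
            for src, tgt in subset:
                out_src.write(src)
                out_tgt.write(tgt)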

> if you have some problem that is fairly stable (uses the same training
> set, language models etc) then after running MERT many times and
> evaluating it on a disjoint test set, you pick the weights that
> produce good results.  afterwards you do not re-run MERT even if you
> have changed the model.
>

Oh, do you mean keeping the same training data but evaluating on different
test sets?
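
(Just to check that I follow, something like the sketch below? run_mert and
bleu_on are hypothetical stand-ins for however one launches mert-moses.pl
and scores the held-out test set; this is only my reading of the procedure,
not the actual recipe.)

    # Sketch of "run MERT several times, keep the best weights, freeze them".
    # run_mert() and bleu_on() are hypothetical helpers, not real Moses calls.

    def pick_frozen_weights(n_runs, run_mert, bleu_on):
        best_weights, best_bleu = None, float("-inf")
        for i in range(n_runs):
            weights = run_mert(seed=i)   # one full MERT run
            score = bleu_on(weights)     # evaluate on a disjoint test set
            if score > best_bleu:
                best_weights, best_bleu = weights, score
        # These weights are then kept fixed, even if the model changes later.
        return best_weights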


>
> as i mentioned, this is ugly and something you do not want to do
> unless you are forced to do it
>

Yes, I can imagine so. Sorry, I am quite new to this field; my previous
specialisation was not in this area.

Cheers,

Jelita