That would be great!

On Thursday, January 24, 2013, Germán Sanchis Trilles wrote:

> Hi all,
>
> personally I have an implementation of Koehn's 2004 ACL paper about
> statistical significance tests for MT evaluation. It implements both
> "stand-alone confidence intervals" (Sec. 5, bootstrap resampling) and paired
> bootstrap resampling, if a baseline is given. Right now, it computes
> confidence intervals for both TER and BLEU (including the brevity penalty) using
> modified versions of multi-bleu.perl and tercom.jar which are packaged into
> the script itself, so that the resampling is performed on the TER and BLEU
> counts (instead of the sentences, which is extremely costly). I have been
> using it for some years now, so it should be relatively robust. It
> implements bootstrap resampling for a given set of translations, i.e., it
> does not take into account optimizer instability.
>
> If it is of any interest to the Moses project, I have no problem
> whatsoever donating it to the MT community ;)
>
> Cheers,
>
> Germán
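
For readers unfamiliar with the technique described in the quoted message, here is a minimal sketch of bootstrap resampling over cached per-sentence BLEU counts. It is not Germán's script: all function names are hypothetical, it assumes whitespace tokenization and a single reference per sentence, and it covers BLEU only (no TER). The point it illustrates is that n-gram counts are collected once per sentence, and each resample only sums those counts instead of rescoring the sentences.

```python
import math
import random
from collections import Counter

MAX_N = 4  # BLEU-4; an assumption of this sketch


def ngram_counts(tokens, n):
    """Count the n-grams of order n in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def sentence_stats(hyp, ref):
    """Sufficient statistics for one sentence:
    [match_1, total_1, ..., match_4, total_4, hyp_len, ref_len]."""
    h, r = hyp.split(), ref.split()
    stats = []
    for n in range(1, MAX_N + 1):
        hc, rc = ngram_counts(h, n), ngram_counts(r, n)
        stats.append(sum((hc & rc).values()))   # clipped n-gram matches
        stats.append(max(len(h) - n + 1, 0))    # n-grams in the hypothesis
    stats += [len(h), len(r)]
    return stats


def sum_stats(stats, indices):
    """Add up the per-sentence statistics of the selected sentences."""
    return [sum(stats[i][k] for i in indices) for k in range(len(stats[0]))]


def bleu_from_stats(total):
    """Corpus-level BLEU (with brevity penalty) from summed statistics."""
    matches = total[0:2 * MAX_N:2]
    totals = total[1:2 * MAX_N:2]
    hyp_len, ref_len = total[-2], total[-1]
    if hyp_len == 0 or min(matches) == 0:
        return 0.0
    log_prec = sum(math.log(m / t) for m, t in zip(matches, totals)) / MAX_N
    log_bp = min(0.0, 1.0 - ref_len / hyp_len)  # brevity penalty, log domain
    return math.exp(log_prec + log_bp)


def bootstrap_interval(stats, samples=1000, alpha=0.05, seed=0):
    """Percentile confidence interval for BLEU on one set of translations."""
    rng = random.Random(seed)
    n = len(stats)
    scores = sorted(
        bleu_from_stats(sum_stats(stats, [rng.randrange(n) for _ in range(n)]))
        for _ in range(samples)
    )
    low = scores[int(samples * alpha / 2)]
    high = scores[min(samples - 1, int(samples * (1 - alpha / 2)))]
    return low, high


def paired_bootstrap(sys_stats, base_stats, samples=1000, seed=0):
    """Fraction of resamples in which the system outscores the baseline.
    Both systems are scored on the same resampled sentence indices."""
    rng = random.Random(seed)
    n = len(sys_stats)
    wins = 0
    for _ in range(samples):
        idx = [rng.randrange(n) for _ in range(n)]
        if bleu_from_stats(sum_stats(sys_stats, idx)) > bleu_from_stats(
            sum_stats(base_stats, idx)
        ):
            wins += 1
    return wins / samples
```

As the quoted message notes, this kind of test only reflects variation in the test set for a fixed set of translations; it says nothing about optimizer instability.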



-- 
When a place gets crowded enough to require ID's, social collapse is not
far away.  It is time to go elsewhere.  The best thing about space travel
is that it made it possible to go elsewhere.
                -- R.A. Heinlein, "Time Enough For Love"