Hi, do you have any quantitative results on using post-edited texts vs. parallel corpora, in terms of quality of the goodness measure?
-phi

On Sat, Sep 17, 2011 at 4:53 AM, Nguyen Bach <[email protected]> wrote:
> Hi Taylor and all,
>
> I am the first author of the "Goodness" paper and I would love to make
> everything open source. However, this work was done during my internship
> at IBM, so everything belongs to IBM.
>
> To replicate the work to some degree, I suggest you use the NIST MT test
> sets and CRF++. The steps are:
> 1. Use your MT engine to translate the test sets.
> 2. Use a TER aligner, for example TERp, to align your MT output with the
> translation references.
> 3. Label words without TER errors as *Good* and words with TER errors as
> *Bad*.
> 4. Use CRF++, or any other ML toolkit, to train a binary classifier with
> the features in the paper.
> 5. Compute the goodness score of a sentence as the sum of the marginal
> probabilities of the *Good* labels, normalized by sentence length.
>
> I hope this suggestion is helpful for you.
>
> Cheers,
> Nguyen
>
> On 9/15/2011 1:52 PM, Barry Haddow wrote:
>> Hi Taylor
>>
>> If I remember rightly, this paper made use of about 20-30k post-edited
>> sentences, which are unlikely to be released, so there is no way to
>> replicate this work.
>>
>> Confidence estimation is an active research area in MT, but I don't
>> think that there are any really good answers yet. Check out the last
>> couple of years' ACL and EMNLP, as well as WMT, to see what's going on
>> (http://www.aclweb.org/anthology-new/).
>>
>> cheers - Barry
>>
>> On Thursday 15 September 2011 18:26:22 Taylor Rose wrote:
>>> Hey all,
>>>
>>> I've been researching how to judge the quality of a machine
>>> translation. I found this article about judging the "goodness" of
>>> translations. This is *exactly* what I've been trying to do. Does
>>> anyone know if there are implementations of their algorithm available?
>>> It would take me a substantial amount of time to try to replicate
>>> their process, and even then I have neither the corpus assets nor the
>>> processing power they had.
>>>
>>> Also, does anyone know of other existing systems that can accurately
>>> compute the quality of a translation without the need for an immense
>>> server farm?
>>>
>>> Thanks,
>
> _______________________________________________
> Moses-support mailing list
> [email protected]
> http://mailman.mit.edu/mailman/listinfo/moses-support
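Nguyen's recipe above can be sketched in a few lines of Python. This is only a minimal illustration of steps 3 and 5: the TER edit operations and the per-token *Good* marginals below are made-up placeholders, standing in for the output of a real TER aligner (e.g. TERp) and the marginal probabilities a trained CRF would emit (CRF++ can print per-label marginals with `crf_test -v1`).

```python
# Step 3: derive binary Good/Bad labels from a per-token TER alignment.
# "C" marks a token aligned without a TER error; anything else
# (substitution, insertion, shift, ...) counts as an error.
def label_tokens(ter_ops):
    """Map TER edit operations to Good/Bad labels, one per MT token."""
    return ["Good" if op == "C" else "Bad" for op in ter_ops]

# Step 5: sentence-level goodness = sum of the marginal probabilities
# of the Good label, normalized by sentence length.
def goodness_score(good_marginals):
    """Average P(Good) over the tokens of one MT output sentence."""
    if not good_marginals:
        return 0.0
    return sum(good_marginals) / len(good_marginals)

# Hypothetical alignment and marginals for a 5-token hypothesis:
ops = ["C", "C", "S", "C", "I"]          # placeholder TER operations
print(label_tokens(ops))                 # ['Good', 'Good', 'Bad', 'Good', 'Bad']

marginals = [0.95, 0.80, 0.30, 0.90, 0.85]  # placeholder P(Good) per token
print(round(goodness_score(marginals), 2))  # 0.76
```

The training data for step 4 would pair each token's features with the Good/Bad label from `label_tokens`; at test time the classifier's marginals replace the placeholder list.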
