Hi,

do you have any quantitative results on using post-edited texts vs.
parallel corpora, in terms of the quality of the goodness measure?

-phi

On Sat, Sep 17, 2011 at 4:53 AM, Nguyen Bach <[email protected]> wrote:
> Hi Taylor and all,
>
> I am the first author of the "Goodness" paper and I would love to make
> everything open source.
> However, this work was done during my internship at IBM so everything
> belongs to IBM.
>
> In order to replicate the work to some degree, I suggest you use the
> NIST MT test sets and CRF++.
> Steps can be:
> 1. Use your MT engine to translate the test sets.
> 2. Use a TER aligner, for example TERp, to align your MT output with
> the translation references.
> 3. Words without TER errors can be labeled *Good*, and words with TER
> errors will be labeled *Bad*.
> 4. Use CRF++, or any other ML toolkit, to train a binary classifier
> with the features in the paper.
> 5. The goodness score of a sentence can be computed as the sum of the
> marginal probabilities of the *Good* labels, normalized by sentence length.
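
The labeling in step 3 and the scoring in step 5 can be sketched as follows. This is a minimal illustration, not IBM's implementation: the TER operation codes (with "C" meaning no edit) and all function names here are assumptions, and the per-word marginals would come from the CRF trained in step 4.

```python
def label_words(ter_ops):
    """Step 3: words aligned with no TER edit are Good, the rest are Bad.

    ter_ops is one edit-operation code per hypothesis word, where "C"
    (correct) marks a word with no TER error.
    """
    return ["Good" if op == "C" else "Bad" for op in ter_ops]


def goodness_score(good_marginals):
    """Step 5: sum of per-word marginal P(Good), normalized by length."""
    if not good_marginals:
        return 0.0
    return sum(good_marginals) / len(good_marginals)


# Example: a 4-word hypothesis where the TER aligner marked one substitution.
labels = label_words(["C", "C", "S", "C"])     # ['Good', 'Good', 'Bad', 'Good']
score = goodness_score([0.9, 0.8, 0.3, 0.95])  # (0.9+0.8+0.3+0.95)/4 = 0.7375
```

With CRF++, the per-word marginals can be obtained by decoding in verbose mode, which prints the probability attached to each predicted label.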
>
> I hope this suggestion will be helpful for you.
>
> Cheers,
> Nguyen
>
> On 9/15/2011 1:52 PM, Barry Haddow wrote:
>> Hi Taylor
>>
>> If I remember rightly, this paper made use of about 20-30k post-edited
>> sentences which are unlikely to be released. So there is no way to replicate
>> this work.
>>
>> Confidence estimation is an active research area in MT, but I don't
>> think there are any really good answers yet. Check out the last couple
>> of years' ACL and EMNLP, as well as WMT, to see what's going on
>> (http://www.aclweb.org/anthology-new/).
>>
>> cheers - Barry
>>
>> On Thursday 15 September 2011 18:26:22 Taylor Rose wrote:
>>> Hey all,
>>>
>>> I've been researching how to judge the quality of a machine translation.
>>> I found this article about judging the "goodness" of translations. This
>>> is *exactly* what I've been trying to do. Does anyone know if there are
>>> implementations of their algorithm available? It would take me a
>>> substantial amount of time to try to replicate their process, and even
>>> then I have neither the corpus assets nor the processing power they had.
>>>
>>> Also, does anyone know of other existing systems that can accurately
>>> compute translation quality without needing an immense server farm?
>>>
>>> Thanks,
>>>
>
> _______________________________________________
> Moses-support mailing list
> [email protected]
> http://mailman.mit.edu/mailman/listinfo/moses-support
>
