Hi all,

As Philipp has correctly pointed out, the global score of an SMT
system cannot be used as a measure of translation quality: it is
heavily influenced by many factors, such as sentence length and the
language-model score, and therefore serves only to compare different
translation candidates for the same source sentence.

In my experience with trying to estimate the quality of translations
at the sentence level, I've noticed that a couple of internal features
of a standard PBSMT system (not Moses, but a very similar one)
correlate reasonably well with the quality of the sentences as judged
by humans or by automatic metrics like BLEU, but none of these
features alone is good enough. Not surprisingly, the language-model
score was one of the most useful features.
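To make the kind of correlation I mean concrete, here is a minimal
sketch (with made-up numbers, purely for illustration; a real analysis
would of course use a proper dataset) computing Pearson's r between a
per-sentence feature such as the LM score and human adequacy
judgements:

```python
import math

def pearson(xs, ys):
    """Plain Pearson correlation between two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)

# Toy numbers: per-sentence LM scores and 1-4 human adequacy
# judgements for five sentences (invented values).
lm_scores = [-42.1, -18.3, -55.0, -23.7, -31.2]
judgements = [2, 4, 1, 3, 3]
print(round(pearson(lm_scores, judgements), 3))
```

For a single feature, the R^2 of a linear fit is just the square of
this r, so the same computation covers both ways of reporting the
correlation.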

I've discussed the relevance of standard SMT and other features in the
following papers:

http://clg.wlv.ac.uk/papers/Specia_EAMT2009.pdf
http://clg.wlv.ac.uk/papers/Specia_MTSummit2009.pdf

The main dataset used in both experiments, which contains translations
produced by 4 SMT systems for 4K sentences along with their human
judgements, is available upon request (and will soon be available
through the LREC resources website), so let me know if you want to
play with it.

Best,

Lucia Specia




> Date: Wed, 17 Feb 2010 13:48:15 +0000
> From: Philipp Koehn <[email protected]>
> Subject: Re: [Moses-support] sentence score and confidence indicator
> To: Francois Masselot <[email protected]>
> Cc: "[email protected]" <[email protected]>
> Message-ID:
>        <[email protected]>
> Content-Type: text/plain; charset=windows-1252
>
> Hi Francois,
>
> thanks for raising this interesting problem.
>
> The translation probability that Moses provides is unlikely to be a
> good indicator
> of the quality of the translation, since it will be dominated by the
> language model
> component score. In other words, it is more an indicator of how many unusual
> words are in the translation.
>
> Looking at the component scores may be more informative; at the
> moment the easiest way to get them is to output an n-best list
> (1-best is enough). Other
> researchers have worked with posterior probabilities.
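
Each line of the n-best list has four `|||`-delimited fields (sentence
id, hypothesis, labelled component scores, total score); the sample
line and the feature labels (d:, lm:, tm:, w:) below are illustrative
values, not output from a real run. A minimal sketch of pulling out
the component scores:

```python
def parse_nbest_line(line):
    """Split an n-best line of the form
    'id ||| hypothesis ||| labelled scores ||| total'
    and collect the labelled component scores into a dict of lists
    (a label such as tm: can be followed by several numbers)."""
    sent_id, hyp, feats, total = [f.strip() for f in line.split("|||")]
    scores = {}
    label = None
    for tok in feats.split():
        if tok.endswith(":"):       # a component label like "lm:"
            label = tok[:-1]
            scores[label] = []
        else:
            scores[label].append(float(tok))
    return int(sent_id), hyp, scores, float(total)

line = "0 ||| this is a translation ||| d: -4.0 lm: -42.1 tm: -30.2 -12.5 w: -4.0 ||| -15.451"
sid, hyp, scores, total = parse_nbest_line(line)
print(scores["lm"], total)  # → [-42.1] -15.451
```

Once the components are separated out like this, each one (not just
the LM-dominated total) can be correlated with post-editing effort.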
>
> See also the following publications:
>
> Quirk: Training a Sentence-Level Machine Translation Confidence Measure
> http://research.microsoft.com/pubs/68968/conf_lrec2004.pdf
>
> Specia et al.: Sentence-Level Confidence Estimation for MT
> http://www.mt-archive.info/SMART-2009-Specia.pdf
>
> Raybaud et al.: Word- and sentence-level confidence measures for
> machine translation
> http://www.mt-archive.info/EAMT-2009-Raybaud.pdf
>
> As I mentioned to you at the MT Marathon, it would be great to make this type
> of data public, so more researchers could work on it.
>
> -phi
>
> On Tue, Feb 16, 2010 at 9:23 AM, Francois Masselot
> <[email protected]> wrote:
>> Hello,
>>
>>
>>
>> Has anyone tried to correlate the sentence score returned by Moses (e.g.
>> [total=-15.451]) with the post-editing effort?
>>
>>
>>
>> Everything I have tried so far has been a dead end: the Moses score
>> correlates with none of edit time, edit distance, or keystrokes, as
>> recorded on actual post-editing work performed by professional
>> translators. The only thing that shows some sort of linear correlation,
>> although a weak one (R^2 = 0.5), is the sentence length, but that is
>> expected (the longer the sentence, the lower the score).
>>
>>
>>
>> Using the sentence score as a confidence indicator is a tempting idea, but
>> if it does not correlate with anything related to the "value" of the
>> machine-translated sentence (in a context where machine translations are
>> reviewed by human post-editors), I wonder whether this indicator can be
>> useful at all.
>>
>>
>>
>> Has anyone already looked at these things?
>>
>> What do you guys think?
>>
>>
>>
>>
>>
>> François Masselot
>>
>> Global Content - Machine Translation
>>
>>
>>
>> CMS & Language Technologies
>>
>> Localization Services
>>
>> Autodesk Neuchatel
>>
>> +41 32 723 94 60
>>
>>
>>

_______________________________________________
Moses-support mailing list
[email protected]
http://mailman.mit.edu/mailman/listinfo/moses-support
