Title: Clarification: MT Italian > English, Romanian<-> Italian
Dear all,                                       29/3/05

At 11:28 +0000 29/03/05, Jeff Allen wrote:
Dear Natalia, Hermann, Christian, and all,

The following short article shows that Full Postediting for high-quality professional translation work is very much possible.

ALLEN, Jeff. What is Post-editing?  Translation Automation Newsletter, Issue 4. February 2005. Published by Cross-Language.
http://www.geocities.com/mtpostediting/TA_IssueFour.pdf

I agree 100% with his article. I am even more optimistic than Jeff (and others, read it!).

Hence, I feel I must have been unclear in my preceding e-mail. Let me clarify my point.

Clarification
-------------
The negative part of what I said is that, WITHOUT UNDERSTANDING THE SOURCE LANGUAGE, or equivalently, if you understand it, WITHOUT ACCESS TO THE SOURCE TEXT, or with VERY LIMITED ACCESS (say, 1 sentence per page, as when one postedits a quality human translation draft), postediting MT output into a professional-quality translation is not possible.

That is because too many errors have causes one can't possibly understand.

I have tested that for 3 years with engineer students. Very often, they don't even GUESS that something is completely wrong with a translation when they don't understand the source language.

That happens even if the MT system is set to produce more than one translation in case of lexical ambiguity (2 or 3, that is the default for Prompt/Reverso on the web, contrary to Systran, but Systran Pro also has that parameter available). That is because:
  • in case of lexical ambiguity, there may be a lot more possible translations than 2 or 3
  • many lexical errors come from unrecognized idiomatic usages (e.g., expressions with support verbs, like "connaître un développement" --> "to know a development")
  • many other errors come from structural (attachment, scope) or functional (syntactic functions, semantic roles) ambiguities
  • and a lot of other errors come from the inability of the parser to produce a full, legitimate parse.

Now the positive part (in total agreement with Jeff's view).

The SAME MT system can be useless to produce a good quality translation if you don't know the source language (situation above), or can be a great help to that effect if you know the source language (situation below).

An experiment at ATR
--------------------
I have recently measured my translation time  (English > French) using simply Excel to store the original, the translation, and the MT proposal (Systran Pro with adequate parameter settings) side by side, one line per "polyphrase", with the translation initially containing the MT proposal, on 510 sentences of the BTEC corpus (about 12 pages of 250 words).
 
Result
------
12-13 min per page of 250 words using MT.
59 min per page without it (average performance of 3 other French natives also using Excel: 2 students and a senior researcher).

==> Time was divided by almost 5 when using MT.
==> That is better than Jeff's estimate, probably because the average length of sentences in this corpus is only slightly more than 6 words.
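The "almost 5" figure follows directly from the timings above. A minimal sketch of the arithmetic, where the function and variable names are only illustrative and the numbers are the ones reported in this message:

```python
def speedup(minutes_without_mt, minutes_with_mt):
    """Return how many times faster a page is translated when MT is used."""
    return minutes_without_mt / minutes_with_mt

# Figures reported above, in minutes per page of 250 words.
WITHOUT_MT = 59.0
WITH_MT_RANGE = (12.0, 13.0)

low = speedup(WITHOUT_MT, WITH_MT_RANGE[1])   # slowest MT-assisted pace
high = speedup(WITHOUT_MT, WITH_MT_RANGE[0])  # fastest MT-assisted pace
print(f"speed-up between {low:.1f}x and {high:.1f}x")
# prints "speed-up between 4.5x and 4.9x"
```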

Remarks:
-------
1) when comparing the 4 translations side by side, all were very good. 3 were very similar, and the one which stuck out was not mine, done with MT, but that of a student who used the "TU" form (as in Canada) rather than the "VOUS" form in travel dialogues.

2) I used "as is" 25% of the sentences (= I edited or fully retyped 75% of them).

3) I confirmed that output rate on 3000 more sentences (without asking others to produce other reference translations).

There are at least 4 reasons for this usefulness:
  • reading the source first, you don't lose time (and your temper) futilely trying to understand a bad output.
  • having 25% perfectly good translations already reduces the time by 25%
  • in most cases, a very bad MT output contains usable translations of words or terms.
  • last but quite important: you can perform GLOBAL corrections, either downwards or on the whole translation.

For example, Systran would translate "please" by "s.v.p.", which is not acceptable if the output is meant to be a transcription of a spoken utterance. "please" occurs maybe in 20-30% of the sentences. With 510 sentences, 1 such global change saves 10-15 local changes. In the whole BTEC (163000 sentences), it can save up to 8000 changes.
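Such a global correction can be sketched outside Excel as well. The following is only an illustration under assumptions: the row layout (a "source" and a "translation" field per line), the helper name global_replace, and the three sample sentences are all invented here and are not taken from the BTEC corpus or from Systran.

```python
def global_replace(rows, old, new, start=0):
    """Replace `old` by `new` in the translation column.

    start=0 corrects the whole translation; a larger start corrects
    only "downwards" from that row. Returns the number of rows changed.
    """
    changed = 0
    for row in rows[start:]:
        if old in row["translation"]:
            row["translation"] = row["translation"].replace(old, new)
            changed += 1
    return changed

# Invented sample rows, one per sentence, pre-filled with MT-like output.
rows = [
    {"source": "A coffee, please.", "translation": "Un café, s.v.p."},
    {"source": "Where is the station?", "translation": "Où est la gare ?"},
    {"source": "The bill, please.", "translation": "L'addition, s.v.p."},
]

changed = global_replace(rows, "s.v.p.", "s'il vous plaît.")
print(changed, "rows corrected by one global change")
```

One call replaces what would otherwise be one manual edit per affected row, which is where the saving comes from.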

I hope this clarification has been useful.

Best regards,

Ch.Boitet

Regards,

Jeff

Jeff Allen
Paris, France
[EMAIL PROTECTED] OR [EMAIL PROTECTED]


------------------
From: Christian Boitet <[EMAIL PROTECTED]>
To: "Hermann Plustwik" <[EMAIL PROTECTED]>, "Natalia Elita" <[EMAIL PROTECTED]>, <[EMAIL PROTECTED]>
Subject: [Mt-list] mt Italian > English : ask Google, try/buy Systran
Date: Tue, 29 Mar 2005 10:27:55 +0200

Hi,              29/3/05

At 9:53 +1000 29/03/05, Hermann Plustwik wrote:
Hi,
Sorry, to take up your time, but Natalia's query encourages me to ask a question.
Can anyone point me to a worthwhile and working mt system for Italian > English?
Just general English, but preferably with dictionary editing facility or 'user dictionary'.
Thank you in advance, your help is much appreciated.
Hermann Plustwik, Melbourne Australia.

Just ask Google "MT system Italian-English" to see what exists and then go to the Systran web site and buy the Pro version to be able to edit user dictionaries.

English > Italian is roughly at the level of English > French.

I show a trial of Italian > English below. I think it is quite usable:
- to understand the gist if you don't know Italian
- to produce a good translation quicker if you first read the Italian and then use the output as a help.

LanguageWeaver claims to handle all sorts of language pairs by statistical methods, but I did not find this one, nor any site where one can try out the existing ones. Probable reason for not having a demo site: to produce Systran-level translations, they have to align and process a very large translation memory (on the order of 50M words, or 200000 standard translator's pages, as said by K.Knight at CICLING-05). However, when that is available, the results are quite impressive!

About Transcend & others, I had no time to check, please try.

======================================
from http://www.peyrot.it/website-translation-localization/traduzione-gratuita-siti.htm
It clearly shows that:
- if you know Italian,
   . you can produce a good English translation with that as "suggestion" or "help" considerably faster than without.
   . if you enrich the user dictionary, you may quickly fix certain mistakes (e.g. siti -> sites)
- if you don't know Italian, you can understand the overall meaning, but postediting into a professional quality level is not possible.

Finally, whether you know Italian or not, you might improve the "rough translation" output by using the interactive disambiguation facility introduced in version 5.
_______________________________________________
Mt-list mailing list

