Thank you for your email. I must make one significant correction:

LogoVista is NOT the producer of the language pairs listed at
http://www.morphologic.hu/public/tl/EN-EU_WebMT.htm.
LEC (Language Engineering Company, LLC), www.lec.com, is the producer of all of 
the engines listed in your document.


Glenn A. Akers, Ph.D. 
CEO and President
Language Engineering Company, LLC
www.lec.com
135 Beaver Street - Waltham, Massachusetts 02452 USA
Tel: +1 781 642 8900            Fax: +1 781 642 8904
Mobile: +1 617 780 9777      Blackberry: +1 617 259 8994

-----Original Message-----
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Tihanyi László
Sent: Monday, July 09, 2007 12:36 PM
To: [email protected]
Subject: [Mt-list] common API for online MT systems

Recently a survey of online machine translation services which translate 
between English and the official languages of the EU was conducted:
http://www.morphologic.hu/public/tl/EN-EU_WebMT.htm *
 
Summary:
English translators exist for 17 languages (34 language pairs). The EU 
currently has 23 official languages, of which five are still missing:
Estonian, Irish, Lithuanian, Maltese and Slovak. English-Slovak MT exists, but 
has no online version yet. English-Lithuanian is under preparation in 
cooperation with PROMT. English is an official language in Ireland and Malta.
So very soon only one country (Estonia) will be without a translator 
between English and one of its official languages. Coverage measured by 
population will then reach 99.73%.
 
There are 18 suppliers: Amebis, D'Agostini, IBM, Institute of Language and 
Communication, Kielikone, Linguatec, LocalTranslation, LogoVista, MorphoLogic, 
Poleng, ProLangs, PROMT, SkyCode, Sunda, Systran, Tranexp, Translendium, and 
Trident.
 
The broad coverage and the manageable number of participants gave the following 
idea:
 
PROPOSAL:
 
A common interface for MT servers should be defined. European companies, 
institutions and organizations with translation needs between all of these 
languages could use this common interface. The details of this API could be 
discussed in this forum. I suggest the name EAMT API. Later the API could be 
published on the EAMT web site together with the list of services that can be 
reached via this API.
 
The connection between the MT services and the licensors would remain direct; 
the URL of the service would be a parameter of the API. Also, subscription 
methods for the services could remain unchanged and managed directly between 
the service providers and the subscribers.
 
This API would serve only as a guideline for web translators. Since the 
connection would remain direct, additions of extra features would also remain 
possible.
 
The main goal of the common API is to declare an association of web 
translators, which together cover nearly all of the languages of Europe. The 
list of MT providers might also initiate or enliven the development of missing 
language pairs.
 
 Some details, to start the debate:
 
The proposed API would look like the following: the caller addresses the URL of 
the MT service, and sends the following parameters: language pair, text to be 
translated, format, encoding, domain, and a code that identifies the requestor. 
The reply would be the translation. The identification code could be used for 
both time-based and traffic-based services. To keep the list uniform, free 
services have to be excluded. Free trial services can be (and usually are) 
provided at the provider's own site to gain popularity and traffic. The 
published list of MT providers aims to reach big customers who need complete 
solutions for all of the languages. There would be no need even for a common 
test site; the list would contain only references to websites where the tests 
could be done.
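
To make the call described above concrete, here is a minimal sketch of how a 
client might assemble such a request. It is only an illustration for the 
debate: the query parameter names (langpair, text, format, encoding, domain, 
id), the use of an HTTP query string, and the example service URL are all 
assumptions of mine, not part of any agreed specification.

```python
from dataclasses import dataclass
from urllib.parse import urlencode


@dataclass(frozen=True)
class TranslationRequest:
    """The parameters named in the proposal. All field names are hypothetical."""
    source_lang: str            # language pair, source side, e.g. "en"
    target_lang: str            # language pair, target side, e.g. "hu"
    text: str                   # text to be translated
    fmt: str = "text"           # format of the text (assumed values: "text", "html")
    encoding: str = "utf-8"     # character encoding of the text
    domain: str = "general"     # subject domain
    requestor_id: str = ""      # code that identifies the requestor


def build_request_url(service_url: str, req: TranslationRequest) -> str:
    """Combine the provider's own URL with the common parameters.

    The service URL stays a parameter, so the connection between the
    subscriber and the provider remains direct.
    """
    params = {
        "langpair": f"{req.source_lang}-{req.target_lang}",
        "text": req.text,
        "format": req.fmt,
        "encoding": req.encoding,
        "domain": req.domain,
        "id": req.requestor_id,
    }
    # Percent-encode the values using the declared character encoding.
    return service_url + "?" + urlencode(params, encoding=req.encoding)


# Example with a made-up provider URL and requestor code.
url = build_request_url(
    "https://mt.example.org/translate",
    TranslationRequest("en", "hu", "Good morning", requestor_id="demo-123"),
)
print(url)
```

Only the service URL would change from one provider to another; the parameter 
set stays the same, which is the whole point of a common interface. A real 
specification would still have to settle the transport, the error replies and 
the shape of the response.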
 
Possible customers: first of all, the European Union itself, which uses 
EC-Systran, which currently covers 10 language pairs (from English to: NL, FR, 
DE, EL, IT, ES, PT and into English from: FR, DE, ES). With the use of the EAMT 
API, the list of available language pairs could be extended by 26.
 
This idea was also initiated by the growing challenge of statistical MT-systems 
that promise full coverage of language pairs between European languages. The 
services in the above survey are basically rule based, but the API is naturally 
open for statistical systems, too.
 
No comment on quality is provided, even though we have strong opinions on the 
output of the different systems. I left out quality information since it varies 
constantly with time, domain and evaluation method. The average number of 
solutions for an existing language pair is 3.1, so subscribers have an 
opportunity to make their own choices.
 
Possible extensions of the list: adding more pivot languages (e.g. French), 
adding minority languages (e.g. Catalan), adding European but not member state 
languages (e.g. Russian) or adding every language of the world.
 
The idea needs support from a group of members, and we will also need support 
from EAMT to allow the publication of the API and the list of systems that 
implement it. 
 
Waiting for your reflections,
Mr. Laszlo Tihanyi
MorphoLogic
Head of MT Department
 
* The list is based on the thirteenth edition of the Compendium (by John
Hutchins) and was extended by my private research. The Compendium lists 231 
language pairs that have MT systems. I studied the 44 English-EU language pairs 
only, of which 34 have MT systems. For these 34 language pairs the Compendium 
lists 480 solutions. My list contains only 137, as I ignored some systems for 
the following reasons: there was no online version, there were duplicate or 
alternative versions, the developer couldn't be identified, there was no 
English information at the site, or it was a different dialect.


_______________________________________________
Mt-list mailing list