Recently a survey of online machine translation services which translate
between English and the official languages of the EU was conducted:
http://www.morphologic.hu/public/tl/EN-EU_WebMT.htm *
 
Summary:
There are English translators for 17 languages (34 language pairs). The EU
has 23 official languages today, English included, so 5 languages are
missing: Estonian, Irish, Lithuanian, Maltese and Slovak. English-Slovak MT
exists, but there is no online version yet. English-Lithuanian is being
prepared in cooperation with PROMT. English is an official language in
Ireland and Malta. So very soon only one country (Estonia) will be without a
translator between English and one of its official languages. Coverage
measured by population will then reach 99.73%.
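
The counts above can be cross-checked with a few lines of arithmetic. This is
only a sketch: it assumes that every language is counted as two directed
pairs (into and out of English), which is consistent with the 17 languages /
34 pairs figure. The variable names are mine.

```python
# Cross-check of the language counts quoted in the summary.
# Assumption: each language contributes two directed pairs with English.
official_languages = 23                      # EU official languages, incl. English
non_english = official_languages - 1         # languages that need an English pair
covered = 17                                 # languages with an online translator
missing = ["Estonian", "Irish", "Lithuanian", "Maltese", "Slovak"]

# Covered plus missing languages must add up to all non-English languages.
assert covered + len(missing) == non_english

possible_pairs = 2 * non_english             # all English-EU language pairs
covered_pairs = 2 * covered                  # pairs that have an online service

print(possible_pairs, covered_pairs)
```

The two results correspond to the total number of English-EU language pairs
and to the number of pairs that currently have an online service.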
 
There are 18 suppliers: Amebis, D'Agostini, IBM, Institute of Language and
Communication, Kielikone, Linguatec, LocalTranslation, LogoVista,
MorphoLogic, Poleng, ProLangs, PROMT, SkyCode, Sunda, Systran, Tranexp,
Translendium, and Trident.
 
The broad coverage and the manageable number of participants suggest the
following idea:
 
PROPOSAL:
 
A common interface for MT servers should be defined. European companies,
institutions and organizations with translation needs between all of these
languages could use this common interface. The details of this API could be
discussed in this forum. I suggest calling it the EAMT API. Later the API
could be published on the EAMT website together with the list of services
that can be reached through it.
 
The connection between the MT services and the licensors would remain
direct; the URL of the service would be a parameter of the API. Also,
subscription methods for the services could remain unchanged and managed
directly between the service providers and the subscribers.
 
This API would serve only as a guideline for web translators. Since the
connection would remain direct, providers could still add extra features of
their own.
 
The main goal of the common API is to declare an association of web
translators, which together cover nearly all of the languages of Europe. The
list of MT providers might also initiate or enliven the development of
missing language pairs.
 
Some details, to start the debate:
 
The proposed API would work as follows: the caller addresses the URL of the
MT service and sends the following parameters: language pair, text to be
translated, format, encoding, domain, and a code that identifies the
requestor. The reply would be the translation. The identification code could
be used for both time-based and traffic-based services. To keep the list
uniform, free services have to be excluded. Free trial services can be (and
usually are) provided on the provider's own site to gain popularity and
traffic. The published list of MT providers aims to reach large customers who
need a complete solution covering all of the languages. There would not even
be a need for a common test site; the list would contain only references to
websites where the tests could be done.
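
To make the discussion concrete, here is a minimal sketch of what such a call
could look like. Everything in it is an assumption for illustration only: the
parameter names (langpair, text, format, encoding, domain, id), the use of
HTTP query parameters, the default values and the example URL are all
hypothetical. Only the list of parameters itself comes from the proposal.

```python
from dataclasses import dataclass
from urllib.parse import urlencode


@dataclass
class TranslationRequest:
    """The parameters named in the proposal (field names are hypothetical)."""
    source_lang: str
    target_lang: str
    text: str
    fmt: str = "text"          # format of the text, e.g. plain text or HTML
    encoding: str = "utf-8"
    domain: str = "general"
    requestor_id: str = ""     # code that identifies the requestor


def build_request_url(service_url: str, req: TranslationRequest) -> str:
    """Build the request URL; the service URL is itself a parameter."""
    params = {
        "langpair": f"{req.source_lang}-{req.target_lang}",
        "text": req.text,
        "format": req.fmt,
        "encoding": req.encoding,
        "domain": req.domain,
        "id": req.requestor_id,
    }
    # urlencode percent-encodes the values so any text is safe in a URL.
    return f"{service_url}?{urlencode(params)}"


# Hypothetical provider URL; a real list would hold one URL per service.
request = TranslationRequest("en", "hu", "Good morning", requestor_id="demo-123")
url = build_request_url("https://mt.example.org/translate", request)
print(url)
```

Because the service URL is just an argument, the same request can be sent to
any provider on the list, while subscription and billing stay between the
subscriber and that provider.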
 
Possible customers: first of all, the European Union itself, which uses
EC-Systran. It currently covers 10 language pairs (from English into NL, FR,
DE, EL, IT, ES, PT and into English from FR, DE, ES). With the use of the
EAMT API, the list of available language pairs could be extended by 26.
 
This idea was also prompted by the growing challenge from statistical MT
systems, which promise full coverage of language pairs between European
languages. The services in the above survey are basically rule-based, but
the API is naturally open to statistical systems, too.
 
No comment on quality is provided even though we have strong opinions on the
output of the different systems. I left out quality information because it
changes constantly with time, domain and evaluation method. The average
number of solutions for an existing language pair is 3.1, so subscribers
have an opportunity to make their own choices.
 
Possible extensions of the list: adding more pivot languages (e.g. French),
adding minority languages (e.g. Catalan), adding European languages of
non-member states (e.g. Russian), or adding every language of the world.
 
The idea needs support from a group of members, and we will also need support
from EAMT to allow the publication of the API and the list of systems that
implement it.
 
Waiting for your reflections,
Mr. Laszlo Tihanyi
MorphoLogic
Head of MT Department
 
* The list is based on the thirteenth edition of the Compendium (by John
Hutchins) and was extended by my private research. The Compendium lists 231
language pairs that have MT systems. I studied only the 44 English-EU
language pairs, of which 34 have MT systems. For these 34 language pairs the
Compendium lists 480 solutions. My list contains only 137, as I ignored some
systems for the following reasons: there was no online version, there were
duplicate or alternative versions, the developer couldn't be identified,
there was no English information on the site, or it was a different dialect.


_______________________________________________
Mt-list mailing list
