Hi Jeremy,

Thanks for reaching out.

So far I have had a really good experience with the Lingo24 translator.
It really depends, though, on what you are trying to do, and the options
fall into two families. For example, if you want the broadest coverage
and the most heavily trained translation, Google, Microsoft, and Lingo24
fall into the remote translation API service category. They all have
tons of data and training. I also think they all use human curators for
quality review of some things. All will eventually cost you. I know that
you get some X million characters of translation a month with these
services.

On the other end, you can deploy your own Apache Joshua (incubating)
and/or Moses MT system, and then have Tika connect to it as a service.
In this case you control the costs and can run it on your own servers,
etc., but you are limited by the quality of your trained models and by
the language pairs you have trained.
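
To make the trade-off concrete, here is a rough sketch of how a system
could prefer a self-hosted engine when it has a model for the language
pair and otherwise fall back to a remote API. Everything in it is
hypothetical: TranslationBackend, SelfHostedBackend, RemoteApiBackend,
and BackendChooser are illustration names and not Tika classes, the
translate methods only tag the text instead of calling a real service,
and treating the remote API as covering every pair is a simplification.

```java
import java.util.List;
import java.util.Set;

/** Hypothetical abstraction over a translation backend; not part of Tika's API. */
interface TranslationBackend {
    String name();
    boolean supports(String sourceLang, String targetLang);
    String translate(String text, String sourceLang, String targetLang);
}

/** Stand-in for a self-hosted engine: it only handles the pairs you trained. */
final class SelfHostedBackend implements TranslationBackend {
    private final Set<String> trainedPairs; // entries such as "es-en"

    SelfHostedBackend(Set<String> trainedPairs) {
        this.trainedPairs = trainedPairs;
    }

    public String name() { return "self-hosted"; }

    public boolean supports(String sourceLang, String targetLang) {
        return trainedPairs.contains(sourceLang + "-" + targetLang);
    }

    public String translate(String text, String sourceLang, String targetLang) {
        // Placeholder: a real implementation would call your own server here.
        return "[" + name() + " " + sourceLang + "->" + targetLang + "] " + text;
    }
}

/** Stand-in for a paid remote API; assumed here to accept any language pair. */
final class RemoteApiBackend implements TranslationBackend {
    public String name() { return "remote-api"; }

    public boolean supports(String sourceLang, String targetLang) { return true; }

    public String translate(String text, String sourceLang, String targetLang) {
        // Placeholder: a real implementation would make a billed HTTP call here.
        return "[" + name() + " " + sourceLang + "->" + targetLang + "] " + text;
    }
}

final class BackendChooser {
    /** Returns the first backend, in preference order, that supports the pair. */
    static TranslationBackend choose(List<TranslationBackend> inPreferenceOrder,
                                     String sourceLang, String targetLang) {
        for (TranslationBackend backend : inPreferenceOrder) {
            if (backend.supports(sourceLang, targetLang)) {
                return backend;
            }
        }
        throw new IllegalArgumentException(
            "No backend supports " + sourceLang + "->" + targetLang);
    }
}
```

Listing the self-hosted backend first keeps the paid API for the pairs
you have not trained yourself; reversing the order would favor coverage
and quality over cost.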

Does this make sense?

Cheers,
Chris

++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
Chris Mattmann, Ph.D.
Principal Data Scientist, Engineering Administrative Office (3010)
Manager, NSF & Open Source Projects Formulation and Development Offices (8212)
NASA Jet Propulsion Laboratory Pasadena, CA 91109 USA
Office: 180-503E, Mailstop: 180-503
Email: [email protected]
WWW:  http://sunset.usc.edu/~mattmann/
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
Director, Information Retrieval and Data Science Group (IRDS)
Adjunct Associate Professor, Computer Science Department
University of Southern California, Los Angeles, CA 90089 USA
WWW: http://irds.usc.edu/
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++


From: "Merrill, Jeremy" <[email protected]>
Reply-To: "[email protected]" <[email protected]>
Date: Monday, March 20, 2017 at 8:30 AM
To: "[email protected]" <[email protected]>
Subject: machine translation recommendation for use with Tika?

Hi friends,

I've been tasked with figuring out how to machine-translate a large set of 
documents from a common European language into English, using a system that 
already utilizes Tika.

I know Tika integrates with a handful of machine-translation APIs
<https://tika.apache.org/1.14/api/org/apache/tika/language/translate/package-summary.html>.
Do you all have a sense of which works best, both in terms of translation 
quality and ease of integration with Tika?

(We know we're going to have to pay, but the amount of content won't be huge, 
so differences in price aren't a big factor.)

Thanks in advance,
Jeremy B. Merrill

