[
https://issues.apache.org/jira/browse/TIKA-1343?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15607325#comment-15607325
]
Hudson commented on TIKA-1343:
------------------------------
UNSTABLE: Integrated in Jenkins build Tika-trunk #1126 (See
[https://builds.apache.org/job/Tika-trunk/1126/])
TIKA-1343 Create a Tika Translator implementation that uses (lewis.j.mcgibbney:
rev d4fb28f91d77458b15557942438f874b9f564e88)
* (edit) tika-core/src/main/java/org/apache/tika/language/detect/LanguageResult.java
* (edit) tika-translate/src/main/java/org/apache/tika/language/translate/AbstractTranslator.java
* (edit) tika-translate/src/main/resources/org/apache/tika/language/translate/translator.yandex.properties
* (edit) tika-parsers/pom.xml
* (add) tika-translate/src/main/java/org/apache/tika/language/translate/JoshuaNetworkTranslator.java
* (add) tika-translate/src/main/resources/org/apache/tika/language/translate/translator.joshua.properties
* (edit) tika-translate/src/main/resources/org/apache/tika/language/translate/translator.google.properties
* (edit) tika-translate/src/main/java/org/apache/tika/language/translate/GoogleTranslator.java
* (edit) tika-core/src/main/java/org/apache/tika/language/translate/Translator.java
* (edit) tika-translate/src/main/resources/org/apache/tika/language/translate/translator.moses.properties
* (edit) tika-translate/src/test/java/org/apache/tika/language/translate/MicrosoftTranslatorTest.java
* (add) tika-translate/src/test/java/org/apache/tika/language/translate/JoshuaNetworkTranslatorTest.java
* (edit) tika-translate/src/main/resources/org/apache/tika/language/translate/translator.lingo24.properties
* (edit) tika-translate/src/main/java/org/apache/tika/language/translate/MosesTranslator.java
* (edit) tika-translate/src/test/java/org/apache/tika/language/translate/YandexTranslatorTest.java
TIKA-1343 Create a Tika Translator implementation that uses (lewis.mcgibbney:
rev dadbf55c51d166846aa0d365fd2ed340b604bfae)
* (edit) tika-translate/src/test/java/org/apache/tika/language/translate/JoshuaNetworkTranslatorTest.java
* (edit) tika-translate/src/main/java/org/apache/tika/language/translate/JoshuaNetworkTranslator.java
* (edit) tika-translate/src/main/resources/META-INF/services/org.apache.tika.language.translate.Translator
* (edit) tika-translate/src/main/resources/org/apache/tika/language/translate/translator.joshua.properties
> Create a Tika Translator implementation that uses JoshuaDecoder
> ---------------------------------------------------------------
>
> Key: TIKA-1343
> URL: https://issues.apache.org/jira/browse/TIKA-1343
> Project: Tika
> Issue Type: New Feature
> Components: translation
> Reporter: Chris A. Mattmann
> Assignee: Lewis John McGibbney
> Fix For: 1.15
>
>
> The Joshua Decoder toolkit is a BSD-licensed, Java-based statistical machine
> translation system hosted at GitHub:
> http://joshua-decoder.org/
> Joshua takes in corpora and trains models that can then be used to do
> language translation. Currently there is support for, e.g., Spanish->English,
> Indian dialects->English, Chinese->English, and a few others.
> https://github.com/joshua-decoder/joshua/
> It would be nice to build a Tika Translator on top of Joshua. There are of
> course several issues with this:
> * the models are huge - so we'll need a separate package or Maven module,
> maybe tika-translate-joshua or something to release the models and we'll need
> to build the models. I just went through the process of building the
> Spanish->English one, and it still needs to be rebuilt b/c I did it wrong,
> but it took over a day
> * there is a configuration for Joshua, and so we need some way of passing
> that config into the Translator. Not sure of the best way to do this.
> * Joshua isn't in the Central repository. I've started a discussion on the
> Joshua lists about this:
> https://groups.google.com/forum/#!topic/joshua_support/9Y04miboUj0
> Anyhoo, I've got a working patch right now with hard-coded values, and a manual
> install into my Maven repo for brave souls out there that want to try it.
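
On the open question above about passing the Joshua configuration into the
Translator: one possible approach, sketched here purely as an illustration, is
to read settings from a properties file on the classpath and fall back to
defaults when it is absent. This is not the code from the commits listed above.
The class name, the key "joshua.server.url", and the default URL are all
hypothetical, since the contents of translator.joshua.properties are not shown
in this message.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.util.Properties;

// Illustrative only: a small holder for translator settings.
// The key name and default value below are assumptions, not Tika's.
class JoshuaConfigSketch {
    static final String SERVER_KEY = "joshua.server.url";        // hypothetical key
    static final String DEFAULT_SERVER = "http://localhost:5674"; // hypothetical default

    private final String serverUrl;

    JoshuaConfigSketch(Properties props) {
        // Use the configured value if present, otherwise the default.
        this.serverUrl = props.getProperty(SERVER_KEY, DEFAULT_SERVER).trim();
    }

    // Load settings from a classpath resource; a missing resource yields defaults.
    static JoshuaConfigSketch fromClasspath(String resource) {
        Properties props = new Properties();
        try (InputStream in = JoshuaConfigSketch.class.getResourceAsStream(resource)) {
            if (in != null) {
                props.load(in);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return new JoshuaConfigSketch(props);
    }

    String getServerUrl() {
        return serverUrl;
    }

    // A translator could report itself unavailable when no endpoint is set.
    boolean isAvailable() {
        return !serverUrl.isEmpty();
    }
}
```

With this shape, a translator implementation could take the config object in
its constructor, which keeps file handling out of the translation logic and
makes it straightforward to supply a different configuration in tests.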