-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/22219/#review44758
-----------------------------------------------------------

trunk/tika-core/src/main/java/org/apache/tika/Tika.java
<https://reviews.apache.org/r/22219/#comment79299>

    You don't need this. You can do service loading just like we do with the Parser class. Can you check that out? We should just need the interface.


trunk/tika-core/src/main/java/org/apache/tika/Tika.java
<https://reviews.apache.org/r/22219/#comment79301>

    This should be dynamically loaded via Java SPI.


trunk/tika-core/src/main/java/org/apache/tika/language/MicrosoftTranslator.java
<https://reviews.apache.org/r/22219/#comment79303>

    Use Eclipse or IntelliJ IDEA to auto-generate the Javadoc for interfaces?


- Chris Mattmann


On June 4, 2014, 7:17 p.m., Tyler Palsulich wrote:
>
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/22219/
> -----------------------------------------------------------
>
> (Updated June 4, 2014, 7:17 p.m.)
>
>
> Review request for tika and Chris Mattmann.
>
>
> Repository: tika
>
>
> Description
> -------
>
> This patch adds basic language translation functionality to Tika. Translation
> is provided by a Microsoft API, but accessed through the Apache 2 licensed
> com.memetix.microsoft-translator-java-api
> (https://code.google.com/p/microsoft-translator-java-api/ ). If a user wants
> to use the translation feature, they have to add a client id and client
> secret to the
> tika-core/src/main/resources/org/apache/tika/language/translator.properties
> file (see http://msdn.microsoft.com/en-us/library/hh454950.aspx ). I added
> com.memetix as a dependency in tika-core. I put the Translator class in
> org.apache.tika.language. There is no integration with the server or CLI
> yet. Further, only Strings are translated right now -- if you pass in a full
> document with XML tags, the structure will be mangled. But I think that
> would be a cool feature -- translate the body, title, subtitle, etc., but not
> the structural elements.
>
> There is still more work to do, but I wanted some more eyes on this to make
> sure I'm heading in the right direction and that this is a desired feature.
> Let me know what you think!
>
>
> Diffs
> -----
>
>   trunk/tika-core/pom.xml 1600418
>   trunk/tika-core/src/main/java/org/apache/tika/Tika.java 1600418
>   trunk/tika-core/src/main/java/org/apache/tika/language/MicrosoftTranslator.java PRE-CREATION
>   trunk/tika-core/src/main/java/org/apache/tika/language/Translator.java PRE-CREATION
>   trunk/tika-core/src/main/resources/org/apache/tika/language/translator.microsoft.properties PRE-CREATION
>   trunk/tika-core/src/test/java/org/apache/tika/language/MicrosoftTranslatorTest.java PRE-CREATION
>
> Diff: https://reviews.apache.org/r/22219/diff/
>
>
> Testing
> -------
>
> There are two simple unit tests for now, which translate "hello" to French
> ("salut"): one for inputting the source and target languages, and one for
> inputting just the target language (and detecting the source language
> automatically).
>
>
> Thanks,
>
> Tyler Palsulich
>
>
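
For reference, the Java SPI suggestion in the review comments above could look roughly like the sketch below. It uses the JDK's java.util.ServiceLoader; the way Tika loads Parser implementations may differ in detail. The class name TranslatorLoading, the helper loadTranslators, and the translate signature are assumptions made for illustration, not code from the patch.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.ServiceLoader;

public class TranslatorLoading {

    /** Hypothetical stand-in for the patch's Translator interface; the real signature may differ. */
    public interface Translator {
        String translate(String text, String sourceLanguage, String targetLanguage) throws Exception;
    }

    /**
     * Collects every Translator implementation that is registered in a
     * provider-configuration file under META-INF/services/ (the file is named
     * after the binary name of the interface and lists one implementation
     * class per line).
     */
    public static List<Translator> loadTranslators(ClassLoader loader) {
        List<Translator> found = new ArrayList<>();
        for (Translator translator : ServiceLoader.load(Translator.class, loader)) {
            found.add(translator);
        }
        return found;
    }

    public static void main(String[] args) {
        List<Translator> translators =
                loadTranslators(TranslatorLoading.class.getClassLoader());
        if (translators.isEmpty()) {
            // Nothing registered: callers only depend on the interface.
            System.out.println("No Translator implementation registered");
        } else {
            System.out.println("Using " + translators.get(0).getClass().getName());
        }
    }
}
```

With this approach the calling code depends only on the interface, and an implementation such as MicrosoftTranslator is picked up at runtime when it is listed in the provider-configuration file.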
