To get TesseractOCRParserTest to run after installing Tesseract on OSX with 
“brew install tesseract”, I had to set the paths explicitly (diff below).

Any thoughts on how we could tell users that they might need to tweak the 
path to run the unit tests?  I was thinking about adding some sort of 
message, but I don’t know whether we already have a pattern for that in Tika 
for these external dependencies.

Thoughts?
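
One possible shape for that messaging, as a rough sketch rather than anything that exists in Tika today: let the test read an optional install directory from a system property or environment variable, and build a skip message that tells the user how to set it. The names `tesseract.path`, `TESSERACT_PATH`, and the `TesseractPathHint` class are all hypothetical, made up for illustration.

```java
import java.io.File;
import java.util.Optional;

public class TesseractPathHint {
    // Hypothetical names for illustration; these are NOT existing Tika settings.
    public static final String PATH_PROPERTY = "tesseract.path";
    public static final String PATH_ENV = "TESSERACT_PATH";

    /** Picks the configured directory, preferring the system property over the env variable. */
    public static Optional<String> configuredDir(String propertyValue, String envValue) {
        if (propertyValue != null && !propertyValue.trim().isEmpty()) {
            return Optional.of(propertyValue.trim());
        }
        if (envValue != null && !envValue.trim().isEmpty()) {
            return Optional.of(envValue.trim());
        }
        return Optional.empty();
    }

    /** True if a file named "tesseract" or "tesseract.exe" exists directly inside dir. */
    public static boolean hasTesseract(String dir) {
        return new File(dir, "tesseract").isFile()
                || new File(dir, "tesseract.exe").isFile();
    }

    /** Builds the message shown when the OCR tests are skipped. */
    public static String skipMessage(Optional<String> dir) {
        String reason = dir.isPresent()
                ? "tesseract not found in " + dir.get()
                : "no tesseract directory was configured";
        return "Skipping OCR tests: " + reason
                + ". Set -D" + PATH_PROPERTY + "=<dir> or " + PATH_ENV
                + " to point at your install.";
    }

    public static void main(String[] args) {
        Optional<String> dir =
                configuredDir(System.getProperty(PATH_PROPERTY), System.getenv(PATH_ENV));
        if (dir.isPresent() && hasTesseract(dir.get())) {
            System.out.println("Using tesseract from " + dir.get());
        } else {
            System.out.println(skipMessage(dir));
        }
    }
}
```

In the test itself, `canRun()` could pass the resolved directory to `config.setTesseractPath(...)` when one is present, and the message could be handed to JUnit's `Assume.assumeTrue(String, boolean)` so that a skipped run explains what to change. That wiring is an assumption about how it might fit, not a tested patch.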

diff --git a/tika-parsers/src/test/java/org/apache/tika/parser/ocr/TesseractOCRParserTest.java b/tika-parsers/src/test/java/org/apache/tika/parser/ocr/TesseractOCRParserTest.java
index 9ebcee068..32db2c442 100644
--- a/tika-parsers/src/test/java/org/apache/tika/parser/ocr/TesseractOCRParserTest.java
+++ b/tika-parsers/src/test/java/org/apache/tika/parser/ocr/TesseractOCRParserTest.java
@@ -51,6 +51,7 @@ public class TesseractOCRParserTest extends TikaTest {
 
     public static boolean canRun() {
         TesseractOCRConfig config = new TesseractOCRConfig();
+        config.setTesseractPath("/usr/local/bin");
         TesseractOCRParserTest tesseractOCRTest = new TesseractOCRParserTest();
         return tesseractOCRTest.canRun(config);
     }
@@ -164,6 +165,8 @@ public class TesseractOCRParserTest extends TikaTest {
                           BasicContentHandlerFactory.HANDLER_TYPE handlerType,
                           TesseractOCRConfig.OUTPUT_TYPE outputType) throws Exception {
         TesseractOCRConfig config = new TesseractOCRConfig();
+        config.setTesseractPath("/usr/local/bin");
+        config.setTessdataPath("/usr/local/Cellar/tesseract/4.1.0/share/tessdata");
         config.setOutputType(outputType);
         
         Parser parser = new RecursiveParserWrapper(new AutoDetectParser(),
_______________________
Eric Pugh | Founder & CEO | OpenSource Connections, LLC | 434.466.1467 | 
http://www.opensourceconnections.com <http://www.opensourceconnections.com/> | 
My Free/Busy <http://tinyurl.com/eric-cal>  
Co-Author: Apache Solr Enterprise Search Server, 3rd Ed 
<https://www.packtpub.com/big-data-and-business-intelligence/apache-solr-enterprise-search-server-third-edition-raw>
    
