Abha created TIKA-3518:
--------------------------

             Summary: Tika 1.26 not Working with Tesseract 4.0 and Higher Version
                 Key: TIKA-3518
                 URL: https://issues.apache.org/jira/browse/TIKA-3518
             Project: Tika
          Issue Type: Bug
          Components: ocr, tika-batch, tika-dl, tika-server
    Affects Versions: 1.26
            Reporter: Abha


ProcessBuilder does not create the temporary file for Tesseract 4.1 and higher
versions with Tika 1.26 and JDK 1.8

I am working on a project that integrates Tika with Tesseract OCR, using Tika
1.26 and JDK 1.8. Any Tesseract version earlier than 4.0 works fine and
extracts the image/PDF data correctly, but upgrading Tesseract to 4.1.1 or
higher results in no data being extracted. I debugged the issue and found that
the process started through ProcessBuilder does not create the temporary .txt
output file from which the OCR result is read, which leads to the empty
result. Is this a version compatibility issue, and how can it be resolved?
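
One way to narrow this down outside Tika might be to start Tesseract by hand
through ProcessBuilder and look at what it prints. The following is only a
minimal sketch, not Tika's actual code: the class TesseractProbe and its
helper methods are hypothetical names, the language "eng" is an assumption,
and the only thing relied on is the documented command-line form
"tesseract <image> <outputbase> -l <lang>", which writes <outputbase>.txt.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;

// Hypothetical diagnostic helper; written against JDK 1.8 APIs only.
class TesseractProbe {

    // Builds the argument list: <exe> <image> <outputbase> -l <lang>
    static List<String> buildCommand(String tesseractExe, Path image,
                                     Path outputBase, String lang) {
        List<String> cmd = new ArrayList<>();
        cmd.add(tesseractExe);
        cmd.add(image.toString());
        cmd.add(outputBase.toString());
        cmd.add("-l");
        cmd.add(lang);
        return cmd;
    }

    // Tesseract appends ".txt" to the output base for plain-text output.
    static Path expectedOutput(Path outputBase) {
        return Paths.get(outputBase.toString() + ".txt");
    }

    public static void main(String[] args)
            throws IOException, InterruptedException {
        if (args.length < 2) {
            System.err.println(
                "usage: java TesseractProbe <tesseract-executable> <image>");
            return;
        }
        Path workDir = Files.createTempDirectory("tess-probe");
        Path outputBase = workDir.resolve("out");
        List<String> cmd =
            buildCommand(args[0], Paths.get(args[1]), outputBase, "eng");

        ProcessBuilder pb = new ProcessBuilder(cmd);
        // Merge stderr into stdout so error messages are not lost.
        pb.redirectErrorStream(true);
        Process process = pb.start();

        // Drain the output before waiting, so the child cannot block
        // on a full pipe buffer.
        try (BufferedReader reader = new BufferedReader(new InputStreamReader(
                process.getInputStream(), StandardCharsets.UTF_8))) {
            String line;
            while ((line = reader.readLine()) != null) {
                System.out.println("[tesseract] " + line);
            }
        }
        int exitCode = process.waitFor();

        Path out = expectedOutput(outputBase);
        System.out.println("exit code: " + exitCode);
        System.out.println("output file " + out + " exists: "
            + Files.exists(out));
    }
}
```

If the exit code is non-zero or the .txt file is missing, the lines printed
with the "[tesseract]" prefix may show what the newer Tesseract version is
rejecting, which could help tell a command-line or installation problem apart
from a problem inside Tika.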



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
