Hello! Tika is awesome! It builds without problems if I skip the tests. However, OCR of inline images fails, for example images embedded in PDFs. OCR of standalone image files works; only embedded images fail. I see the same issue in the GUI app (PDFs are OK and images are OK, but PDFs containing images are not), and the same happens in my own application.
Is there a trick to make it work? The unit tests for inline images also all fail, so I assume there is a configuration issue. In case they are needed, I have set tesseractPath, tessdataPath, and the path to magick.exe in the properties file in the Maven project (Tika 1.7):

tesseractPath="C:/Program Files (x86)/Tesseract-OCR"
tessdataPath="C:/Program Files (x86)/Tesseract-OCR/tessdata"
ImageMagickPath="C:/Program Files (x86)/ImageMagick"

Is there anything specific I need to configure to make inline OCR work? Is this maybe a Windows thing?

Best,
Ulrich
ObjectSecurity