The "TikaOCR" page has been changed by ChrisMattmann:
https://wiki.apache.org/tika/TikaOCR?action=diff&rev1=2&rev2=3

  3. install leptonica with tiff support `brew install leptonica --with-libtiff`
  4. install tesseract `brew install tesseract --all-languages`

+ = Using Tika and Tesseract =
+
+ Once you have Tesseract installed, test it to make sure it's working. A quick command line test:
+
+ `tesseract -psm 3 /path/to/tiff/file.tiff out`
+
+ Tesseract appends `.txt` to the output base name, so the extracted text is written to out.txt.
+
+ `cat out.txt`
+
+ Look for the text extracted by Tesseract.
+
+ Once you have confirmed Tesseract is working, you can use the Tika-app, built with 1.7-SNAPSHOT or
+ later, to run OCR through Tika. For example, try the same file with Tika:
+
+ `tika -t /path/to/tiff/file.tiff`
+
+ That's it! You should see the text extracted by Tesseract and flowed through Tika.
+
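The two checks above can be combined into one small script. This is only a sketch under stated assumptions: the `TIKA_JAR` variable, the function names, and the `out` base name are illustrative and not part of the wiki page, and it assumes the tika-app jar is run with `java -jar` instead of a `tika` wrapper command.

```shell
#!/usr/bin/env bash
# Smoke-test sketch: run Tesseract directly, then run the same file through Tika.
# Assumptions (not from the wiki page): TIKA_JAR points at a tika-app jar
# built from 1.7-SNAPSHOT or later; the image path is passed as the first argument.

# Tesseract appends ".txt" to the output base name it is given.
ocr_output_path() {
  printf '%s.txt\n' "$1"
}

# Run Tesseract on an image and print the extracted text.
run_tesseract() {
  local image="$1" outbase="$2"
  # -psm 3 selects fully automatic page segmentation (Tesseract 3.x spelling;
  # Tesseract 4 and later spell this option --psm).
  tesseract -psm 3 "$image" "$outbase" || return 1
  cat "$(ocr_output_path "$outbase")"
}

# Run the same image through tika-app in plain-text mode (-t).
run_tika() {
  local image="$1"
  java -jar "${TIKA_JAR:?set TIKA_JAR to your tika-app jar}" -t "$image"
}

main() {
  local image="$1"
  if ! command -v tesseract >/dev/null 2>&1; then
    echo "tesseract not found on PATH" >&2
    return 1
  fi
  run_tesseract "$image" out && run_tika "$image"
}

# Only run when an image path is supplied, so the functions can also be sourced.
if [ "$#" -gt 0 ]; then
  main "$@"
fi
```

If the script is saved as, say, `ocr-check.sh` (a hypothetical name), it could be invoked as `TIKA_JAR=/path/to/tika-app.jar bash ocr-check.sh /path/to/tiff/file.tiff`. If both commands print roughly the same text, Tika is picking up Tesseract.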
