[Tika Wiki] Update of "TikaOCR" by TimothyAllison

Apache Wiki Thu, 23 Mar 2017 05:15:41 -0700

The "TikaOCR" page has been changed by TimothyAllison:
https://wiki.apache.org/tika/TikaOCR?action=diff&rev1=11&rev2=12

  With [[https://issues.apache.org/jira/browse/TIKA-93|TIKA-93]] you can now use the awesome Tesseract OCR parser within Tika!
  
- First some instructions on getting it installed.
+ First some instructions on getting it installed. See Tesseract's [[https://github.com/tesseract-ocr/tesseract/wiki|readme]].
  
  = Mac Installation Instructions =
  
@@ -27, +27 @@

   2. uninstall leptonica `brew uninstall leptonica`
   3. install leptonica with tiff support `brew install leptonica --with-libtiff`
   4. install tesseract `brew install tesseract --all-languages --with-serial-num-pack`
+ 
+ = Installing Tesseract on RHEL =
+  1. Add "epel" to your yum repositories if it isn't already installed
+ 
+  1a. `wget https://dl.fedoraproject.org/pub/epel/epel-release-latest-7.noarch.rpm` (or appropriate version)
+ 
+  1b. `rpm -Uvh epel-release-latest-7.noarch.rpm`
+ 
+  2. `yum install tesseract`
+  3. To add language packs, see what's available with `yum search tesseract`, then install one, e.g. `yum install tesseract-langpack-ara`
+ 
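The RHEL steps above can be strung together in a small script. This is a minimal sketch and not part of the wiki page itself: the `run` helper, the `install_tesseract_rhel` function name, the `DRY_RUN` variable, and the `-y` flag on `yum install` are assumptions added for illustration. By default it only prints the commands it would run.

```shell
#!/usr/bin/env bash
# Sketch of the RHEL install steps (1a, 1b, 2, 3) as one script.
# DRY_RUN, run, and install_tesseract_rhel are hypothetical names, not from the wiki.
set -euo pipefail

# Default to printing commands only; set DRY_RUN=0 (and run as root) to execute them.
DRY_RUN="${DRY_RUN:-1}"

# EPEL release package for RHEL 7; pick the appropriate version for your system.
EPEL_RPM_URL="https://dl.fedoraproject.org/pub/epel/epel-release-latest-7.noarch.rpm"

# Print the command in dry-run mode, otherwise execute it.
run() {
  if [ "$DRY_RUN" = "1" ]; then
    echo "+ $*"
  else
    "$@"
  fi
}

# Install EPEL, then tesseract, then an optional language pack (first argument).
install_tesseract_rhel() {
  local langpack="${1:-}"
  local rpm_file="${EPEL_RPM_URL##*/}"   # file name part of the URL

  run wget "$EPEL_RPM_URL"               # step 1a
  run rpm -Uvh "$rpm_file"               # step 1b
  run yum install -y tesseract           # step 2 (-y is an added assumption)
  if [ -n "$langpack" ]; then
    run yum install -y "$langpack"       # step 3
  fi
}

install_tesseract_rhel "tesseract-langpack-ara"
```

Running the script as written prints the four commands without touching the system. To choose a different language pack, list what is available with `yum search tesseract` and pass that package name instead of `tesseract-langpack-ara`.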
+ = Installing Tesseract on Windows =
+ See [[https://github.com/UB-Mannheim/tesseract/wiki|UB-Mannheim]].
  
  = Using Tika and Tesseract =
