Dear Wiki user,

You have subscribed to a wiki page or wiki category on "Tika Wiki" for change 
notification.

The "TikaOCR" page has been changed by TimothyAllison:
https://wiki.apache.org/tika/TikaOCR?action=diff&rev1=8&rev2=9

  
  `java -cp /path/to/your/classpath:/path/to/tika-server-1.7-SNAPSHOT.jar org.apache.tika.server.TikaServerCli`
  
+ = OCR and PDFs =
+ 
+ See [[https://wiki.apache.org/tika/PDFParser%20%28Apache%20PDFBox%29|PDFParser notes]].
+ 
  = Disable Tika OCR =
  Tika's OCR runs on images embedded in other files, such as office documents, as well as on images you upload directly. Because OCR slows Tika down, you may want to disable it if you don't need the results. The simplest way to disable OCR is to uninstall Tesseract; if that's not an option, here is a tika.xml config file that disables OCR:
  {{{
