Dear Wiki user, You have subscribed to a wiki page or wiki category on "Tika Wiki" for change notification.
The "TikaOCR" page has been changed by DaveMeikle:
https://wiki.apache.org/tika/TikaOCR?action=diff&rev1=5&rev2=6

Comment:
Added information on new X-Tika-OCRLanguage header for Tika Server

  Once you have Tesseract and a fresh build of Tika 1.7-SNAPSHOT (including Tika server), you can easily use Tika-Server with Tesseract. For example, to post a TIFF file to the server and get back its OCR extracted text, run the following commands:

- == in another window, start Tika server ==
+ === in another window, start Tika server ===

  `java -jar /path/to/tika-server-1.7-SNAPSHOT.jar`

- == in another window, issue a cURL request ==
+ === in another window, issue a cURL request ===

  `curl -T /path/to/tiff/image.tiff http://localhost:9998/tika --header "Content-type: image/tiff"`
+
+ == Overriding the configured language as part of your request ==
+
+ Different requests may need processing using different language models. These can be specified for specific requests using the ''X-Tika-OCRLanguage'' custom header. An example of this is shown below:
+
+ `curl -T /path/to/tiff/image.jpg http://localhost:9998/tika --header "X-Tika-OCRLanguage: eng"`
+
+ Or for multiple languages:
+
+ `curl -T /path/to/tiff/image.jpg http://localhost:9998/tika --header "X-Tika-OCRLanguage: eng+fra"`

  = Overriding Default Configuration =
