Hello,

> On Thu, 30 Oct 2014, Milos Kovacevic wrote:
>> I am using tika-server-1.7-SNAPSHOT.jar which incorporates tesseract ocr
>> engine. I am curious how can i set different tesseract parameters such
>> as default language or output format (hOCR) in a separate request to tika
>> server?
>
> I believe they can only be set once on a server-wide basis at the moment

How can I do that?

>
> Could you explain a use case for wanting to change it on a per-request
> basis, to help us understand?

Well, I have a lot of files written in different languages and alphabets,
and OCR performance depends on that information. So when I send, say, an
English file I would set the language to eng, and if the file is Serbian I
would set it to srp. Tesseract uses per-language data files to improve
recognition accuracy.
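
To illustrate the use case, here is a minimal sketch that assumes the
tesseract command-line tool is called directly rather than through
tika-server. The helper name build_tesseract_command and the file names are
hypothetical. The -l option and the hocr config name belong to the tesseract
CLI, and srp is the code Tesseract uses for its Serbian language data.

```python
from typing import List


def build_tesseract_command(image_path: str, output_base: str,
                            lang: str, hocr: bool = False) -> List[str]:
    """Build the argument list for one tesseract run (hypothetical helper)."""
    # tesseract <image> <outputbase> -l <lang> [configfile...]
    cmd = ["tesseract", image_path, output_base, "-l", lang]
    if hocr:
        # The "hocr" config file switches the output format to hOCR.
        cmd.append("hocr")
    return cmd


# Hypothetical files, each paired with the language it is written in.
jobs = [("report_en.png", "eng"), ("izvestaj_sr.png", "srp")]

commands = [
    build_tesseract_command(path, path.rsplit(".", 1)[0], lang, hocr=True)
    for path, lang in jobs
]

for cmd in commands:
    print(" ".join(cmd))
```

Each list could be passed to subprocess.run(cmd, check=True) on a machine
where tesseract and the matching language data are installed. What I am
asking for is the equivalent per-request choice when going through the
server.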

>
> Thanks
> Nick
>

Regards, Milos
