Thanks - I will investigate further.  The initial test I ran based on 
Tom's input showed about the same performance (I used a multi-page TIFF); 
however, the article you referenced indicated a speedup factor of 2x.  

Is there a way to have Tesseract process the pages in parallel?
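
To make the question concrete, here is a rough sketch of the pattern I have in mind: a fixed thread pool where each worker thread creates its engine once and reuses it for every page it handles, so the init/dispose cost is paid once per thread rather than once per page. OcrEngine, ocrPages and the string "pages" are hypothetical stand-ins so the sketch runs on its own; they are not part of Tess4J or Tesseract. It also assumes a single engine instance should not be shared between threads.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.function.Supplier;

public class ParallelOcrSketch {

    /** Hypothetical stand-in for one OCR engine instance (not a Tess4J type). */
    public interface OcrEngine {
        String recognize(String page);
    }

    /**
     * OCRs all pages on a fixed pool of worker threads.
     * Each worker lazily creates ONE engine and reuses it for all its pages.
     * Results are returned in the same order as the input pages.
     */
    public static List<String> ocrPages(List<String> pages, int threads,
                                        Supplier<OcrEngine> engineFactory) throws Exception {
        // One engine per worker thread, created on first use by that thread.
        ThreadLocal<OcrEngine> perThread = ThreadLocal.withInitial(engineFactory);
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            List<Future<String>> futures = new ArrayList<>();
            for (String page : pages) {
                Callable<String> task = () -> perThread.get().recognize(page);
                futures.add(pool.submit(task));
            }
            // Collect in submission order so page order is preserved.
            List<String> results = new ArrayList<>();
            for (Future<String> f : futures) {
                results.add(f.get());
            }
            return results;
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) throws Exception {
        // Dummy engine: a real one would wrap an OCR library call here.
        Supplier<OcrEngine> factory = () -> page -> "text of " + page;
        List<String> pages = Arrays.asList("page1", "page2", "page3", "page4");
        for (String text : ocrPages(pages, 2, factory)) {
            System.out.println(text);
        }
    }
}
```

With Tess4J, I assume the factory would build and configure a Tesseract object and recognize would delegate to its doOCR call on a page image, but I have not verified that this actually gives a speedup on my faxes.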

On Thursday, February 18, 2016 at 9:58:12 PM UTC-5, Quan Nguyen wrote:
>
> If you can reduce or minimize initializing and disposing of Tesseract 
> native instances for every run, you can achieve a significant performance 
> increase.
>
> https://sourceforge.net/p/tess4j/discussion/1202294/thread/d32bd579/ 
>
> On Sunday, February 14, 2016 at 10:15:12 AM UTC-6, viraf wrote:
>>
>> I am new to tesseract and using it through Tess4J.  I am trying to OCR 
>> faxes where pages are represented as TIFF (CCITT T.6) images - 2509 x 3530 
>> @ 300 dpi (1 bit - i.e. BW).  
>>
>> I have two sets of questions
>>
>> *Speed*
>> On an Intel i7-4800MQ @ 2.7GHz I am getting approximately 6 PPM using 1 
>> thread.  I was looking for suggestions on how to speed up page processing. 
>>  I use parallelStream to process each page in a separate thread,
>>
>>
>> - viraf
>>
>

