If you can avoid initializing and disposing of the Tesseract native instance on every run, you can achieve a significant performance increase.
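As a rough illustration of the idea, here is a minimal sketch of a fixed-size pool that creates each engine once and hands it out to the worker threads of a parallelStream. Everything in it is hypothetical: `OcrEngine`, `EnginePool`, and `processAll` are names made up for this example, and the engine is a stand-in lambda rather than a real Tess4J object. How you keep the native handle alive between pages in Tess4J depends on which API class you use, so treat this as the shape of the approach, not a drop-in implementation.

```java
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Supplier;
import java.util.stream.Collectors;

public class EnginePoolSketch {

    /** Hypothetical stand-in for an OCR engine that is expensive to create. */
    interface OcrEngine {
        String recognize(String page);
    }

    /** Fixed-size pool: engines are created once and lent to worker threads. */
    static final class EnginePool {
        private final BlockingQueue<OcrEngine> idle;

        EnginePool(int size, Supplier<OcrEngine> factory) {
            idle = new ArrayBlockingQueue<>(size);
            for (int i = 0; i < size; i++) {
                idle.add(factory.get()); // initialization cost is paid here, once per engine
            }
        }

        String recognize(String page) {
            OcrEngine engine;
            try {
                engine = idle.take(); // blocks until an engine is free
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException("interrupted while waiting for an engine", e);
            }
            try {
                return engine.recognize(page);
            } finally {
                idle.add(engine); // hand the engine back for reuse instead of disposing it
            }
        }
    }

    /** Counts how many engines were ever created, to make the reuse visible. */
    static final AtomicInteger CREATED = new AtomicInteger();

    static List<String> processAll(List<String> pages, int poolSize) {
        EnginePool pool = new EnginePool(poolSize, () -> {
            CREATED.incrementAndGet();
            return page -> "text of " + page; // placeholder for the real OCR call
        });
        // Pages are processed in parallel, but only poolSize engines exist.
        return pages.parallelStream().map(pool::recognize).collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<String> pages = Arrays.asList("page1.tif", "page2.tif", "page3.tif", "page4.tif");
        System.out.println(processAll(pages, 2));
        System.out.println("engines created: " + CREATED.get());
    }
}
```

The pool size caps both memory use and the number of native instances, so it is worth trying values around the number of physical cores rather than one engine per page.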
https://sourceforge.net/p/tess4j/discussion/1202294/thread/d32bd579/

On Sunday, February 14, 2016 at 10:15:12 AM UTC-6, viraf wrote:
>
> I am new to tesseract and using it through Tess4J. I am trying to OCR
> faxes where pages are represented as TIFF (CCITT T.6) images - 2509 x 3530
> @ 300 dpi (1 bit - i.e. BW).
>
> I have two set of questions
>
> *Speed*
> On an intel i7-4800 MQ @ 2.7GHz I am getting approximately 6 PPM using 1
> thread. I was looking for suggestions on how to speed up page processing.
> I use parallelStream to process each page in a separate thread,
>
> - viraf
>

