How to best train tesseract?

kristoffermilligan Fri, 24 May 2013 01:49:50 -0700

Hello,

I want to OCR single images that each contain three letters. 
Tesseract is good at locating the letters (boxing them), but it often 
misidentifies which letter each box contains. I am therefore thinking that 
I need to train Tesseract to see if I can improve the results. My question 
is as follows:


What is the best way to train tesseract to recognize these characters?

A) Should I generate lots of single images, fix their box files, and feed 
them to Tesseract one by one?

B) Should I combine my training images into one large image with a single 
box file, and correct that?

C) Other?
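
To make option B concrete, here is a minimal sketch of how the per-image box 
files could be merged if the sample images were pasted side by side into one 
wide image. It assumes the Tesseract 3.x box format 
"<char> <left> <bottom> <right> <top> <page>" with the origin at the 
bottom-left corner, and that all samples are bottom-aligned, so only the x 
coordinates need shifting. The function names are hypothetical, and the image 
pasting itself is not shown.

```python
def shift_box_line(line, x_offset):
    """Shift one box-file line to the right by x_offset pixels.

    Assumed line format: '<char> <left> <bottom> <right> <top> <page>'.
    """
    parts = line.split()
    char = parts[0]
    left, bottom, right, top = (int(v) for v in parts[1:5])
    # Everything ends up on a single combined image, so the page is always 0.
    return f"{char} {left + x_offset} {bottom} {right + x_offset} {top} 0"


def merge_box_files(samples):
    """Merge box files for images placed side by side, left to right.

    samples: list of (image_width_px, box_file_text) tuples (hypothetical input).
    Assumes bottom-aligned images, so y coordinates stay unchanged.
    """
    merged = []
    x_offset = 0
    for width, box_text in samples:
        for line in box_text.splitlines():
            if line.strip():
                merged.append(shift_box_line(line, x_offset))
        x_offset += width  # the next image starts where this one ends
    return "\n".join(merged) + "\n"
```

The merged text would then be saved as the single box file for the combined 
image and corrected by hand as usual.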

Also, roughly how many images would a proper training set require?

Looking forward to your replies.

-- 
You received this message because you are subscribed to the Google
Groups "tesseract-ocr" group.
To post to this group, send email to [email protected]
To unsubscribe from this group, send email to
[email protected]
For more options, visit this group at
http://groups.google.com/group/tesseract-ocr?hl=en
